Open iomaganaris opened 3 years ago
I think there are a few different issues here, but here are some observations from local testing of something related:
https://github.com/BlueBrain/CoreNeuron/blob/df95ceaf1f0bffd2000b1942ca2ba4211e7a74b0/coreneuron/sim/fast_imem.cpp#L24-L39 mismatches `ecalloc_align` (which wraps `cudaMallocManaged` when `CORENEURON_UNIFIED_MEMORY` is set) with `free` -- we should use `free_memory` instead, which forwards to `cudaFree` when needed.
Probably both the `NrnFastImem` and `TrajectoryRequests` structs should inherit from `MemoryManaged`, or we should otherwise make sure they are allocated in unified memory in these builds.
There is another issue with `TrajectoryRequests::varrays`, which is allocated by NEURON but assumed to have a device version that is writeable from the device: https://github.com/BlueBrain/CoreNeuron/blob/df95ceaf1f0bffd2000b1942ca2ba4211e7a74b0/coreneuron/sim/fadvance_core.cpp#L301
In unified memory builds, we would need to somehow swap in a unified memory buffer here and copy it to NEURON's buffer as needed.
**Describe the issue**
Some of the NEURON tests are failing on GPU when CUDA Unified Memory is enabled in CoreNEURON. More precisely, the tests that fail are:
**To Reproduce**
Steps to reproduce the behavior:
**Expected behavior**
GPU tests should pass with Unified Memory as well.
**Logs**
An example of a failing test (`coreneuron_modtests::direct_py`), when run with `cuda-memcheck`, has the following output:

The corresponding line that fails in `stim.cpp`:

**System (please complete the following information)**