I noticed that alpaka (and alpakatest) was crashing with
Processing 1000 events, of which 2 concurrently, with 2 threads.
terminate called after throwing an instance of 'std::runtime_error'
what(): .../pixeltrack-standalone/external/alpaka/include/alpaka/event/EventUniformCudaHipRt.hpp(157) 'cudaEventRecord( event.m_spEventImpl->m_UniformCudaHipEvent, queue.m_spQueueImpl->m_UniformCudaHipQueue)' returned error : 'cudaErrorInvalidResourceHandle': 'invalid resource handle'!
when run on 2 GPUs. I managed to trace the problem into the pinned host memory allocations in CachingAllocator, when a pinned host memory block was previously used on device A, and is then used on device B.
The current code does
// Take previous device from *the argument block*
auto const& previousDevice = block.device();
// Take the queue of the argument block
auto queue = std::move(*(block.queue));
// Copy the information of the block to be reused
block = iBlock->second;
// Assign the queue originally from the argument block
block.queue = std::move(queue);
Since the block.device() returns the device of its Queue, this means that both block.device() and previousDevice point to the same device, and the subsequent code to create a new Event never runs.
but given that the point is to create a new Event if the device of the Queue accessing the block differs from the device of the Event that the block had, I think testing explicitly that condition makes the intention of the code more clear.
I noticed that
alpaka
(andalpakatest
) was crashing withwhen run on 2 GPUs. I managed to trace the problem into the pinned host memory allocations in CachingAllocator, when a pinned host memory block was previously used on device A, and is then used on device B.
The current code does
Since the
block.device()
returns the device of itsQueue
, this means that bothblock.device()
andpreviousDevice
point to the same device, and the subsequent code to create a newEvent
never runs.A minimal fix would be
but given that the point is to create a new
Event
if the device of theQueue
accessing the block differs from the device of theEvent
that the block had, I think testing explicitly that condition makes the intention of the code more clear.