Port of `plugin-PixelVertexFinding` to `std::par`.

The original CUDA threading model processed one module per thread block in most kernels. The C++ parallel algorithms currently offer no way to express this model: a single execution of a `std::for_each` lambda maps to one thread, so we now have one thread per module.

There is still an issue when compiling `gpuVertexFinder::loadTracks`: `nvc++` hangs and never finishes.
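
To illustrate the one-thread-per-module mapping described above, here is a minimal sketch. It is not code from this port: `sumPerModule`, `values`, and `moduleStart` are hypothetical names, and the per-module work (a plain sum) is a stand-in for whatever a real kernel does. Each invocation of the `std::for_each` lambda handles one whole module and walks its elements sequentially, where a CUDA kernel would have spread that inner loop over the threads of a block.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <execution>
#include <numeric>
#include <vector>

// Hypothetical layout (an assumption for this sketch): the elements of
// module m are values[moduleStart[m]] .. values[moduleStart[m + 1] - 1].
std::vector<int> sumPerModule(const std::vector<int>& values,
                              const std::vector<std::size_t>& moduleStart) {
  assert(!moduleStart.empty() && moduleStart.back() == values.size());
  const std::size_t nModules = moduleStart.size() - 1;

  // One index per module; the parallel algorithm iterates over these.
  std::vector<std::size_t> modules(nModules);
  std::iota(modules.begin(), modules.end(), std::size_t{0});

  std::vector<int> sums(nModules, 0);

  // Capture raw pointers by value so the lambda does not refer to
  // variables living on the caller's stack.
  const int* in = values.data();
  const std::size_t* start = moduleStart.data();
  int* out = sums.data();

  // One lambda execution (one thread) per module. The inner loop is the
  // part a CUDA thread block would have shared among its threads.
  std::for_each(std::execution::par, modules.begin(), modules.end(),
                [=](std::size_t m) {
                  int sum = 0;
                  for (std::size_t i = start[m]; i < start[m + 1]; ++i) {
                    sum += in[i];
                  }
                  out[m] = sum;  // each module writes only its own slot
                });
  return sums;
}
```

Calling `sumPerModule({1, 2, 3, 4, 5, 6}, {0, 2, 2, 6})` treats the input as three modules with two, zero, and four elements. Because each lambda execution writes only `out[m]`, no synchronization between modules is needed in this sketch.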