NVIDIA / nv-wavenet

Reference implementation of real-time autoregressive wavenet inference
BSD 3-Clause "New" or "Revised" License

The method 'namedBarrierSync'? #3

Closed DabiaoMa closed 6 years ago

DabiaoMa commented 6 years ago

As a beginner in GPU programming, I am not yet familiar with the commonly used CUDA methods. I checked the CUDA programming guide and did not find any documentation for the method 'namedBarrierSync'. Could you please give me some hints about this method? Thanks.

BrianPharris commented 6 years ago

namedBarrierSync is implemented in softmax.cuh (I should probably have picked a better place for it), as follows:

__device__ __forceinline__ void namedBarrierSync(int name, int numThreads) {
    asm volatile("bar.sync %0, %1;" : : "r"(name), "r"(numThreads) : "memory");
}

It allows us to use independent barriers for a subset of the threads in the block, rather than forcing all threads to synchronize with __syncthreads().
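As a rough illustration of that idea (not code from nv-wavenet), the sketch below splits one 128-thread block into two groups of 64 threads, and each group synchronizes on its own barrier. The kernel name `reverseWithinGroups` and the constants are hypothetical. It assumes the PTX rules for `bar.sync`: barrier IDs run from 0 to 15, the thread count must be a multiple of the warp size, and barrier 0 is best left alone because `__syncthreads()` is normally compiled to it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Helper as quoted in this thread (defined in softmax.cuh in nv-wavenet).
__device__ __forceinline__ void namedBarrierSync(int name, int numThreads) {
    asm volatile("bar.sync %0, %1;" : : "r"(name), "r"(numThreads) : "memory");
}

// Hypothetical sizes for this sketch: two groups of two full warps each.
constexpr int kGroupSize = 64;
constexpr int kNumGroups = 2;
constexpr int kBlockSize = kGroupSize * kNumGroups;

// Each group reverses its own 64-element slice through shared memory.
__global__ void reverseWithinGroups(const int* in, int* out) {
    __shared__ int buf[kBlockSize];
    const int tid = threadIdx.x;
    const int group = tid / kGroupSize;
    const int lane = tid % kGroupSize;

    buf[tid] = in[tid];

    // Wait only for the 64 threads of this group. Barrier IDs 1 and 2 are
    // used so the groups never wait on each other; ID 0 is skipped on the
    // assumption that __syncthreads() uses it.
    namedBarrierSync(1 + group, kGroupSize);

    // Safe to read a slot written by another thread of the same group.
    out[tid] = buf[group * kGroupSize + (kGroupSize - 1 - lane)];
}

int main() {
    int hostIn[kBlockSize], hostOut[kBlockSize];
    for (int i = 0; i < kBlockSize; ++i) hostIn[i] = i;

    int* devIn = nullptr;
    int* devOut = nullptr;
    cudaMalloc(&devIn, sizeof(hostIn));
    cudaMalloc(&devOut, sizeof(hostOut));
    cudaMemcpy(devIn, hostIn, sizeof(hostIn), cudaMemcpyHostToDevice);

    reverseWithinGroups<<<1, kBlockSize>>>(devIn, devOut);

    // The blocking copy also surfaces any error from the kernel launch.
    cudaError_t err =
        cudaMemcpy(hostOut, devOut, sizeof(hostOut), cudaMemcpyDeviceToHost);
    cudaFree(devIn);
    cudaFree(devOut);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // Check that every slice was reversed independently.
    int mismatches = 0;
    for (int i = 0; i < kBlockSize; ++i) {
        const int group = i / kGroupSize;
        const int lane = i % kGroupSize;
        const int expected = group * kGroupSize + (kGroupSize - 1 - lane);
        if (hostOut[i] != expected) ++mismatches;
    }
    printf("mismatches: %d\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

Because every warp lies entirely inside one group, all threads of a warp reach the same barrier, which keeps the sketch within the restrictions the PTX documentation places on `bar.sync`. Compiling with something like `nvcc demo.cu -o demo` should be enough to try it.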

More information can be found at: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-bar

DabiaoMa commented 6 years ago

Thanks a lot