dzhoshkun / cuda-learning

BSD 3-Clause "New" or "Revised" License
Compare kernel throughput to peak theoretical throughput #10

Open dzhoshkun opened 6 years ago

dzhoshkun commented 6 years ago

The CUDA performance guidelines state:

comparing the floating-point operation throughput or memory throughput - whichever makes more sense - of a particular kernel to the corresponding peak theoretical throughput of the device indicates how much room for improvement there is for the kernel.

dzhoshkun commented 6 years ago

Also helpful from the CUDA best practices guide: &