Open jasperzhong opened 3 years ago
https://www.youtube.com/watch?v=4APkMJdiudU&list=PLC6u37oFvF40BAm7gwVP7uDdzmW83yHPe
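The summary of the differences between CPUs and GPUs is quite incisive.
How to make use of the massive number of GPU cores.
Concepts:
Thread organization:
A grid and a block each have dimensions, which can be 1D, 2D, or 3D.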
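To make the grid/block dimensions concrete, here is a minimal sketch of a 2D launch in which each thread derives its global (x, y) coordinate from `blockIdx`, `blockDim`, and `threadIdx`. The kernel name `fill_index`, the helper `run_fill_index`, the 16x16 block shape, and the 100x37 problem size are all made up for illustration and are not taken from the video.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Each thread computes its global 2D coordinate and writes the matching
// row-major linear index into the output array.
__global__ void fill_index(int *out, int width, int height) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;  // global column
    int y = blockIdx.y * blockDim.y + threadIdx.y;  // global row
    if (x < width && y < height) {                  // guard threads past the edge
        out[y * width + x] = y * width + x;
    }
}

// Hypothetical helper: launches the kernel and checks the result on the host.
bool run_fill_index(int width, int height) {
    int n = width * height;
    int *d_out = nullptr;
    if (cudaMalloc(&d_out, n * sizeof(int)) != cudaSuccess) return false;

    dim3 block(16, 16);  // 2D block: 256 threads (assumed shape)
    dim3 grid((width + block.x - 1) / block.x,   // round up so the grid
              (height + block.y - 1) / block.y); // covers the whole array
    fill_index<<<grid, block>>>(d_out, width, height);

    int *h_out = new int[n];
    bool ok = cudaMemcpy(h_out, d_out, n * sizeof(int),
                         cudaMemcpyDeviceToHost) == cudaSuccess;
    for (int i = 0; ok && i < n; ++i) ok = (h_out[i] == i);

    delete[] h_out;
    cudaFree(d_out);
    return ok;
}

int main() {
    bool ok = run_fill_index(100, 37);
    printf("%s\n", ok ? "ok" : "mismatch");
    return ok ? 0 : 1;
}
```

The same pattern reduces to 1D by dropping the `y` terms, or extends to 3D by adding a `z` term.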
https://www.youtube.com/watch?v=OSpy-HoR0ac&list=PLC6u37oFvF40BAm7gwVP7uDdzmW83yHPe&index=5
Memory model
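This section is explained very well... excellent!
Note that local memory is slow, much slower than shared memory. This is because local memory is off-chip.
The green part of the figure is where the NVIDIA cores are, and the memory on it is on-chip. The blue part is DRAM, which is off-chip.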
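As a rough illustration of on-chip shared memory, the sketch below stages one tile per block in a `__shared__` array and then reads it back in reverse order, so the exchange between threads goes through shared memory rather than off-chip DRAM. The names `reverse_in_block` and `run_reverse`, the tile size of 256, and the assumption that the input length is a multiple of the block size are mine, not from the video.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int BLOCK = 256;  // threads per block and tile size (assumed)

// Reverses each BLOCK-sized chunk of the input independently.
// Assumes the array length is a multiple of BLOCK.
__global__ void reverse_in_block(const int *in, int *out) {
    __shared__ int tile[BLOCK];            // on-chip, visible to the whole block
    int t = threadIdx.x;
    int base = blockIdx.x * BLOCK;
    tile[t] = in[base + t];                // global memory -> shared memory
    __syncthreads();                       // wait until the whole tile is loaded
    out[base + t] = tile[BLOCK - 1 - t];   // read another thread's element
}

// Hypothetical helper: runs the kernel on `blocks` tiles and verifies on the host.
bool run_reverse(int blocks) {
    int n = blocks * BLOCK;
    int *h = new int[n];
    for (int i = 0; i < n; ++i) h[i] = i;

    int *d_in = nullptr, *d_out = nullptr;
    bool ok = cudaMalloc(&d_in, n * sizeof(int)) == cudaSuccess &&
              cudaMalloc(&d_out, n * sizeof(int)) == cudaSuccess;
    if (ok) {
        cudaMemcpy(d_in, h, n * sizeof(int), cudaMemcpyHostToDevice);
        reverse_in_block<<<blocks, BLOCK>>>(d_in, d_out);
        ok = cudaMemcpy(h, d_out, n * sizeof(int),
                        cudaMemcpyDeviceToHost) == cudaSuccess;
    }
    // Element t of block b should now hold the value from position BLOCK-1-t.
    for (int b = 0; ok && b < blocks; ++b)
        for (int t = 0; ok && t < BLOCK; ++t)
            ok = (h[b * BLOCK + t] == b * BLOCK + (BLOCK - 1 - t));

    cudaFree(d_in);
    cudaFree(d_out);
    delete[] h;
    return ok;
}

int main() {
    bool ok = run_reverse(4);
    printf("%s\n", ok ? "ok" : "mismatch");
    return ok ? 0 : 1;
}
```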
https://www.youtube.com/watch?v=PJCISyoGpug&list=PLC6u37oFvF40BAm7gwVP7uDdzmW83yHPe&index=6
It covers synchronization primitives.
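A minimal sketch of two common CUDA synchronization primitives, `__syncthreads()` (a barrier within a block) and `atomicAdd()` (a safe update across blocks), combined in a sum reduction. Whether the video uses exactly these two is an assumption on my part; the names `sum_kernel` and `run_sum` and the sizes are made up for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int BLOCK = 256;  // must be a power of two for this reduction (assumed)

// Each block reduces its slice in shared memory, then one thread per block
// adds the block's partial sum to a single global total.
__global__ void sum_kernel(const int *in, int n, int *total) {
    __shared__ int partial[BLOCK];
    int t = threadIdx.x;
    int i = blockIdx.x * blockDim.x + t;
    partial[t] = (i < n) ? in[i] : 0;      // pad out-of-range threads with 0
    __syncthreads();                       // all loads done before reducing

    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (t < stride) partial[t] += partial[t + stride];
        __syncthreads();                   // barrier between reduction steps
    }
    if (t == 0) atomicAdd(total, partial[0]);  // blocks race here, so use an atomic
}

// Hypothetical helper: sums n ones on the GPU and returns the result (-1 on error).
int run_sum(int n) {
    int *h = new int[n];
    for (int i = 0; i < n; ++i) h[i] = 1;

    int *d_in = nullptr, *d_total = nullptr;
    int result = -1;
    if (cudaMalloc(&d_in, n * sizeof(int)) == cudaSuccess &&
        cudaMalloc(&d_total, sizeof(int)) == cudaSuccess) {
        cudaMemcpy(d_in, h, n * sizeof(int), cudaMemcpyHostToDevice);
        cudaMemset(d_total, 0, sizeof(int));
        int blocks = (n + BLOCK - 1) / BLOCK;  // round up to cover all elements
        sum_kernel<<<blocks, BLOCK>>>(d_in, n, d_total);
        if (cudaMemcpy(&result, d_total, sizeof(int),
                       cudaMemcpyDeviceToHost) != cudaSuccess) {
            result = -1;
        }
    }
    cudaFree(d_in);
    cudaFree(d_total);
    delete[] h;
    return result;
}

int main() {
    int n = 1000;
    int total = run_sum(n);
    printf("sum = %d\n", total);
    return total == n ? 0 : 1;
}
```

Note that `__syncthreads()` is placed outside the `if (t < stride)` branch so every thread in the block reaches the barrier.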
https://github.com/NVIDIA/cuda-samples
An absolute treasure trove.