This section is meant to explain the main concepts of GPU architecture that are relevant to GPU programming.
Base the material on L11-Accelerated_Architectures.pptx.
For instance, explain:
Presence of multiple streaming multiprocessors (SMs; called compute units, CUs, on AMD hardware)
High bandwidth, high latency. Latency is hidden by keeping many threads resident and switching between them quickly
SIMT execution (blocks are assigned to SMs, threads execute in warps/wavefronts)
Overview of the memory hierarchy: global memory, shared memory, caches
GPU and CPU have separate address spaces. Maybe mention unified shared memory (Grace/Hopper, MI300 architecture)
Example architectures (Volta, AMD MI200)
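
The SIMT and latency-hiding points listed above could be made concrete with a small host-side sketch like the one below. It is not GPU code and is not taken from the slides: the function names are hypothetical, the launch is assumed to be one-dimensional, and the numbers used in the usage note are illustrative assumptions. The warp size of 32 (NVIDIA) and wavefront size of 64 (AMD CDNA, e.g. MI200) are the only hardware facts it relies on.

```python
# Host-side illustration only: plain Python arithmetic, no GPU required.
# All function names are hypothetical helpers for this sketch.

WARP_SIZE_NVIDIA = 32     # threads per warp on NVIDIA GPUs
WAVEFRONT_SIZE_CDNA = 64  # threads per wavefront on AMD CDNA GPUs (e.g. MI200)


def locate_thread(global_id, block_size, warp_size):
    """Split a flat 1D global thread index into (block, warp in block, lane)."""
    block, thread_in_block = divmod(global_id, block_size)
    warp, lane = divmod(thread_in_block, warp_size)
    return block, warp, lane


def warps_per_block(block_size, warp_size):
    """Warps needed for one block; a partly filled last warp still counts."""
    return -(-block_size // warp_size)  # ceiling division


def bytes_in_flight(bandwidth_gb_per_s, latency_ns):
    """Little's law: data that must be in flight to keep the memory system busy.

    GB/s multiplied by ns gives bytes, since 1e9 * 1e-9 = 1.
    """
    return bandwidth_gb_per_s * latency_ns
```

For example, with an assumed block size of 256, `locate_thread(300, 256, WARP_SIZE_NVIDIA)` returns `(1, 1, 12)`: thread 300 is in block 1, warp 1 of that block, lane 12. With the 64-wide wavefront the same thread lands in `(1, 0, 44)`. Assuming 900 GB/s of bandwidth and a 400 ns memory latency (illustrative values), `bytes_in_flight(900, 400)` gives 360000 bytes, which is why a GPU needs far more resident threads than it has execution units.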