Open muellch opened 4 years ago
CUDA has vector loads that allow each thread to access wider vectors with a single load, for example 128-bit vectors. The task is to evaluate the possible benefits of using these vector loads in the backend of our stencil dialect.
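To make the idea concrete, here is a minimal sketch of what a 128-bit vector load looks like in hand-written CUDA, next to a scalar baseline. It is not code from our backend: the kernel names `copy_scalar` and `copy_vec4` and the host helper `count_mismatches` are hypothetical, and a plain copy stands in for a real stencil. The sketch assumes both pointers are 16-byte aligned, which holds for buffers returned by `cudaMalloc` but not necessarily for stencil accesses at arbitrary offsets, so alignment would have to be part of the evaluation.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Scalar baseline: each thread moves one 32-bit float.
__global__ void copy_scalar(const float* __restrict__ in,
                            float* __restrict__ out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];
}

// Vectorized variant: each thread moves one float4 (128 bits).
// Assumption: `in` and `out` are 16-byte aligned.
__global__ void copy_vec4(const float* __restrict__ in,
                          float* __restrict__ out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int n4 = n / 4;  // number of complete float4 elements
    if (i < n4) {
        reinterpret_cast<float4*>(out)[i] =
            reinterpret_cast<const float4*>(in)[i];
    }
    // The first (n % 4) threads copy the scalar tail.
    int tail = n4 * 4 + i;
    if (tail < n) out[tail] = in[tail];
}

// Hypothetical host helper: runs copy_vec4 on n floats and returns the
// number of mismatching elements, or -1 if a CUDA call fails.
int count_mismatches(int n) {
    std::vector<float> h_in(n), h_out(n, -1.0f);
    for (int i = 0; i < n; ++i) h_in[i] = static_cast<float>(i);

    size_t bytes = static_cast<size_t>(n) * sizeof(float);
    float* d_in = nullptr;
    float* d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) {
        cudaFree(d_in);
        return -1;
    }
    cudaMemcpy(d_in, h_in.data(), bytes, cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n / 4 + threads - 1) / threads;
    if (blocks == 0) blocks = 1;  // still need one block for the tail
    copy_vec4<<<blocks, threads>>>(d_in, d_out, n);

    // cudaMemcpy synchronizes with the kernel and reports its errors.
    cudaError_t err =
        cudaMemcpy(h_out.data(), d_out, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int bad = 0;
    for (int i = 0; i < n; ++i) {
        if (h_out[i] != h_in[i]) ++bad;
    }
    return bad;
}
```

On a machine with a CUDA-capable GPU, calling `count_mismatches(1 << 20)` from a host `main` is meant as a correctness check only. The actual evaluation would time `copy_scalar` against `copy_vec4` (and then real stencil kernels) to see whether the wider loads pay off.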