ulysseB / telamon

A framework to find good combinations of optimizations for computational kernels on GPUs.
https://ulysseb.github.io/telamon/telamon
Apache License 2.0

[cuda] Allow non-coherent loads #286

Closed by Elarnon 5 years ago

Elarnon commented 5 years ago

The memory model does not support non-coherent loads, yet they are required to get a non-negligible performance boost on Maxwell and more recent architectures. As a stopgap until they are properly implemented in the memory model, this patch allows using them but ignores them in the performance model.
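To illustrate the split described above, here is a minimal sketch, not the actual Telamon code. All names (`LoadKind`, `ptx_load_f32`, `modeled_load_latency`) and the register names are hypothetical and chosen only for this example; the one external fact it relies on is that PTX marks non-coherent global loads with the `.nc` qualifier (`ld.global.nc`). Code generation honors the flag, while the performance model accepts it and deliberately ignores it.

```rust
/// Hypothetical flag: whether a global load goes through the
/// regular (coherent) path or the non-coherent one.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum LoadKind {
    Coherent,
    NonCoherent,
}

/// Hypothetical printer: emits a PTX f32 global load.
/// Non-coherent loads get the `.nc` qualifier.
fn ptx_load_f32(kind: LoadKind, dst: &str, addr: &str) -> String {
    let qualifier = match kind {
        LoadKind::Coherent => "",
        LoadKind::NonCoherent => ".nc",
    };
    format!("ld.global{}.f32 {}, [{}];", qualifier, dst, addr)
}

/// Hypothetical performance-model hook: the load kind is accepted
/// but ignored, so both kinds are costed identically.
/// `base_latency` is supplied by the caller; no real figure is assumed.
fn modeled_load_latency(_kind: LoadKind, base_latency: u32) -> u32 {
    base_latency
}

fn main() {
    for kind in [LoadKind::Coherent, LoadKind::NonCoherent] {
        println!(
            "{:?}: {} (modeled latency: {})",
            kind,
            ptx_load_f32(kind, "%f1", "%rd1"),
            modeled_load_latency(kind, 100) // 100 is an arbitrary placeholder
        );
    }
}
```

With this shape, a later change that teaches the memory model about non-coherent loads would only need to touch the cost function, leaving the emitted code as it is.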