AnyDSL / thorin

The Higher-Order Intermediate Representation
https://anydsl.github.io
GNU Lesser General Public License v3.0

Add cuda_device_arch #133

Closed michael-kenzel closed 1 year ago

michael-kenzel commented 1 year ago

This small hack exposes the CUDA target architecture as a device function. It exploits the fact that every relevant CUDA C++ compiler has to work with the NVVM libdevice library and therefore implements transformations around __nvvm_reflect, which we can use to query the target architecture.

Tested, and it seems to work with NVCC, NVRTC, and Clang.
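For readers unfamiliar with the trick, here is a rough sketch of what such a device function might look like. It is not the code from this PR. The declaration of __nvvm_reflect (parameter type and linkage), the query string "__CUDA_ARCH", the wrapper body, and the report_arch kernel are all assumptions made for illustration. LLVM's NVVMReflect pass documents "__CUDA_ARCH" as folding to the SM version times ten; whether NVCC and NVRTC fold it the same way is undocumented behavior, and some compilers may already declare the intrinsic themselves, in which case the declaration below would need to be dropped or adjusted.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Assumed declaration: the compiler is expected to replace calls to this
// function with a constant at compile time. The exact signature each
// compiler accepts is not confirmed by the PR.
extern "C" __device__ int __nvvm_reflect(const void*);

// Hypothetical wrapper named after the PR title.
__device__ inline int cuda_device_arch() {
    // Assumed query string; expected to fold to a compile-time constant.
    return __nvvm_reflect("__CUDA_ARCH");
}

// Hypothetical kernel that copies the reflected value to global memory.
__global__ void report_arch(int* out) {
    *out = cuda_device_arch();
}

int main() {
    int* d_arch = nullptr;
    int h_arch = 0;

    if (cudaMalloc(&d_arch, sizeof(int)) != cudaSuccess) return 1;

    report_arch<<<1, 1>>>(d_arch);

    // cudaMemcpy synchronizes with the kernel launched above.
    if (cudaMemcpy(&h_arch, d_arch, sizeof(int),
                   cudaMemcpyDeviceToHost) != cudaSuccess) {
        cudaFree(d_arch);
        return 1;
    }
    cudaFree(d_arch);

    std::printf("reflected arch value: %d\n", h_arch);
    return 0;
}
```

Unlike the __CUDA_ARCH__ preprocessor macro, a value obtained this way is an ordinary expression in device code, which is what makes it usable from a code generator that emits functions rather than preprocessor conditionals.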

michael-kenzel commented 1 year ago

Found a better approach, so closing this.