NVIDIA / numba-cuda

BSD 2-Clause "Simplified" License
#48 rebased: "Allow JIT Compile to, and Link from, LTOIR for cuda source input" #60

Closed: gmarkall closed this 3 weeks ago

gmarkall commented 1 month ago

Following the merge of #23 / #56, this is #48 rebased on main for review.

The changes here should be the same as those in #48, except:

cc @isVoid

isVoid commented 1 month ago

Discussed with @gmarkall offline. This PR actually enables LTO for CUSource inputs, and a test should be added for that. Also, because numba-cuda does not have a hard dependency on pynvjitlink, and pynvjitlink is required to enable LTO, we should keep LTO disabled by default. Users can enable LTO via cuda.jit(lto=True). This control is still important because Numba users can be kernel developers who want fine-grained control over compilation. In the future, we should consider using a global config to set the default LTO value.
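
To make the proposed defaulting concrete, here is a rough sketch of how the decision could be resolved. It is not code from this PR or from numba-cuda: `resolve_lto`, `pynvjitlink_available`, and the `EXAMPLE_CUDA_LTO_DEFAULT` environment variable are hypothetical names invented for illustration, and the silent fallback for a global default is an assumption about the desired behaviour. It uses only the standard library, so it runs without a GPU.

```python
import importlib.util
import os

# Hypothetical environment variable name, used only for this sketch.
LTO_ENV_VAR = "EXAMPLE_CUDA_LTO_DEFAULT"


def pynvjitlink_available():
    """Return True if the optional pynvjitlink package can be found."""
    return importlib.util.find_spec("pynvjitlink") is not None


def resolve_lto(lto=None, env=None, have_linker=None):
    """Decide whether LTO should be used for one compilation.

    lto: per-kernel request (True, False, or None for "use the default").
    env: mapping holding the global default (defaults to os.environ).
    have_linker: override for the pynvjitlink check, handy in tests.
    """
    env = os.environ if env is None else env
    if have_linker is None:
        have_linker = pynvjitlink_available()

    explicit = lto is not None
    if not explicit:
        # No per-kernel request: use the global default, which stays off
        # unless the (hypothetical) variable is set to "1".
        lto = env.get(LTO_ENV_VAR, "0") == "1"

    if lto and not have_linker:
        if explicit:
            # An explicit request that cannot be honoured fails loudly.
            raise RuntimeError(
                "LTO was requested but pynvjitlink is not installed"
            )
        # Assumption: a global default must not break users who lack
        # the optional dependency, so it quietly falls back to off.
        return False
    return bool(lto)
```

With this shape, the per-kernel argument always wins over the global default, which keeps the fine-grained control described above: a kernel developer could still pass the resolved value through as cuda.jit(lto=...) for individual kernels.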