Open gmarkall opened 2 weeks ago
Perhaps ask all device functions to implement a method attribute, say, `__numba_cuda_link__`, that returns a list of files, if they want numba-cuda to handle the linking?
Continuing with the `__numba_cuda_link__` idea, I think it might need to be a method that can accept a signature, so that it can return the appropriate files for the given signature.
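To make the signature-aware idea concrete, here is a minimal pure-Python sketch. Everything in it is an assumption for illustration: `LinkableCode` is a hypothetical stand-in for whatever linkable code object Numba would accept, the toy `FFT` class is not the real extension object, and argument types are plain strings rather than Numba types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkableCode:
    """Hypothetical stand-in for a linkable code object (e.g. LTO-IR)."""
    kind: str    # e.g. "ltoir", "ptx", "cu"
    name: str
    data: bytes


class FFT:
    """Toy device-function object; not the real extension implementation."""

    def __init__(self):
        # Link items are cached per signature so repeated lookups are cheap.
        self._cache = {}

    def __numba_cuda_link__(self, *argtypes):
        # argtypes are plain strings here; Numba would pass its own type objects.
        if argtypes not in self._cache:
            signature = ", ".join(argtypes)
            # A real extension would generate or locate code here.
            placeholder = f"// code specialised for ({signature})".encode()
            self._cache[argtypes] = [
                LinkableCode(kind="ltoir", name=f"fft({signature})", data=placeholder)
            ]
        return self._cache[argtypes]
```

Keying the cache on the argument types means each distinct signature yields its own link items, while repeated compilation for the same signature reuses what was already produced.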
How should the kernel author pass function arguments at the call site if it is a method rather than an attribute?
In the example above, the kernel author wrote:
FFT(thread_data, shared_mem)
Assuming the Numba types of these are `float32[:]` and `float32[::1]` (for the sake of argument; they could be any Numba type really), I'd expect that at compilation time Numba would do the equivalent of calling:
ltoir = FFT.__numba_cuda_link__(float32[:], float32[::1])
where `ltoir` is then an LTOIR linkable code object, i.e. an instance of
Numba-cuda extensions (e.g. nvmath-python) frequently lean on CUDA C++ implementations to support the core of their functionality.
One current UX limitation is that the kernel author is required to add the list of files and/or code to link with a kernel as a keyword argument to the `@cuda.jit` decorator, for example (from cufftdx_simple_fft_block.py):
The `FFT` object supplies the files, and is created like:
and is called inside the kernel as:
Rather than the user being required to link `FFT.files`, Numba should provide a mechanism to obtain and link the list of files / code (LTO-IR, PTX, CUDA C/C++ source, or binaries / objects etc.) at the point of compilation and linking from the `FFT` object (or any implementation of a method, property, object, etc. backed by an extension). It is expected that the implementation (of `FFT`, in this example) may generate code (e.g. LTO-IR) at this point, just prior to returning it back to Numba.
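As a rough illustration of the compiler side of such a mechanism, the sketch below gathers link items from the objects called in a kernel instead of relying on the user to list them. It is pure Python and entirely hypothetical: `collect_link_items`, `ToyFFT`, and the string-based types and file names are assumptions made for this example and are not part of numba-cuda.

```python
def collect_link_items(calls, explicit_link=()):
    """Gather the items to link for one kernel.

    calls: iterable of (callee, argtypes) pairs found during compilation.
    explicit_link: items the user still passes by hand, if any.
    """
    items = list(explicit_link)
    for callee, argtypes in calls:
        hook = getattr(callee, "__numba_cuda_link__", None)
        if hook is None:
            continue  # ordinary device function: nothing extra to link
        for item in hook(*argtypes):
            if item not in items:  # do not link the same item twice
                items.append(item)
    return items


class ToyFFT:
    """Hypothetical extension object that generates code only when asked."""

    def __init__(self):
        self.generated = 0  # counts how often generation was requested

    def __numba_cuda_link__(self, *argtypes):
        self.generated += 1
        return ["fft_" + "_".join(argtypes) + ".ltoir"]


def plain_device_function(x):
    return x
```

With something along these lines, the kernel author would only call the object inside the kernel; the hook runs at compile time, which is also where an extension could generate LTO-IR on demand.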