Open isVoid opened 1 month ago
I'm not able to reproduce this segfault on my V100 machine:
test_namedunituple (numba.cuda.tests.cudapy.test_array_args.TestCudaArrayArg.test_namedunituple) ... Fatal Python error: Segmentation fault
...
Extension modules: numpy._core._multiarray_umath, numpy._core._multiarray_tests, numpy.linalg._umath_linalg, numba.core.typeconv._typeconv, numpy.random._common, numpy.random.bit_generator, numpy.random._bounded_integers, numpy.random._mt19937, numpy.random.mtrand, numpy.random._philox, numpy.random._pcg64, numpy.random._sfc64, numpy.random._generator, numba._helperlib, numba._dynfunc, numba._dispatcher, numba.core.runtime._nrt_python, numba.np.ufunc._internal, numba.experimental.jitclass._box, numba.mviewbuf, pynvjitlink._nvjitlinklib, numba.types.itertools (total: 22)
ci/test_conda_pynvjitlink.sh: line 72: 2238 Segmentation fault (core dumped) ENABLE_PYNVJITLINK=1 NUMBA_CUDA_TEST_BIN_DIR=$NUMBA_CUDA_TEST_BIN_DIR python -m numba.runtests numba.cuda.tests -v
/__w/numba-cuda/numba-cuda
In afcce87 I added an additional flag, ignore_nonlto, to the linker so that only LTO-able objects are added when the flag is enabled. This keeps the driver in the correct state when the -ptx flag is set. It is also the desired behavior, since Numba now dumps the optimized PTX only for the LTO-able portion added to the linker and raises a warning for any source that isn't optimizable.
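The filtering behavior described above can be illustrated with a minimal sketch. This is not the numba-cuda implementation: `LinkItem`, `select_link_items`, and the `is_lto` attribute are hypothetical names made up for illustration, and only the `ignore_nonlto` flag name comes from this PR.

```python
import warnings
from dataclasses import dataclass


@dataclass
class LinkItem:
    """Hypothetical stand-in for an object queued for linking."""
    name: str
    is_lto: bool  # assumption: each item knows whether it is LTO-able


def select_link_items(items, ignore_nonlto=False):
    """Return the items that would be handed to the linker.

    With ignore_nonlto=True, items that are not LTO-able are skipped
    and a warning is emitted for each skipped item.
    """
    if not ignore_nonlto:
        return list(items)

    selected = []
    for item in items:
        if item.is_lto:
            selected.append(item)
        else:
            warnings.warn(f"Skipping non-LTO item: {item.name}")
    return selected
```

With the flag off, every item is passed through unchanged; with the flag on, only LTO-able items remain, which is the state the -ptx dump is assumed to require.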
A subtle case here is that lto=True is only enabled for CUDA > 12.0 and is tested accordingly, because this feature depends on pynvjitlink, which is only tested in a CTK 12.5 environment.
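As a rough sketch of such a version gate, assuming the toolkit version is available as a (major, minor) tuple: `lto_supported` and `skip_unless_lto` are hypothetical helpers, not part of numba-cuda, and how the version is actually obtained is not shown.

```python
import unittest


def lto_supported(cuda_version):
    """Hypothetical check: True only for toolkit versions above 12.0."""
    return tuple(cuda_version) > (12, 0)


def skip_unless_lto(cuda_version):
    """Hypothetical decorator factory that skips a test when LTO is unavailable."""
    return unittest.skipUnless(
        lto_supported(cuda_version), "lto=True requires CUDA > 12.0"
    )
```

Tuple comparison keeps the check strict, so a 12.0 toolkit is excluded while 12.5 passes.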
This PR supersedes #60 due to a write permission issue.