Closed: h-vetinari closed this PR 1 year ago.
Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.
Stubs by definition are not supposed to work.
How is this done for native builds?
Also, the qemu env is used only for the test phase, where compilers are not present. When compilers are present, QEMU_LD_PREFIX is set to the env where the compilers are present. How do you handle that case?
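For context, roughly how the QEMU side works (a sketch only; the sysroot path is an assumption, the real value is set by conda-forge's `cross_compile_support.sh`):

```shell
# QEMU user-mode emulation resolves the dynamic loader and shared libraries
# relative to QEMU_LD_PREFIX (hypothetical sysroot path shown here):
export QEMU_LD_PREFIX="${BUILD_PREFIX:-/opt/conda}/aarch64-conda-linux-gnu/sysroot"

# qemu-aarch64 "$PREFIX/bin/some-test-binary"  # qemu then looks for ld.so
#                                              # and libs under $QEMU_LD_PREFIX
```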
> Also, the qemu env is used only for the test phase, where compilers are not present. When compilers are present, QEMU_LD_PREFIX is set to the env where the compilers are present. How do you handle that case?
I only need it for the test phase when in emulation. The regular compilation ran fine with the current setup already.
> How is this done for native builds?
Presumably through the `cuda-version` run-export somehow? At least, the various feedstocks which already build natively for CUDA 12 are finding `libcuda.so` during testing. I don't know QEMU well enough, but it seems to be looking elsewhere (you added the copy to QEMU_LD_PREFIX in https://github.com/conda-forge/conda-forge-ci-setup-feedstock/commit/ae8eb4eda3d5c9fee3b4711ebe59ee33e7cd0430, so I was trying to emulate that - no pun intended).
@isuruf, I now realized that we very likely don't have to do the manual copying we're doing for CUDA 11 -- as the location for QEMU is another conda environment anyway, we can just install the compat-libs there? Could you confirm that approach is reasonable?
> Presumably through the `cuda-version` run-export somehow? At least, the various feedstocks which already build natively for CUDA 12 are finding `libcuda.so` during testing.
Can you find out how this is done natively? We should emulate that.
> I don't know QEMU well enough but it seems to be looking elsewhere (you added the copy to QEMU_LD_PREFIX in https://github.com/conda-forge/conda-forge-ci-setup-feedstock/commit/ae8eb4eda3d5c9fee3b4711ebe59ee33e7cd0430, so I was trying to emulate that - no pun intended).
For CUDA 11, for non-QEMU, `libcuda.so.1` is found in `/usr/local/lib` in the docker image. Since it is not there in CUDA 12, non-QEMU, I'm just wondering where it is in native builds.
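One way to check where the library would actually be resolved at test time is plain linker introspection (nothing conda-forge-specific; a sketch):

```shell
# Count libcuda entries the dynamic linker knows about on its default path;
# on a CUDA 11 docker image this picks up /usr/local/lib, on CUDA 12 it may
# find nothing at all.
found=$(ldconfig -p 2>/dev/null | grep -c libcuda || true)
echo "libcuda entries on the default linker path: $found"
```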
> @isuruf: How is this done for native builds?
>
> @h-vetinari: Presumably through the `cuda-version` run-export somehow? At least, the various feedstocks which already build natively for CUDA 12 are finding `libcuda.so` during testing.
>
> @isuruf: Can you find out how this is done natively? We should emulate that.
Looking at the testing phase for the last CI run for cupy against CUDA 12, I see:
```
cuda-cudart:              12.0.107-h59595ed_6     conda-forge
cuda-cudart_linux-64:     12.0.107-h59595ed_6     conda-forge
cuda-driver-dev:          12.0.107-h59595ed_6     conda-forge
cuda-driver-dev_linux-64: 12.0.107-h59595ed_6     conda-forge
cuda-nvrtc:               12.0.76-h59595ed_1      conda-forge
cuda-nvtx:                12.0.76-hcb278e6_0      conda-forge
cuda-version:             12.0-hffde075_2         conda-forge
cupy:                     12.1.0-py310hfc31588_0  local
```
There's a `lib/stubs/libcuda.so` in `cuda-driver-dev`, which is also explicitly added as a test dependency in the recipe, with the comment `# need the libcuda stub for import test`.
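For illustration, that pattern would look roughly like this in a recipe's `meta.yaml` (a sketch; only the package name and comment are from the cupy recipe, the surrounding layout follows the usual conda-build conventions):

```yaml
test:
  requires:
    - cuda-driver-dev   # need the libcuda stub for import test
  imports:
    - cupy
```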
Perhaps @leofang @adibbley @jakirkham can comment on how finding `libcuda.so` at test time should be handled more generally...
Will check if this works in https://github.com/conda-forge/arrow-cpp-feedstock/pull/1120. I'm also going to reactivate #236 about whether we want to run CI for the CUDA cross-compilation setup (even though we don't need the builds...); we can continue there if someone wants to change how we deal with `cuda-compat` vs. `cuda-driver-dev`.
So it turns out that the stub is not enough (see https://github.com/conda-forge/arrow-cpp-feedstock/pull/1120). I've also tried adding `cuda-driver-dev` and `cuda-compat` to the test environment explicitly, but that didn't make a difference.
Can we revert this then?
I'll happily revert this once we know how we want to deal with this. In the meantime, my thinking was: it doesn't hurt anyone and I wanted to avoid further churn.
If you want I can open a dedicated issue here. Currently I have this comment which summarizes the situation from my POV - we should solve the libcuda-at-test-time divergence in a way that doesn't require every feedstock to add spurious test dependencies.
You merged this PR despite my review to test things out and since it doesn't work, reverting is the courteous thing to do.
No problem, if that's your preference (previously & elsewhere, you had been sensitive to churn): #254
> You merged this PR despite my review to test things out and since it doesn't work
I did what you had asked (do it as elsewhere for CUDA 12), and to test it out I needed to merge it. Arrow is also the first feedstock (to my knowledge) trying to cross-compile with the CUDA 12 infrastructure, so there is nothing that can break from changes to this particular branch.
No, you didn't check CUDA 12 with arrow-cpp. I'm pretty sure, linux-64 native builds with CUDA 12 will break as well.
This PR didn't change anything about native compilation
Sure, that's why I think this is the wrong place to fix an issue you are seeing because the problem you are seeing is not limited to cross compilation.
> because the problem you are seeing is not limited to cross compilation.
That could very well be. It did not immediately appear like that to me though, as the native CUDA 12 migration has been running for a couple of weeks now, and the arch one hasn't been started yet (there are also few feedstocks cross-compiling CUDA at all, since that capability is quite recent).
My interest with this PR was purely in unblocking the cross-compiling case, without touching (and possibly disturbing) the native case.
> Perhaps @leofang @adibbley @jakirkham can comment on how finding `libcuda.so` at test time should be handled more generally...
At least in the CuPy/cuQuantum feedstocks, my finding is that we need LD_PRELOAD or dlopen to load the stub file prior to import:
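A minimal sketch of the LD_PRELOAD variant (the stub path is an assumption; `cuda-driver-dev` ships it under `lib/stubs` in the environment):

```shell
# Preload the driver stub so that later-loaded libraries can resolve
# libcuda.so.1 symbols (hypothetical path, adjust to the actual env layout):
export LD_PRELOAD="${CONDA_PREFIX:-/opt/conda}/lib/stubs/libcuda.so"

# python -c "import cupy"  # the import now no longer fails on libcuda.so.1
# (a dlopen of the stub with RTLD_GLOBAL before the import achieves the same)
```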
Otherwise, the linker would complain. This is unique to the CUDA 12 builds, where we no longer have the stubs searchable in the path; CUDA 11 builds (which, as Isuru mentioned, use the docker image) work fine.
> My interest with this PR was purely in unblocking the cross-compiling case, without touching (and possibly disturbing) the native case.
@adibbley or @jakirkham can correct me, but last time I checked, CUDA 12 for aarch64/ppc64le is not ready yet. I am not sure if the linux-64 packages are enough for cross-compiling? (I guess not, but I need to recollect the reason.)
Follow-up to #250, where I originally wanted to copy but then we decided to just not fail. However, the tests in https://github.com/conda-forge/arrow-cpp-feedstock/pull/1120 show that the stubs are still necessary in QEMU.
superseded
Seeing that the script is already copying `compat/*` for CUDA 11 (https://github.com/conda-forge/conda-forge-ci-setup-feedstock/blob/17fd53570288308428591cd8c6b04c95a8a70367/recipe/cross_compile_support.sh#L114-L115), it looks like copying the [content](https://github.com/conda-forge/cuda-compat-feedstock/blob/fda8055cff9ec301681da0bcbf0b2f4efbcf7f57/recipe/meta.yaml#L55-L64) of `cuda-compat` should be the right equivalent for the new CUDA 12 setup. The last three commits will be removed again if the CI passes.
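Roughly, the CUDA 12 equivalent of that copy could look like this (a hypothetical sketch only; all paths are assumptions and need to be checked against the actual `cuda-compat` package layout):

```shell
# Mirror the CUDA 11 `compat/*` copy: move the cuda-compat driver libraries
# into the sysroot that QEMU_LD_PREFIX points at (paths are assumptions).
src="${CONDA_PREFIX:-/opt/conda}/targets/x86_64-linux/compat"
dst="${QEMU_LD_PREFIX:-/tmp/qemu-sysroot}/usr/lib"
if [ -d "$src" ]; then
    mkdir -p "$dst"
    cp "$src"/* "$dst"/
fi
```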