Restricting gencodes used per CUDA version

conda-forge / nccl-feedstock

A conda-smithy repository for nccl.

BSD 3-Clause "New" or "Revised" License

Restricting gencodes used per CUDA version #39

Closed jakirkham closed 1 year ago

jakirkham commented 3 years ago

AIUI NCCL includes a large range of gencodes, going back to CUDA 8, and even includes the gencodes for older CUDA versions when building for newer CUDA versions. To cut down on binary size and speed up builds, we might consider using a narrower set of gencodes for each CUDA version.
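As a rough illustration of what a narrower per-version set could look like, here is a minimal Python sketch that builds `-gencode` flags from a per-CUDA-major table. The table `ARCHS_BY_CUDA_MAJOR` and the helper `gencode_flags` are hypothetical names, and the architecture lists are assumptions chosen for illustration; they are not taken from NCCL's makefiles or from this feedstock.

```python
# Hypothetical per-CUDA-major architecture lists (compute capabilities).
# These values are illustrative assumptions, NOT NCCL's actual settings.
ARCHS_BY_CUDA_MAJOR = {
    9: ["35", "50", "60", "61", "70"],
    10: ["35", "50", "60", "61", "70", "75"],
    11: ["60", "61", "70", "75", "80"],
}


def gencode_flags(cuda_major):
    """Return a space-separated string of nvcc -gencode flags."""
    archs = ARCHS_BY_CUDA_MAJOR[cuda_major]
    # One device-binary (SASS) target per listed architecture.
    flags = [f"-gencode=arch=compute_{a},code=sm_{a}" for a in archs]
    # Embed PTX for the newest architecture so newer GPUs can JIT-compile it.
    newest = archs[-1]
    flags.append(f"-gencode=arch=compute_{newest},code=compute_{newest}")
    return " ".join(flags)


if __name__ == "__main__":
    for major in sorted(ARCHS_BY_CUDA_MAJOR):
        print(major, gencode_flags(major))
```

A string like this could then be passed to the build; NCCL's README describes overriding `NVCC_GENCODE` on the `make` command line, though whether that fits this feedstock's build script is an assumption here.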

kkraus14 commented 3 years ago

Even though they named it that way, it really corresponds to hardware architectures. The way it's currently laid out in NCCL gives support for Kepler and newer architectures. This sounds reasonable to me, and we shouldn't change it.

kkraus14 commented 3 years ago

If conda had a virtual package that exposed the GPU architecture, we could use it to select an architecture-specific package. That could be an option, but it would make conda environments less portable (though the same thing already happens on the CPU side with AVX / SSE / etc.).
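To make the portability trade-off concrete, here is a small Python sketch of the check an architecture-aware selection would need to do. It assumes the usual CUDA compatibility rules: a device binary built for `sm_XY` runs on GPUs with the same major version and an equal or higher minor version, and embedded PTX for `compute_XY` can be JIT-compiled on any GPU with an equal or higher compute capability. The function name `can_run` and the tuple representation are hypothetical; no such conda virtual package is assumed to exist.

```python
def can_run(device_cc, sass_archs, ptx_archs):
    """Check whether a build can run on a GPU.

    device_cc:  (major, minor) compute capability of the GPU.
    sass_archs: list of (major, minor) targets built as device binaries.
    ptx_archs:  list of (major, minor) targets embedded as PTX.
    """
    dev_major, dev_minor = device_cc
    # Device binaries are compatible within the same major version only.
    for major, minor in sass_archs:
        if major == dev_major and minor <= dev_minor:
            return True
    # PTX can be JIT-compiled for equal or newer compute capabilities.
    return any((major, minor) <= device_cc for major, minor in ptx_archs)


if __name__ == "__main__":
    # A build with only an sm_70 binary runs on a 7.5 GPU but not on an 8.0 GPU.
    print(can_run((7, 5), [(7, 0)], []))
    print(can_run((8, 0), [(7, 0)], []))
```

Under these rules, a package restricted to a single architecture without PTX stops working when the environment is moved to a GPU with a different major version, which is the portability cost mentioned above.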

robertmaynard commented 3 years ago

The usage of CUDA9_PTX and CUDA9_GENCODE for CUDA_MAJOR >= 11 is interesting. Might be better to have CUDA9_PTX map to -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75
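For readers less familiar with the flag syntax: in nvcc's `-gencode` option, `code=sm_XX` requests a device binary (SASS) and `code=compute_XX` requests embedded PTX. The Python sketch below classifies flags of the simple `-gencode=arch=compute_XX,code=..._XX` form used in this thread; the helper name `classify` is hypothetical, and other spellings that nvcc accepts (for example a bracketed list of codes) are deliberately not handled.

```python
import re

# Matches only the simple single-target form used in this discussion.
GENCODE_RE = re.compile(r"-gencode=arch=compute_(\d+),code=(sm|compute)_(\d+)")


def classify(flags):
    """Return (arch, code, kind) tuples, where kind is 'SASS' or 'PTX'."""
    result = []
    for virt, kind, real in GENCODE_RE.findall(flags):
        label = "SASS" if kind == "sm" else "PTX"
        result.append((f"compute_{virt}", f"{kind}_{real}", label))
    return result


if __name__ == "__main__":
    suggested = ("-gencode=arch=compute_70,code=sm_70 "
                 "-gencode=arch=compute_75,code=sm_75")
    for entry in classify(suggested):
        print(entry)
```

Run on the flags quoted above, this reports both entries as SASS targets, so a variable carrying them would hold device binaries for Volta and Turing rather than PTX.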

leofang commented 2 years ago

I am doing this (restricting gencodes) for aarch64 to avoid a timeout (#60), and I'd like to revisit this discussion.

leofang commented 1 year ago

Now that we've switched to cross-compiling, the build time is significantly reduced, so let me make a judgment call and close this issue. We can revisit as needed.