Closed · regro-cf-autotick-bot closed this 3 years ago
Hi! This is the friendly automated conda-forge-linting service.
I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.
This doesn't seem right. Will take a look tonight. Also, we'll need to do a manual migration for the rc branch.
@conda-forge-admin, please rerender
OK most of the build matrix is restored, but CUDA 9.2 is still missing, likely because after https://github.com/conda-forge/nccl-feedstock/pull/32 we no longer build NCCL for CUDA 9.2.
@jakirkham Is that an oversight, or because NCCL 2.8.x no longer supports CUDA 9.2? If it's still supported, then for all CUDA libraries (cuDNN, NCCL, cuTENSOR, etc.) I expect to see builds for all CUDA versions, starting from 9.2. Downstream libraries are free to cut the build matrix (based on the migrator), but "infrastructure" libraries should stay available for as long as possible, just like cudatoolkit. In this case I can send a PR to nccl-feedstock to spin up 9.2 again.
Is that an oversight, or because NCCL 2.8.x no longer supports CUDA 9.2?
OK, at least the official releases (https://developer.nvidia.com/nccl/nccl-legacy-downloads) dropped 9.2 quite some time ago. Here is the last supported version list for older CUDA versions (which are also set in CuPy's CI):
I will try pinning to these version pairs.
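As a rough illustration of what "pinning to these version pairs" could look like, here is a sketch of a recipe-local `conda_build_config.yaml` that zips an NCCL version with each CUDA version via `zip_keys`. The version pairs below are placeholders, not the actual support matrix (that list is linked above):

```yaml
# Hypothetical sketch only: zip an NCCL series with each CUDA version so that
# older CUDA builds keep the last NCCL release that supported them.
# The pairs below are PLACEHOLDERS -- consult the NCCL legacy-downloads page
# for the real support matrix.
cuda_compiler_version:
  - "9.2"
  - "10.0"
  - "10.1"
  - "10.2"
nccl:
  - "2.4"   # placeholder: last series for CUDA 9.2
  - "2.5"   # placeholder
  - "2.6"   # placeholder
  - "2.8"   # placeholder: current series
zip_keys:
  - - cuda_compiler_version
    - nccl
```

With `zip_keys`, conda-smithy generates one CI job per zipped pair rather than the full cross product of the two lists.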
...Now that I think about it carefully, it's odd that only 9.2 was cut. The latest NCCL doesn't build for 10.0/10.1 either, yet those builds are here.
It seems to have been produced
Yeah, it's out before the CUDA 11.1/11.2 migration that cut the build matrix.
Yep we lucked out in terms of PR ordering
Anyway, it seems they still include the CUDA 9 gencodes, so I don't think they dropped it (unless that is an oversight), but they may have stopped building binaries for it.
I am debugging the CBC locally and noticed something fishy. How did we generate the combination of cos6 + CUDA 11.0 here? It's invalid.
I can't make sense of what's wrong. We see the following symptoms:
These are unrelated to this NCCL migration PR. I tested a simple rerender on the current master and it happens there too. Apparently, after the local CBC is applied, some other migrators follow and mess things up, but I can't tell which one causes the trouble. Any advice @conda-forge/core?
OK I figured it out. Isuru's advice applies once again here: DO NOT USE (CUDA) MIGRATORS if managing the CBC by hand. In this case, removing cuda110.yaml restores a sane state. Will fix it shortly.
@leofang one note in that the nccl package doesn't have a cbc file so it's currently only being built for the CUDA versions in the global pinning. We may need to add the versions that are needed by CuPy and other packages.
one note in that the nccl package doesn't have a cbc file so it's currently only being built for the CUDA versions in the global pinning. We may need to add the versions that are needed by CuPy and other packages.
Thanks, Keith! I think John found that we have the version needed for CUDA 9.2 (https://github.com/conda-forge/cupy-feedstock/pull/102#issuecomment-786408949). Let's see if the CI is happy with it or not. 🙂
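Keith's point about the missing cbc file could, in principle, be addressed by adding a recipe-local `conda_build_config.yaml` to nccl-feedstock that extends the CUDA list beyond the global pinning. A hedged sketch (illustrative only — the real file must also stay consistent with the docker images and zipped keys in conda-forge's global pinning):

```yaml
# Illustrative only: a recipe/conda_build_config.yaml for nccl-feedstock that
# re-adds CUDA versions dropped from the global pinning. On Linux,
# cuda_compiler_version is zipped with the docker image in the global pinning,
# so those keys would need to be kept consistent as well.
cuda_compiler_version:
  - "9.2"    # re-added for downstream consumers such as CuPy
  - "10.0"
  - "10.1"
  - "10.2"
  - "11.0"
  - "11.1"
  - "11.2"
```

Without such a file, the feedstock only builds for whatever CUDA versions the global pinning currently lists.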
That was an older build number which suffices for now, but once a new NCCL version is released and migrators are issued it will cause issues.
If you look at the _2 builds here: https://anaconda.org/conda-forge/nccl/files?version=2.8.4.1 you'll see there are only 11.2, 11.1, 11.0, and 10.2 builds.
We may need to add the versions that are needed by CuPy and other packages.
That was an older build number which suffices for now, but once a new NCCL version is released and migrators are issued it will cause issues.
Yeah I see what you're saying. I think we have two solutions when this happens:
If the CI is happy I can handle No. 2 later; otherwise, I will send PRs to the nccl/cudnn feedstocks to fix them first. (cuTENSOR is fine because it simply doesn't support older CUDA versions.)
I believe cuDNN dropped support for older than CUDA 10.2 as well.
...right, my head is scrambled now 🤯 So in that case we should manually pin an older cudnn in the recipe here.
So in that case we should manually pin an older cudnn in the recipe here.
I think for cudnn it's alright, because we zip it with cuda versions and pin it loosely.
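The "zip it with CUDA versions and pin it loosely" arrangement can be sketched as follows. The shape mirrors conda-forge's global pinning, but treat the exact values as assumptions:

```yaml
# Sketch of the idea described above: cudnn is zipped with the CUDA version
# and pinned loosely (max_pin: x), so a recipe depending on cudnn picks up
# whichever cudnn series matches its CUDA variant and remains compatible with
# newer minor/patch releases. The version values are illustrative only.
cudnn:
  - "7"      # paired with an older CUDA version (illustrative)
  - "8"      # paired with a newer CUDA version (illustrative)
cuda_compiler_version:
  - "10.0"
  - "11.0"
zip_keys:
  - - cuda_compiler_version
    - cudnn
pin_run_as_build:
  cudnn:
    max_pin: x
```

The loose `max_pin: x` run-export means downstream packages don't need a manual cudnn pin per CUDA variant; the zip keeps each CI job internally consistent.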
Looks like we'll need a migrator for cuDNN 8.1 + CUDA 10.2, but this can be done separately.
Merging this to unblock the migrator and version updates. Will handle any necessary changes in other PRs.
Thanks for working on this Leo and assisting Keith! 😄
This PR has been triggered in an effort to update nccl_2_8_4_1.
Notes and instructions for merging this PR:
Please note that if you close this PR we presume that the feedstock has been rebuilt, so if you are going to perform the rebuild yourself don't close this PR until your rebuild has been merged.
This package has the following downstream children:
And potentially more.
If this PR was opened in error or needs to be updated please add the bot-rerun label to this PR. The bot will close this PR and schedule another one. If you do not have permissions to add this label, you can use the phrase @conda-forge-admin, please rerun bot in a PR comment to have the conda-forge-admin add it for you.

This PR was created by the regro-cf-autotick-bot. The regro-cf-autotick-bot is a service to automatically track the dependency graph, migrate packages, and propose package version updates for conda-forge. If you would like a local version of this bot, you might consider using rever. Rever is a tool for automating software releases and forms the backbone of the bot's conda-forge PRing capability. Rever is both conda (conda install -c conda-forge rever) and pip (pip install re-ver) installable. Finally, feel free to drop us a line if there are any issues! This PR was generated by https://github.com/regro/autotick-bot/actions/runs/601037069, please use this URL for debugging.