rapidsai / raft

RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.
https://docs.rapids.ai/api/raft/stable/
Apache License 2.0

ensure raft-dask wheel tests install pylibraft wheel from the same CI run, fix wheel dependencies #2349

Closed jameslamb closed 1 month ago

jameslamb commented 1 month ago

Description

Fixes #2348

#2331 introduced rapids-build-backend (https://github.com/rapidsai/rapids-build-backend) as the build backend for pylibraft, raft-dask, and raft-ann-bench.

That library automatically modifies a wheel's dependencies based on the target CUDA version. Unfortunately, we missed a few cases in #2331, and as a result the last few days of nightly raft-dask wheels declared dependencies that could not be satisfied (for example, pylibraft==24.8.*, which does not exist anywhere).

This wasn't caught in raft's CI. It only surfaced in downstream CI, such as cuml's and cugraph's, with errors like this:

ERROR: ResolutionImpossible:

The conflict is caused by:
    raft-dask-cu12 24.8.0a20 depends on pylibraft==24.8.* and >=0.0.0a0
    raft-dask-cu12 24.8.0a19 depends on pylibraft==24.8.* and >=0.0.0a0

(example cugraph build)

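This PR ensures the raft-dask wheel tests install the pylibraft wheel from the same CI run, and fixes the wheel dependencies.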

Notes for Reviewers

What was the root cause of CI missing this, and how does this PR fix it?

The raft-dask test CI jobs use this pattern to install the raft-dask wheel built earlier in the CI pipeline:

pip install "raft_dask-cu12[test]>=0.0.0a0" --find-links dist/

As described in the pip docs, --find-links just adds a directory to the list of places pip searches for packages. It does not force pip to use the files in that directory. Because the wheel there had unsatisfiable constraints (e.g. pylibraft==24.8.* does not exist anywhere), pip install silently disregarded that locally-downloaded raft_dask wheel and backtracked (i.e. downloaded older and older wheels from https://pypi.anaconda.org/rapidsai-wheels-nightly/simple/) until it found one that wasn't problematic.

This PR ensures that won't happen by telling pip to install exactly that locally-downloaded file, like this:

pip install "$(echo ./dist/raft_dask_cu12*.whl)[test]"

If that file is uninstallable, pip install fails and you find out via a CI failure.
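One caveat with the `$(echo ...)` form above: if the glob matches no file, or more than one, pip receives a string that is not a single wheel path. Below is a minimal sketch of a stricter variant, not part of this PR. The helper name `resolve_single_wheel` and its two-argument interface are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper (not from this PR): print the one wheel in directory $1
# whose filename starts with prefix $2, or fail if zero or several match.
resolve_single_wheel() {
    rsw_count=0
    rsw_match=""
    # The directory and prefix are quoted; only the trailing *.whl is globbed.
    for rsw_f in "$1"/"$2"*.whl; do
        # With no match the pattern stays literal, so check the file exists.
        if [ -e "$rsw_f" ]; then
            rsw_count=$((rsw_count + 1))
            rsw_match="$rsw_f"
        fi
    done
    if [ "$rsw_count" -ne 1 ]; then
        echo "expected exactly 1 wheel matching $1/$2*.whl, found $rsw_count" >&2
        return 1
    fi
    printf '%s\n' "$rsw_match"
}

# Possible usage in a test script (assumes the script runs with `set -e`):
#   wheel="$(resolve_single_wheel ./dist raft_dask_cu12)"
#   pip install "${wheel}[test]"
```

Assigning the result to a variable first, instead of nesting the call inside the `pip install` argument, lets a failed lookup stop the script before pip runs.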

How I tested this

Initially pushed a commit with just the changes to the test script. Saw the wheel-tests-raft-dask CI jobs fail in the expected way, instead of silently falling back to an older wheel and passing 🎉 .

ERROR: Could not find a version that satisfies the requirement ucx-py-cu12==0.39.* (from raft-dask-cu12[test]) (from versions: 0.32.0, 0.33.0, 0.34.0, 0.35.0, 0.36.0, 0.37.0, 0.38.0a4, 0.38.0a5, 0.38.0a6, 0.39.0a0)
ERROR: No matching distribution found for ucx-py-cu12==0.39.*

(build link)
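As a complementary guard against the silent fallback described above, a test script could also compare the installed version with the version encoded in the wheel's filename. The sketch below is an assumption-laden illustration, not something this PR adds. Both function names are hypothetical, and the check assumes the version string in the filename is identical to the one in the package metadata. It relies on the standard wheel naming convention, in which the version is the second dash-separated field.

```shell
# Hypothetical helpers (not from this PR).

# Print the version field of a wheel filename, e.g.
# ./dist/raft_dask_cu12-24.8.0a20-<tags>.whl yields 24.8.0a20.
wheel_version() {
    wv_name="${1##*/}"        # strip any leading directory
    wv_rest="${wv_name#*-}"   # drop "{distribution}-"
    printf '%s\n' "${wv_rest%%-*}"  # keep everything before the next "-"
}

# Fail unless version string $2 equals the version of wheel $1.
assert_version_matches_wheel() {
    avmw_expected="$(wheel_version "$1")"
    if [ "$2" != "$avmw_expected" ]; then
        echo "installed version '$2' does not match wheel version '$avmw_expected'" >&2
        return 1
    fi
}

# Possible usage after `pip install` (pip show prints a "Version:" line):
#   installed="$(python -m pip show raft-dask-cu12 | sed -n 's/^Version: //p')"
#   assert_version_matches_wheel "$wheel" "$installed"
```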

dantegd commented 1 month ago

@jameslamb @bdice This PR should unblock CI for downstream repos, and it's ready to merge, right?

bdice commented 1 month ago

@dantegd @jameslamb Yes, I think so. I’ll go ahead and merge so we can unblock downstream work.

bdice commented 1 month ago

/merge

jameslamb commented 1 month ago

Yes it should, thanks for merging it @bdice

nv-rliu commented 1 month ago

Thanks everyone! 👍