rapidsai / build-planning

Tracking for RAPIDS-wide build tasks
https://github.com/rapidsai
0 stars 3 forks source link

Remove uses of 'pip --find-links' in CI #69

Closed jameslamb closed 2 months ago

jameslamb commented 3 months ago

Description

Most RAPIDS projects have separate build jobs (which produce artifacts like wheels and conda packages) and test jobs (which install those artifacts and then run tests using the installed versions).

Some of those test jobs for wheels do something like this (pseudocode) to install those build-job artifacts:

download-from-s3 project.whl > ./dist/
pip install "project" --find-links ./dist

As we discovered in https://github.com/rapidsai/raft/pull/2348, that use of --find-links can result in some types of failures being missed in CI.

For example, if that wheel in ./dist has impossible-to-resolve dependencies, pip install will happily disregard it and backtrack to older wheels on https://pypi.anaconda.org/rapidsai-wheels-nightly/simple/.

From "Finding Packages' in the pip docs (link)

pip looks for packages in a number of places: on PyPI ..., in the local filesystem, and in any additional repositories specified via --find-links or --index-url. There is no ordering in the locations that are searched. ...they are all checked, and the “best” match for the requirements ... is selected.

Wherever possible, use of --find-links should be avoided across RAPIDS CI jobs, in favor of asking pip to directly install the just-downloaded wheel(s) like this:

pip install ./dist/*.whl

Benefits of this work

Acceptance Criteria

Approach

Use GitHub search to look for uses of the following across RAPIDS:

Whenever possible, replace those by pointing pip install directly at the files to consider, e.g. like this:

# globbing is enough if not using [extras]
pip install ./dist/*

# echo the full name for use with [extras]
pip install "$(echo ./dist/*.whl)[test]"

Build jobs might require a different approach from test jobs

You'll find some uses like this one in rmm, where a wheel of some dependency is downloaded and then needed at build time.

CPP_WHEELHOUSE=$(
    RAPIDS_PY_WHEEL_NAME="rmm_${RAPIDS_PY_CUDA_SUFFIX}" \
    rapids-download-wheels-from-s3 cpp /tmp/librmm_dist
)

pip wheel \
    -w dist \
    --no-deps \
    --find-links "${CPP_WHEELHOUSE}" \
    .

(rmm - ci/build_wheel.sh)

That use of --find-links is there because RAPIDS CI build jobs tend to use build isolation. Test out whether it's possible to additionally pass specific wheel files to pip wheel --constraint to force pip wheel to use them.

python -m pip wheel \
    -w dist \
    -vvv \
    --no-deps \
    --disable-pip-version-check \
    --find-links "${CPP_WHEELHOUSE}" \
    --constraint "${CPP_WHEELHOUSE}/*.whl" \
    .

Something like that might work to force pip to use that file, even in the isolated build environment.

But if not, it's fine to leave the build jobs alone, in exchange for the benefits we get from using build isolation.

Notes

Relevant discussions:

vyasr commented 3 months ago

This is fundamentally the pip analog to #14. I moved us towards --find-links in some recent PRs because I thought it would better align our pip and conda solves, but you're right that it leads to pip installs having the same issues as the conda solve issues. It is quite tricky to come up with a good solution here though. Maybe you could do a [pip|conda] install --no-deps of the downloaded artifacts followed by an installation of the package list produced by dfg? The order here will matter, but I don't know which order will produce a cleaner result; in one case you might have to force upgrading, while in the other you'd have to prevent it.

As far as build isolation, it's worth noting that we might have to do away with that anyway in light of C++ wheels and the sccache issues that come with using this in ephemeral isolated build environments.

jameslamb commented 3 months ago

Thanks for that! I hadn't considered the connection between this idea and #14.

It is quite tricky to come up with a good solution here though

At a minimum, for wheel test jobs I think the pip install dist/*.whl pattern is unambiguously stricter and therefore preferable to the pip install --find-links ./dist pattern.

But yeah, the case for build jobs (at least as long as we want build isolation) is tough.

vyasr commented 3 months ago

Well like I said we might have to get rid of build isolation so we can figure that part out when we get done with C++ wheels.

jameslamb commented 3 months ago

Was experimenting with approaches for this today, and I think I found one that'll give us a strong guarantee of correctness for build dependencies even while using build isolation.

It appears that --constraint passed to pip wheel doesn't affect the creation of the build environment... but setting environment variable PIP_CONSTRAINT does.

So, given some wheel(s) downloaded like this...

RAPIDS_PY_CUDA_SUFFIX="$(rapids-wheel-ctk-name-gen ${RAPIDS_CUDA_VERSION})"
CPP_WHEELHOUSE=$(RAPIDS_PY_WHEEL_NAME="rmm_${RAPIDS_PY_CUDA_SUFFIX}" rapids-download-wheels-from-s3 cpp /tmp/librmm_dist)

... instead of this (the current pattern used in rmm, just as an example)...

python -m pip wheel \
    --no-deps \
    -w dist \
    --find-links "${CPP_WHEELHOUSE}" \
    .

... I think we could force pip to use exactly the wheel that was just downloaded like this:

echo "librmm @ file://$(echo ${CPP_WHEELHOUSE}/librmm_*.whl)" > ./constraints.txt

PIP_CONSTRAINT="${PWD}/constraints.txt" \
python -m pip wheel \
    --no-deps \
    -w dist \
    .

That worked for me in some local experiments using simple pure-Python projects. I think I could test it with rmm by publishing a librmm wheel (CI artifact only) with outright-broken runtime dependencies. If we see build of rmm wheels fail, we'll know that it tried to use exactly that broken librmm. If they succeed, then that'd probably be because pip wheel ignored the broken local wheel and fell back to a working one (the thing we're trying to avoid).

I'll try that some time soon.

jameslamb commented 3 months ago

This worked! Using the PIP_CONSTRAINT environment variable appears to be an effective way to force pip to install a particular wheel into the isolated build environment.

See "How I tested this" here: https://github.com/rapidsai/rmm/pull/1586

I've updated the tasklist at the top of this issue with the remaining places across RAPIDS where PIP_FIND_LINKS or pip install --find-links are used. I'll wait until some of yall have a chance to review this, but if there are no objections then I'll roll that out to the other repos and I think we should use that pattern to help with testing during the rest of the C++ wheels development (#33).

cc @vyasr @bdice @msarahan @KyleFromNVIDIA

vyasr commented 2 months ago

This behavior makes sense. I'm pretty sure you'll observe the same behavior for all of the CLI flags vs environment variables (e.g. --find-links or --extra-index-url) where only the environment variable affects pip's behavior when constructing the build environment because the command line arguments only affect the runtime dependency installation.

Thanks for doing this research! I'm fine moving forward with this approach, although I'll reiterate that it might be a short-lived change if we end up having to remove build isolation anyway when we migrate to C++ wheels.

jameslamb commented 2 months ago

the command line arguments only affect the runtime dependency installation

Yeah I think you're right. This is the behavior I observed but it wasn't necessarily what I expected. Either way, the environment variable approach seems to work well and I think it'll continue to for as long as we're using build isolation.

I'm fine moving forward with this approach, although I'll reiterate that it might be a short-lived change if we end up having to remove build isolation anyway

Ok thanks! Since it's not too much effort to implement this and since only a few of the projects currently are using --find-links for their build environments, I'm going to implement this for those projects. I think the low amount of effort is worth it in exchange for the extra strictness in CI, for as long as we continue to use build isolation.

vyasr commented 2 months ago

This is the behavior I observed but it wasn't necessarily what I expected.

For sure, I was surprised when I first found this too.

jameslamb commented 2 months ago

I believe this is complete, and can be closed.

I have open PRs up in cugraph-gnn and cugraph-pg (see task list), but those projects don't have CI set up right now so in my opinion we don't need to hold this issue open waiting for those to be merged.

I've checked the rapidsai org and don't see any other uses:

I've also reviewed the open C++ wheels PRs (https://github.com/rapidsai/build-planning/issues/33) and left comments on any that are using --find-links / PIP_FIND_LINKS encouraging the use of this constraints-based pattern there:

jakirkham commented 2 months ago

Thanks James! 🙏