rapidsai / dask-build-environment

Build environments for various dask related projects on gpuCI

Consider adding pinnings of core RAPIDS dependencies to environment files #89

Open charlesbluca opened 7 months ago

charlesbluca commented 7 months ago

There is often a lag between RAPIDS core dependencies getting updated and build failures cropping up in this repo, typically because conda lets in older, less restrictive RAPIDS nightlies that hide the underlying build issues.

For example, consider https://github.com/rapidsai/ucx-py/pull/1028, which bumps the latest UCX-Py nightlies to require NumPy 1.23+. On paper, this should immediately cause issues for the Dask 3.9 image builds, which pull in an environment that specifies numpy=1.22. In practice, these build errors don't crop up immediately: given these conflicting pinnings, conda will just go for the newest available nightly that allows 1.22, giving us an environment that is "wrong" (it has an outdated version of UCX-Py) but doesn't raise any flags to alert us of this. Once the older, less restrictive nightlies are removed from rapidsai-nightly, the build errors do arise, but by then we may have already sunk some time into unexpected test behavior caused by testing against this "wrong" environment.
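To make the failure mode concrete, here is a toy model of the behavior described above. It is only a sketch: real conda solving is far more involved, and the build labels and lower bounds below are hypothetical stand-ins, not actual rapidsai-nightly packages.

```python
# Toy model only; this is not how conda's solver is implemented.
# Hypothetical nightlies, ordered oldest to newest, each with the lowest
# NumPy (major, minor) version it accepts.
NIGHTLIES = [
    {"build": "nightly-old", "numpy_min": (1, 21)},
    {"build": "nightly-new", "numpy_min": (1, 23)},  # after the NumPy bump
]


def pick_nightly(nightlies, pinned_numpy):
    """Return the newest build whose NumPy lower bound allows the pin, else None."""
    compatible = [n for n in nightlies if n["numpy_min"] <= pinned_numpy]
    return compatible[-1]["build"] if compatible else None


# While the old nightly is still in the channel, the numpy=1.22 pin resolves
# quietly to the outdated build.
stale_choice = pick_nightly(NIGHTLIES, (1, 22))

# Once the old nightly is removed, nothing satisfies the pin and the build fails.
after_cleanup = pick_nightly(NIGHTLIES[1:], (1, 22))
```

In this model the conflict only becomes visible in the second call, which mirrors the delay between the dependency bump and the build errors.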

I think it would be useful to define which core RAPIDS dependencies are pulled into each Dask GPU CI environment, and to add these packages with their RAPIDS pinnings to each GPU CI environment file. Then, by keeping these pinnings aligned with RAPIDS, we can ensure that more restrictive pinnings are immediately enforced by conda in the builds, rather than being conditional on the status of rapidsai-nightly.

Really the biggest challenge here is figuring out how to implement that alignment. I definitely think this is better handled automatically than through manual updates, maybe through a metapackage or some centralized conda environment spec that could be conda-merge'd with those defined in this repo and per Dask project?
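As a rough illustration of the centralized-spec idea, the sketch below appends a shared list of pins to an environment's dependency list. It is not conda-merge itself, and the function name, package specs, and version numbers are all hypothetical. Existing pins are deliberately left in place rather than overridden, on the assumption that the solver should then see both constraints and fail right away when they conflict.

```python
def add_central_pins(dependencies, central_pins):
    """Return `dependencies` plus any central pin not already listed verbatim.

    Both arguments are lists of conda-style spec strings. Nothing is
    overridden, so a conflicting pair stays visible to the solver.
    """
    merged = list(dependencies)
    for spec in central_pins:
        if spec not in merged:
            merged.append(spec)
    return merged


# Hypothetical environment file contents and hypothetical central RAPIDS pins.
env_deps = ["python=3.9", "numpy=1.22", "dask"]
central = ["ucx-py=0.33", "numpy>=1.23"]

merged_deps = add_central_pins(env_deps, central)
```

Here `merged_deps` carries both `numpy=1.22` and `numpy>=1.23`, so the mismatch would no longer depend on which nightlies happen to be left in the channel.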

pentschev commented 7 months ago

I would pick the list of packages from dependencies.yaml in the Dask-CUDA/UCX-Py (UCXX in the future) repos; those and the integrations repo should be the ground truth for versions and dependencies. I certainly agree an automated approach is the right way, but I don't have any additional ideas beyond your proposal above.