rapidsai / dask-build-environment

Build environments for various dask related projects on gpuCI

Unpin `numpy` in Dask environment, leave `dask` pinned in Dask-SQL environment #88

Closed by charlesbluca 8 months ago

charlesbluca commented 8 months ago

Unblocks the remaining image build failures after https://github.com/dask-contrib/dask-sql/pull/1314 was merged.

jakirkham commented 8 months ago

Yeah, we needed to bump to NumPy 1.23 as part of supporting Python 3.11 (https://github.com/rapidsai/build-planning/issues/3#issuecomment-1967952280)

I should add that we recently added an upper bound to NumPy in preparation for NumPy 2 (https://github.com/rapidsai/build-planning/issues/29). We plan to look into relaxing this, but need time to test (and potentially fix issues that come up), so the pin gives us breathing room
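To make the bounds discussed above concrete, here is a minimal sketch of a version check. The `>=1.23,<2` range is an assumption pieced together from this comment (a 1.23 floor and an upper bound ahead of NumPy 2), not a spec copied from RAPIDS, and `parse` and `within_bounds` are hypothetical helper names.

```python
def parse(version):
    """Turn '1.23.5' into (1, 23, 5); only handles plain numeric releases."""
    return tuple(int(part) for part in version.split("."))


def within_bounds(version, lower="1.23", upper="2"):
    """True if lower <= version < upper (assumed bounds, see note above)."""
    return parse(lower) <= parse(version) < parse(upper)


print(within_bounds("1.22.4"))  # False: below the assumed 1.23 floor
print(within_bounds("1.26.4"))  # True
print(within_bounds("2.0.0"))   # False: excluded by the assumed upper bound
```

A real check would likely rely on a proper version parser, since this one ignores pre-release and build suffixes.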

jakirkham commented 8 months ago

Also looking upstream it appears Dask uses NumPy 1.22 with Python 3.9 tests. Idk if that is an issue for us, so mentioning just in case

Other environments seem to use NumPy 1.23 or newer (some are unconstrained)

charlesbluca commented 8 months ago

> Also looking upstream it appears Dask uses NumPy 1.22 with Python 3.9 tests. Idk if that is an issue for us, so mentioning just in case

Yup, that's essentially what this PR aims to work around by manually unpinning in the Dockerfile. Typically these unpins target the 3.9 (or lowest minor Python version) environment, where some of the core dependencies are often pinned in ways that are incompatible with RAPIDS.

Checking locally, this is sufficient to unblock the Dask builds, but I notice that there are still some issues around the dask pinning in dask-sql's builds:

https://gpuci.gpuopenanalytics.com/job/dask/job/dask-build-environment/job/branch/job/dask-build-env-main/BUILD_NAME=dask_sql,CUDA_VER=11.8.0,LINUX_VER=ubuntu20.04,PYTHON_VER=3.10,RAPIDS_VER=24.02/1058/console

With rapids-dask-dependency and dask-sql both pinned to 2024.1.1 for now, we shouldn't need to toy around with dask's pinning at all, so I can make that change in this PR and we can wrap up all the build issues at once
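The manual unpinning described in this comment can be illustrated with a small sketch. It is not the change made in the PR, which edits the Dockerfile directly; `unpin` is a hypothetical helper, and the dependency strings are made-up examples in conda's `name=version` style.

```python
import re

# Matches the package name at the start of a dependency string.
_NAME = re.compile(r"^\s*([A-Za-z0-9_.\-]+)")


def unpin(dependencies, packages):
    """Return a copy of `dependencies` with constraints dropped for `packages`."""
    targets = {p.lower() for p in packages}
    result = []
    for dep in dependencies:
        match = _NAME.match(dep)
        name = match.group(1).lower() if match else None
        # Keep only the bare name for targeted packages; leave others untouched.
        result.append(match.group(1) if name in targets else dep)
    return result


# Illustrative input only: numpy is unpinned, dask keeps its pin.
deps = ["python=3.9", "numpy=1.22", "pandas>=1.3", "dask=2024.1.1"]
print(unpin(deps, ["numpy"]))
# ['python=3.9', 'numpy', 'pandas>=1.3', 'dask=2024.1.1']
```

Once the explicit pin is gone, the solver is free to pick whichever NumPy version satisfies the remaining constraints in the environment.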

jakirkham commented 8 months ago

Thanks Charles! 🙏

Makes sense

Do we want a NumPy 2 upper bound here?

charlesbluca commented 8 months ago

> Do we want a NumPy 2 upper bound here?

I don't think we need it to unblock: right now, just loosening the explicit pinning means we should fall back on whatever upper bound gets set by RAPIDS. That said, it might be worth adding pinnings to the environments here on some core packages like numpy, pandas, etc. that we can align with RAPIDS pinnings as they change. That would make it immediately obvious when things have stopped working (versus right now, when we're only alerted when conda is unable to solve some environment).

I expect to have more time to think about this post-GTC, but I can file an issue here around this work for now to follow up on in the coming weeks
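One way the alignment idea above might look in practice is a check that compares an environment's pins against a reference set and reports differences before conda ever fails to solve. This is only a sketch: `split_pin` and `misaligned` are hypothetical helpers, and the reference values are invented for illustration rather than taken from RAPIDS.

```python
def split_pin(dep):
    """Split 'name=version' or 'name>=version' into (name, constraint)."""
    for i, ch in enumerate(dep):
        if ch in "=<>!~ ":
            return dep[:i].strip().lower(), dep[i:].strip()
    return dep.strip().lower(), ""


def misaligned(env_deps, reference_pins):
    """List (name, found, expected) for packages that differ from the reference."""
    report = []
    for dep in env_deps:
        name, constraint = split_pin(dep)
        expected = reference_pins.get(name)
        if expected is not None and constraint != expected:
            report.append((name, constraint or "<unpinned>", expected))
    return report


# Made-up reference pins and environment, for illustration only.
reference = {"numpy": ">=1.23,<2", "pandas": ">=1.3"}
env = ["numpy=1.22", "pandas>=1.3", "dask=2024.1.1"]
for name, found, expected in misaligned(env, reference):
    print(f"{name}: found {found}, expected {expected}")
# numpy: found =1.22, expected >=1.23,<2
```

The comparison is purely textual, so equivalent constraints written differently would be flagged; that may be acceptable if the goal is simply to notice drift early.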

jakirkham commented 8 months ago

Ok sure, that makes sense. Happy to go with what we have here

Let's raise an issue to follow up on RAPIDS alignment pins for numpy, pandas, etc.

charlesbluca commented 8 months ago

Cool, did a quick write up of my initial thoughts here https://github.com/rapidsai/dask-build-environment/issues/89

jakirkham commented 8 months ago

Looks good. Thanks Charles! 🙏

Ready to merge?

charlesbluca commented 8 months ago

Yup! Can follow up here once the builds are rerun and succeed, thanks for the help @jakirkham!

jakirkham commented 8 months ago

Thanks Charles! 🙏

charlesbluca commented 8 months ago

Looks like that unblocked builds - thanks! 🙏🏼