ComputeCanada / wheels_builder

add code to set the lower-than version of numpy required #113

Closed mboisson closed 2 months ago

mboisson commented 3 months ago

I would like to get more eyes on this code, to make sure I have not forgotten any cases.

Here is what it gives for the tf_models_official wheel:

$ ./manipulate_wheels.py --print_req --wheel tf_models_official-2.15.0+computecanada-py2.py3-none-any.whl | grep numpy
numpy >=1.20

$ ./manipulate_wheels.py --set_lt_numpy 2.0 --inplace --force --wheel tf_models_official-2.15.0+computecanada-py2.py3-none-any.whl
Since --force was used, overwriting existing wheel
New wheel created tf_models_official-2.15.0+computecanada-py2.py3-none-any.whl

$ ./manipulate_wheels.py --print_req --wheel tf_models_official-2.15.0+computecanada-py2.py3-none-any.whl | grep numpy
numpy (>=1.20,<2.0)
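
For readers unfamiliar with where these requirement lines come from: a wheel is a zip archive, and its dependencies are stored as Requires-Dist headers in the .dist-info/METADATA file. The following is a minimal sketch of listing the numpy entries, using only the standard library. The function name numpy_requirements is hypothetical, and this is not necessarily how manipulate_wheels.py implements --print_req.

```python
import re
import zipfile
from email.parser import Parser


def numpy_requirements(wheel_file):
    """Return the Requires-Dist entries of a wheel that name numpy.

    wheel_file may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(wheel_file) as whl:
        # A wheel keeps its metadata in <name>-<version>.dist-info/METADATA.
        meta_name = next(
            n for n in whl.namelist() if n.endswith(".dist-info/METADATA")
        )
        # METADATA uses an email-header style layout, so the email parser reads it.
        metadata = Parser().parsestr(whl.read(meta_name).decode("utf-8"))
    requirements = metadata.get_all("Requires-Dist") or []
    # Match the project name exactly so that e.g. numpydoc is not picked up.
    pattern = re.compile(r"numpy(?![A-Za-z0-9._-])", re.IGNORECASE)
    return [req for req in requirements if pattern.match(req)]
```

Unlike piping the output through grep numpy, the exact-name match leaves out packages such as numpydoc.
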
mboisson commented 3 months ago

The case where a ; is present (an environment marker, used for example to restrict the Python version or to tie the requirement to an extra) is not handled correctly:

Noodles-0.3.3+computecanada-py3-none-any.whl
numpy ; extra == 'develop'
numpy ; extra == 'numpy'
h5py ; extra == 'numpy'
filelock ; extra == 'numpy'
Since --force was used, overwriting existing wheel
New wheel created Noodles-0.3.3+computecanada-py3-none-any.whl
numpy (; extra == 'develop',<2.0)
numpy (; extra == '',<2.0)
h5py ; extra == 'numpy'
filelock ; extra == 'numpy'
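
The failure above comes from treating the text after the ; as part of the version specifiers. A minimal sketch of one way to avoid it is to split off the marker first, edit only the specifier list, and then re-attach the marker. The names split_requirement and add_upper_bound are hypothetical and this is not the code from the PR. The sketch assumes simple requirements of the forms seen in this thread (no extras in brackets, no URL requirements), and it always appends the new bound without checking for an existing upper bound.

```python
import re


def split_requirement(requirement):
    """Split a Requires-Dist string into (name, specifiers, marker)."""
    # Everything after the first ';' is an environment marker, not a version.
    spec_part, _, marker = requirement.partition(";")
    match = re.match(
        r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*\(?\s*([^()]*?)\s*\)?\s*$",
        spec_part,
    )
    if match is None:
        raise ValueError(f"unsupported requirement: {requirement!r}")
    name, specs = match.groups()
    specifiers = [s.strip() for s in specs.split(",") if s.strip()]
    return name, specifiers, marker.strip()


def add_upper_bound(requirement, bound):
    """Append '<bound' to the version specifiers and re-attach the marker."""
    name, specifiers, marker = split_requirement(requirement)
    specifiers.append(f"<{bound}")
    result = f"{name} ({','.join(specifiers)})"
    return f"{result} ; {marker}" if marker else result


print(add_upper_bound("numpy ; extra == 'develop'", "2.0"))
print(add_upper_bound('numpy (>=1.14.5) ; python_version == "3.7"', "2.0"))
```

With this split, the marker never enters the parenthesised specifier list, whether or not the original requirement had any version constraints.
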
mboisson commented 3 months ago

After commit 258b62b, this is the result on a handful of wheels with various combinations of numpy requirements:

for w in *.whl; do echo "Before, $w:"; ./manipulate_wheels.py --print_req --wheel $w | grep numpy; echo "Adding <2.0:"; ./manipulate_wheels.py --set_lt_numpy 2.0 --force --inplace --wheel $w; echo "After, $w:"; ./manipulate_wheels.py --print_req --wheel $w | grep numpy; echo "==========================="; done
Before, Noodles-0.3.3+computecanada-py3-none-any.whl:
numpy ; extra == 'develop'
numpy ; extra == 'numpy'
h5py ; extra == 'numpy'
filelock ; extra == 'numpy'
Adding <2.0:
Since --force was used, overwriting existing wheel
New wheel created Noodles-0.3.3+computecanada-py3-none-any.whl
After, Noodles-0.3.3+computecanada-py3-none-any.whl:
numpy (<2.0) ;  extra == 'develop'
numpy (<2.0) ;  extra == 'numpy'
h5py ; extra == 'numpy'
filelock ; extra == 'numpy'
===========================
Before, Shapely-1.8.2+computecanada-cp310-cp310-linux_x86_64.whl:
numpy ; extra == 'all'
numpy ; extra == 'vectorized'
Adding <2.0:
Since --force was used, overwriting existing wheel
New wheel created Shapely-1.8.2+computecanada-cp310-cp310-linux_x86_64.whl
After, Shapely-1.8.2+computecanada-cp310-cp310-linux_x86_64.whl:
numpy (<2.0) ;  extra == 'all'
numpy (<2.0) ;  extra == 'vectorized'
===========================
Before, Shapely-1.8.5.post1+computecanada-cp38-cp38-linux_x86_64.whl:
numpy ; extra == 'all'
numpy ; extra == 'vectorized'
Adding <2.0:
Since --force was used, overwriting existing wheel
New wheel created Shapely-1.8.5.post1+computecanada-cp38-cp38-linux_x86_64.whl
After, Shapely-1.8.5.post1+computecanada-cp38-cp38-linux_x86_64.whl:
numpy (<2.0) ;  extra == 'all'
numpy (<2.0) ;  extra == 'vectorized'
===========================
Before, cftime-1.5.0+computecanada-cp37-cp37m-linux_x86_64.whl:
numpy
Adding <2.0:
Since --force was used, overwriting existing wheel
New wheel created cftime-1.5.0+computecanada-cp37-cp37m-linux_x86_64.whl
After, cftime-1.5.0+computecanada-cp37-cp37m-linux_x86_64.whl:
numpy (<2.0)
===========================
Before, h5py-3.4.0+computecanada-cp38-cp38-linux_x86_64.whl:
numpy (>=1.14.5) ; python_version == "3.7"
numpy (>=1.17.5) ; python_version == "3.8"
numpy (>=1.19.3) ; python_version >= "3.9"
Adding <2.0:
Since --force was used, overwriting existing wheel
New wheel created h5py-3.4.0+computecanada-cp38-cp38-linux_x86_64.whl
After, h5py-3.4.0+computecanada-cp38-cp38-linux_x86_64.whl:
numpy (>=1.14.5,<2.0) ;  python_version == "3.7"
numpy (>=1.17.5,<2.0) ;  python_version == "3.8"
numpy (>=1.19.3,<2.0) ;  python_version >= "3.9"
===========================
Before, opencv_python_headless-4.6.0.66+computecanada-cp38-cp38-linux_x86_64.whl:
numpy (>=1.21) ; python_version < "3.7"
numpy (>=1.21.2) ; python_version >= "3.10"
numpy (>=1.21.2) ; python_version >= "3.6" and platform_system == "Darwin" and platform_machine == "arm64"
numpy (>=1.21) ; python_version >= "3.6" and platform_system == "Linux" and platform_machine == "aarch64"
numpy (>=1.21) ; python_version >= "3.7"
numpy (>=1.21) ; python_version >= "3.8"
numpy (>=1.21) ; python_version >= "3.9"
Adding <2.0:
Since --force was used, overwriting existing wheel
New wheel created opencv_python_headless-4.6.0.66+computecanada-cp38-cp38-linux_x86_64.whl
After, opencv_python_headless-4.6.0.66+computecanada-cp38-cp38-linux_x86_64.whl:
numpy (>=1.21,<2.0) ;  python_version < "3.7"
numpy (>=1.21.2,<2.0) ;  python_version >= "3.10"
numpy (>=1.21.2,<2.0) ;  python_version >= "3.6" and platform_system == "Darwin" and platform_machine == "arm64"
numpy (>=1.21,<2.0) ;  python_version >= "3.6" and platform_system == "Linux" and platform_machine == "aarch64"
numpy (>=1.21,<2.0) ;  python_version >= "3.7"
numpy (>=1.21,<2.0) ;  python_version >= "3.8"
numpy (>=1.21,<2.0) ;  python_version >= "3.9"
===========================
Before, pandas-1.5.3+computecanada-cp310-cp310-linux_x86_64.whl:
numpy (>=1.21) ; python_version < "3.10"
numpy (>=1.21.0) ; python_version >= "3.10"
numpy (>=1.23.2) ; python_version >= "3.11"
Adding <2.0:
Since --force was used, overwriting existing wheel
New wheel created pandas-1.5.3+computecanada-cp310-cp310-linux_x86_64.whl
After, pandas-1.5.3+computecanada-cp310-cp310-linux_x86_64.whl:
numpy (>=1.21,<2.0) ;  python_version < "3.10"
numpy (>=1.21.0,<2.0) ;  python_version >= "3.10"
numpy (>=1.23.2,<2.0) ;  python_version >= "3.11"
===========================
Before, pandas-2.0.3+computecanada-cp39-cp39-linux_x86_64.whl:
numpy>=1.21
numpy>=1.21
numpy>=1.21
Adding <2.0:
Since --force was used, overwriting existing wheel
New wheel created pandas-2.0.3+computecanada-cp39-cp39-linux_x86_64.whl
After, pandas-2.0.3+computecanada-cp39-cp39-linux_x86_64.whl:
numpy (>=1.21,<2.0)
numpy (>=1.21,<2.0)
numpy (>=1.21,<2.0)
===========================
Before, procgen-0.10.4+416821e.computecanada-cp38-cp38-linux_x86_64.whl:
numpy (<2.0.0,>=1.17.0)
Adding <2.0:
Since --force was used, overwriting existing wheel
New wheel created procgen-0.10.4+416821e.computecanada-cp38-cp38-linux_x86_64.whl
After, procgen-0.10.4+416821e.computecanada-cp38-cp38-linux_x86_64.whl:
numpy (<2.0,>=1.17.0)
===========================
Before, pyscf-2.1.0+computecanada-cp38-cp38-linux_x86_64.whl:
numpy (!=1.16,!=1.17,>=1.13)
Adding <2.0:
Since --force was used, overwriting existing wheel
New wheel created pyscf-2.1.0+computecanada-cp38-cp38-linux_x86_64.whl
After, pyscf-2.1.0+computecanada-cp38-cp38-linux_x86_64.whl:
numpy (!=1.16,!=1.17,>=1.13,<2.0)
===========================
Before, rasterio-1.3.0+gdal351.computecanada-cp310-cp310-linux_x86_64.whl:
numpy
numpydoc ; extra == 'all'
numpydoc ; extra == 'docs'
Adding <2.0:
Since --force was used, overwriting existing wheel
New wheel created rasterio-1.3.0+gdal351.computecanada-cp310-cp310-linux_x86_64.whl
After, rasterio-1.3.0+gdal351.computecanada-cp310-cp310-linux_x86_64.whl:
numpy (<2.0)
numpydoc ; extra == 'all'
numpydoc ; extra == 'docs'
===========================
Before, tf_models_official-2.15.0+computecanada-py2.py3-none-any.whl:
numpy (>=1.20,<2.0)
Adding <2.0:
Since --force was used, overwriting existing wheel
New wheel created tf_models_official-2.15.0+computecanada-py2.py3-none-any.whl
After, tf_models_official-2.15.0+computecanada-py2.py3-none-any.whl:
numpy (>=1.20,<2.0)
===========================
Before, tf_models_official-2.17.0+computecanada-py2.py3-none-any.whl:
numpy >=1.20
Adding <2.0:
Since --force was used, overwriting existing wheel
New wheel created tf_models_official-2.17.0+computecanada-py2.py3-none-any.whl
After, tf_models_official-2.17.0+computecanada-py2.py3-none-any.whl:
numpy (>=1.20,<2.0)
===========================
ccoulombe commented 2 months ago

The cases that bind to numpy are: numpy, numpy>=, and numpy>=,<=. However, we then adjust them to a minimum version, which means we should only have numpy>= and numpy>=,<=. Generic wheels (those that do not bind to numpy) can therefore be ignored.

Does it override a case where a range is already defined? For example, numpy<=1.23,>=1.19.

mboisson commented 2 months ago

> The cases that bind to numpy are: numpy, numpy>=, and numpy>=,<=. However, we then adjust them to a minimum version, which means we should only have numpy>= and numpy>=,<=. Generic wheels (those that do not bind to numpy) can therefore be ignored.

tf_models_official-2.17.0+computecanada-py2.py3-none-any.whl has numpy >=1.20 despite being a generic wheel. Conversely, cftime-1.5.0+computecanada-cp37-cp37m-linux_x86_64.whl has the requirement numpy despite not being a generic wheel.

> Does it override a case where a range is already defined? For example, numpy<=1.23,>=1.19.

It does, but it keeps the stricter bound (in this case, it would keep <=1.23).
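
The "keep the stricter bound" rule can be sketched as follows. This is a minimal illustration and not the code from the PR. The helper names are hypothetical, only plain dotted numeric versions are handled (no pre-releases or wildcards), and only < and <= are treated as upper bounds (== and ~= are left alone).

```python
import re


def parse_version(text):
    """Turn '1.23' into (1, 23). Only plain dotted numeric versions are handled."""
    return tuple(int(part) for part in text.strip().split("."))


def is_at_least_as_strict(op, version, new_version):
    """True if '<op><version>' already implies '<new_version'."""
    width = max(len(version), len(new_version))
    old = version + (0,) * (width - len(version))
    new = new_version + (0,) * (width - len(new_version))
    return old < new or (old == new and op == "<")


def merge_upper_bound(specifiers, bound):
    """Return specifiers with '<bound' enforced, keeping any stricter upper bound."""
    new_version = parse_version(bound)
    kept, already_bounded = [], False
    for spec in specifiers:
        match = re.match(r"(<=|<)\s*(.+)$", spec.strip())
        if match is None:
            kept.append(spec)  # lower bounds and exclusions are untouched
        elif is_at_least_as_strict(
            match.group(1), parse_version(match.group(2)), new_version
        ):
            kept.append(spec)  # the existing bound is stricter, keep it
            already_bounded = True
        # otherwise the looser existing upper bound is dropped
    if not already_bounded:
        kept.append(f"<{bound}")
    return kept


print(merge_upper_bound(["<=1.23", ">=1.19"], "2.0"))  # ['<=1.23', '>=1.19']
print(merge_upper_bound([">=1.20"], "2.0"))            # ['>=1.20', '<2.0']
```

In this sketch an existing bound that is equivalent to the new one (for example <2.0.0 against <2.0) is left as written, which is a design choice rather than something the thread specifies.
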

mboisson commented 2 months ago

I would say that the logic of deciding whether or not to ignore a wheel (based on whether it is a generic wheel, or on any other criterion) is best dealt with outside of manipulate_wheels.py. manipulate_wheels.py should do what it is asked to do. This is how it is done for the minimum numpy version, which is tested in the build_wheels.sh script.
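
As an illustration of keeping that decision outside the tool, a caller could inspect the wheel filename before invoking manipulate_wheels.py. The sketch below is hypothetical and is not the check performed in build_wheels.sh. It relies only on the standard wheel naming scheme, in which the last three dash-separated fields are the Python tag, the ABI tag, and the platform tag.

```python
def wheel_tags(filename):
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return python_tag, abi_tag, platform_tag


def is_generic_wheel(filename):
    """True for pure-Python wheels, i.e. ABI tag 'none' and platform tag 'any'."""
    _, abi_tag, platform_tag = wheel_tags(filename)
    return abi_tag == "none" and platform_tag == "any"


for name in (
    "tf_models_official-2.17.0+computecanada-py2.py3-none-any.whl",
    "cftime-1.5.0+computecanada-cp37-cp37m-linux_x86_64.whl",
):
    print(name, is_generic_wheel(name))
```

Note that the tags only describe how the wheel was built. As the two examples above show, they say nothing about whether the wheel declares a numpy requirement.
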

ccoulombe commented 2 months ago

Yes, absolutely: it should do one thing and do it right (and not try to determine whether or not to do it).

> tf_models_official-2.17.0+computecanada-py2.py3-none-any.whl has numpy >=1.20 despite being a generic wheel.

But does it bind to the numpy ABI? It does not.

Then the code seems OK. I assume you also tested the minimum numpy code afterwards?

mboisson commented 2 months ago

> Then the code seems OK. I assume you also tested the minimum numpy code afterwards?

Mmm, I did not re-test that, but I can run a few test cases.

mboisson commented 2 months ago

Ok, tested on one wheel for Pandas. Before:

./manipulate_wheels.py --print_req --wheel pandas-2.2.1+computecanada-cp311-cp311-linux_x86_64.whl  | grep numpy
numpy (<2,>=1.23)
numpy (<2,>=1.23)
numpy (<2,>=1.23)

then run

./manipulate_wheels.py --set_min_numpy 1.26 --wheel pandas-2.2.1+computecanada-cp311-cp311-linux_x86_64.whl --inplace --force
Since --force was used, overwriting existing wheel
New wheel created pandas-2.2.1+computecanada-cp311-cp311-linux_x86_64.whl

after:

./manipulate_wheels.py --print_req --wheel pandas-2.2.1+computecanada-cp311-cp311-linux_x86_64.whl  | grep numpy
numpy (<2,>=1.26)
numpy (<2,>=1.26)
numpy (<2,>=1.26)

so I think this can be merged.