astral-sh / uv

An extremely fast Python package and project manager, written in Rust.
https://docs.astral.sh/uv
Apache License 2.0

`uv` fails to install nvidia package `nvidia-curand-cu12==10.3.2.106` #1454

Closed: strickvl closed this issue 8 months ago

strickvl commented 8 months ago

Installing a bunch of packages from a requirements.txt file fails with the following error:

+ uv pip install -r integration-requirements.txt
Resolved 453 packages in 4.68s
Downloaded 290 packages in 41.83s
error: Failed to install: nvidia_curand_cu12-10.3.2.106-py3-none-manylinux1_x86_64.whl (nvidia-curand-cu12==10.3.2.106)
  Caused by: failed to rename file from /home/runner/_work/zenml/zenml/.venv/lib/python3.10/site-packages/.tmpOgydOs/__init__.py to /home/runner/_work/zenml/zenml/.venv/lib/python3.10/site-packages/nvidia/__init__.py
  Caused by: No such file or directory (os error 2)

This is all happening on self-hosted runners via GitHub Actions. The script works fine with pip, but with uv we hit this failure. Is there something special about this package that doesn't work well with uv?
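For readers unfamiliar with `os error 2`: a rename reports "No such file or directory" when either the source file or the destination's parent directory is missing. The sketch below only illustrates that general behavior using hypothetical scratch paths created with `mktemp`; it is not a diagnosis of what uv did in this report.

```shell
# Hypothetical scratch layout; none of these paths come from the report.
workdir="$(mktemp -d)"
mkdir "$workdir/.tmpstage"
echo "" > "$workdir/.tmpstage/__init__.py"

# The destination directory "nvidia/" does not exist yet, so the rename fails.
if mv "$workdir/.tmpstage/__init__.py" "$workdir/nvidia/__init__.py" 2>/dev/null; then
    first_attempt="ok"
else
    first_attempt="failed"
fi

# Creating the parent directory first lets the same rename succeed.
mkdir -p "$workdir/nvidia"
if mv "$workdir/.tmpstage/__init__.py" "$workdir/nvidia/__init__.py"; then
    second_attempt="ok"
else
    second_attempt="failed"
fi

echo "first: $first_attempt, second: $second_attempt"
```

The same error would also appear if the staged source file had already been moved or deleted by something else before the rename ran.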

Running locally (on a Mac) hits a different error:

  × No solution found when resolving dependencies:
  ╰─▶ Because nvidia-curand-cu12==10.3.2.106 is unusable because no wheels are available with a
      matching platform and you require nvidia-curand-cu12==10.3.2.106, we can conclude that the
      requirements are unsatisfiable.

...though this is fair enough I suppose.

BurntSushi commented 8 months ago

Sorry, but I can't figure out which command to run to reproduce this. Can you please provide more detail? I looked at your CI log and naively tried to run the command there:

(.venv) [andrew@duff i1454]$ uv pip install -e .[server,templates,terraform,secrets-aws,secrets-gcp,secrets-azure,secrets-hashicorp,s3fs,gcsfs,adlfs,dev,mlstacks]
error: Failed to build editables
  Caused by: Failed to build editable: file:///home/andrew/astral/issues/uv/i1454
  Caused by: Invalid source distribution: The archive contains neither a `pyproject.toml` nor a `setup.py` file at the top level

And I don't see any integration-requirements.txt file in the linked repo.

strickvl commented 8 months ago

The commands in order:

pip install -e .[server,templates,terraform,secrets-aws,secrets-gcp,secrets-azure,secrets-hashicorp,s3fs,gcsfs,adlfs,dev,mlstacks]

ignore_integrations="feast label_studio bentoml seldon kserve pycaret skypilot_aws skypilot_gcp skypilot_azure"

ignore_integrations_args=""
for integration in $ignore_integrations; do
    ignore_integrations_args="$ignore_integrations_args --ignore-integration $integration"
done

zenml integration export-requirements \
    --output-file integration-requirements.txt \
    $ignore_integrations_args

echo "" >> integration-requirements.txt
echo "pyyaml>=6.0.1" >> integration-requirements.txt

pip install -r integration-requirements.txt

Our CI is basically running this script when it sets up the environment. The script itself is executed from a workflow file here.
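To see what the loop in the script above hands to `zenml integration export-requirements`, the sketch below rebuilds the flag string and prints the command instead of running it. The integration names are copied from the script; the `echo` dry run is an addition for illustration so the snippet runs without zenml installed.

```shell
# Integration names copied from the script above.
ignore_integrations="feast label_studio bentoml seldon kserve pycaret skypilot_aws skypilot_gcp skypilot_azure"

# Append one "--ignore-integration NAME" pair per integration.
ignore_integrations_args=""
for integration in $ignore_integrations; do
    ignore_integrations_args="$ignore_integrations_args --ignore-integration $integration"
done

# Dry run (illustrative only): print the command rather than executing it.
# The variable is left unquoted on purpose so it splits into separate words.
echo zenml integration export-requirements \
    --output-file integration-requirements.txt \
    $ignore_integrations_args
```

Each of the nine names becomes its own flag/value pair, so the real command receives eighteen extra arguments.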

BurntSushi commented 8 months ago

It sounds like that script specifically needs to be run from the root of that repo too?

charliermarsh commented 8 months ago

I can see where (in the code) this is happening, but I don't yet see how.

charliermarsh commented 8 months ago

This may be fixed by https://github.com/astral-sh/uv/pull/1546.

charliermarsh commented 8 months ago

Sorry, didn't mean for this to close on-merge, so it still might fail -- needs to be tested after release.

charliermarsh commented 8 months ago

v0.1.3 is out now. Do you mind retrying? (It may not solve it, but it could.)

strickvl commented 8 months ago

@charliermarsh this seems to fix it. I hit a new error, but at least the nvidia package installs fine now. Thanks!

raayu83 commented 2 months ago

Hi @charliermarsh,

With the high-level API I get an error for this package as well (slightly different version):

error: Failed to download `nvidia-curand-cu12==10.3.2.56`
  Caused by: Failed to unzip wheel: nvidia_curand_cu12-10.3.2.56-py3-none-manylinux1_x86_64.whl
  Caused by: an upstream reader returned an error: error decoding response body
  Caused by: error decoding response body
  Caused by: request or response body error
  Caused by: error reading a body from connection
  Caused by: peer closed connection without sending TLS close_notify: https://docs.rs/rustls/latest/rustls/manual/_03_howto/index.html#unexpected-eof

Should I open a new issue or should this be handled here?
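The error chain above ends with the peer closing the connection mid-download, which can be a transient network failure. Assuming that is the case here (the thread does not confirm it), one low-tech workaround is to retry the failing command a few times. The sketch below shows the idea; `retry` and `flaky_download` are hypothetical names made up for this example and are not part of uv.

```shell
# Hypothetical helper: run a command up to $1 times, sleeping $2 seconds between tries.
retry() {
    max_attempts="$1"
    delay="$2"
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Stand-in for a flaky download: fails on the first two calls, then succeeds.
calls=0
flaky_download() {
    calls=$((calls + 1))
    [ "$calls" -ge 3 ]
}

retry 5 0 flaky_download
retry_status=$?
echo "exit status $retry_status after $calls calls"
```

In CI, the stand-in could be swapped for the real install command, for example `retry 3 5 uv pip install -r integration-requirements.txt`. A retry only helps if the failure really is intermittent; a download that fails every time points to a different problem.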