Closed njzjz closed 1 week ago
The recent updates standardize the installation process for dependencies like TensorFlow, Torch, and mpi4py across several GitHub workflow files. The key change replaces `python -m uv pip install` with a new script, `uv_with_retry.sh`, which adds a retry mechanism to handle installation errors more robustly. This is intended to make installations more consistent and reliable during the build and testing phases for various configurations.
| Files/Paths | Change Summary |
|---|---|
| `.github/workflows/build_cc.yml` | Updated installation commands to use `source/install/uv_with_retry.sh` for TensorFlow and other dependencies. |
| `.github/workflows/test_cc.yml` | Standardized installation commands using `source/install/uv_with_retry.sh`. |
| `.github/workflows/test_cuda.yml` | Modified installation commands to implement `uv_with_retry.sh` script for TensorFlow, Torch, and `mpi4py`. |
| `.github/workflows/test_python.yml` | Revised installation process for `mpich`, `torch`, `horovod`, and `mpi4py` to utilize `source/install/uv_with_retry.sh`. |
| `source/install/uv_with_retry.sh` | Introduced a script that retries the `uv` command up to 3 times on encountering the "error decoding response body" error. |
sequenceDiagram
participant Developer
participant CI/CD Pipeline
participant UV_with_retry.sh
participant Dependency Server
Developer->>CI/CD Pipeline: Triggers workflow
CI/CD Pipeline->>UV_with_retry.sh: Run install command
UV_with_retry.sh->>Dependency Server: Attempt to install dependency
Dependency Server-->>UV_with_retry.sh: Error response (if any)
UV_with_retry.sh->>UV_with_retry.sh: Retries up to 3 times on error
UV_with_retry.sh-->>CI/CD Pipeline: Returns success/failure
CI/CD Pipeline-->>Developer: Reports build/test result
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 82.74%. Comparing base (0c472d1) to head (facf45b).
This PR uses a shell wrapper to check whether the `error decoding response body` error message appears in the uv stderr and, if so, retries the command. It is just a workaround for https://github.com/astral-sh/uv/issues/2586 and https://github.com/astral-sh/uv/issues/3514; hopefully upstream can fix it.

Note that this PR does nothing with cibuildwheel. It's unclear how to retry on certain errors under its complex logic (feature requested in https://github.com/pypa/cibuildwheel/issues/1846).
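For readers who want a picture of the approach, here is a minimal sketch of such a wrapper, written as a shell function. It is not the actual `source/install/uv_with_retry.sh`: the names `UV_CMD`, `MAX_ATTEMPTS`, and `RETRY_PATTERN` are hypothetical, and reading "up to 3 times" as three attempts in total is an assumption.

```shell
# Illustrative sketch only; NOT the contents of source/install/uv_with_retry.sh.
# UV_CMD, MAX_ATTEMPTS and RETRY_PATTERN are hypothetical names for this example.
UV_CMD="${UV_CMD:-uv}"
MAX_ATTEMPTS="${MAX_ATTEMPTS:-3}"   # assumption: three attempts in total
RETRY_PATTERN="error decoding response body"

uv_with_retry() {
    local attempt status errfile
    attempt=1
    errfile="$(mktemp)" || return 1
    while :; do
        # Capture stderr in a file so it can be inspected after the command exits.
        if "$UV_CMD" "$@" 2>"$errfile"; then
            status=0
        else
            status=$?
        fi
        # Replay the captured stderr so the CI log still shows it.
        cat "$errfile" >&2
        # Stop on success, or on a failure that is not the known transient error.
        if [ "$status" -eq 0 ] || ! grep -q "$RETRY_PATTERN" "$errfile"; then
            break
        fi
        # Stop once the attempt budget is used up.
        if [ "$attempt" -ge "$MAX_ATTEMPTS" ]; then
            break
        fi
        attempt=$((attempt + 1))
        echo "Retrying uv (attempt $attempt of $MAX_ATTEMPTS)..." >&2
    done
    rm -f "$errfile"
    return "$status"
}
```

With the function defined, a workflow step could call something like `uv_with_retry pip install <package>` in place of the direct `uv` invocation. Matching on the specific error string, rather than retrying every failure, keeps genuine errors such as an unresolvable requirement failing fast.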
Summary by CodeRabbit
Workflows now install dependencies through the `uv_with_retry.sh` script to ensure reliable installations.