Closed romilbhardwaj closed 10 hours ago
Starting to run some final tests. @cg505 @Michaelvll if you find some time please do a quick round of reviews. Thanks!
Should add uv to our base images. Takes ~2s to install it otherwise.
working on this now
Running smoke tests:
Ran backwards compatibility tests, no issues.
Smoke tests for aws and k8s pass (barring a few unrelated failures). Merging now. Thanks for the great work @cg505 and @Michaelvll!
This PR introduces a number of optimizations for large-scale k8s provisioning, including:
Delivers a >4x speedup when provisioning hundreds of nodes.
Similar times with a NeMo-derived image optimized using the instructions in this PR.
Tested (run the relevant ones):
bash format.sh
pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
conda deactivate; bash -i tests/backward_compatibility_tests.sh