SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
We previously set RAY_DEDUP_LOGS for the Ray cluster, but it stopped taking effect for jobs once we removed ray job submit in #4318, because jobs no longer inherit the env var from the Ray cluster. We now set these env vars directly on the driver process.
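As a rough illustration of the idea (not the actual SkyPilot change), the sketch below passes RAY_DEDUP_LOGS=0 explicitly in the environment of a launched driver process instead of relying on inheritance from the cluster. The helper names build_driver_env and run_driver are hypothetical and exist only for this example.

```python
import os
import subprocess
import sys


def build_driver_env(base_env=None):
    """Return a copy of the environment with Ray log dedup disabled.

    Hypothetical helper: copies the base environment (the current
    process environment by default) and sets RAY_DEDUP_LOGS=0.
    """
    env = dict(os.environ if base_env is None else base_env)
    env['RAY_DEDUP_LOGS'] = '0'
    return env


def run_driver(cmd, base_env=None):
    """Run a driver command with the env var set on the process itself."""
    return subprocess.run(cmd,
                          env=build_driver_env(base_env),
                          capture_output=True,
                          text=True,
                          check=True)


# Stand-in "driver" that just reports the value it sees.
result = run_driver(
    [sys.executable, '-c', "import os; print(os.environ['RAY_DEDUP_LOGS'])"])
print(result.stdout.strip())
```

Setting the variable on the process that is actually launched means the behavior does not depend on how the job was submitted.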
To reproduce:
sky launch --num-nodes 4 echo hi
...
(worker2, rank=2, pid=2043, ip=10.0.2.206) hi
(worker1, rank=1, pid=2038, ip=10.0.2.205) hi [repeated 3x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/ray-logging.html#log-deduplication for more options.)
✓ Job finished (status: SUCCEEDED).
Tested (run the relevant ones):
[ ] Code formatting: bash format.sh
[ ] Any manual or new tests for this PR (please specify below)
pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
conda deactivate; bash -i tests/backward_compatibility_tests.sh