skypilot-org / skypilot

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
https://skypilot.readthedocs.io
Apache License 2.0
6.82k stars 514 forks

[Tests] Add test for `max_restarts_on_errors` #4214

Status: Open. Michaelvll opened this issue 3 weeks ago

Michaelvll commented 3 weeks ago

Version & Commit info: