skypilot-org / skypilot

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
https://skypilot.readthedocs.io
Apache License 2.0
6.82k stars 513 forks

[k8s] Handle apt update log not existing #4381

Closed. romilbhardwaj closed this 3 days ago

romilbhardwaj commented 3 days ago

The apt update log from container init may not have been written yet by the time our provisioner checks it. This PR handles that case by retrying until the timeout is hit. Tested and verified by a user who was running into the bug.
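
For readers unfamiliar with the pattern, here is a minimal sketch of retrying until a timeout. It is not the code from this PR: the function name `read_log_with_retry`, its parameters, and the assumption that the log is a local file path are all hypothetical, chosen only to illustrate the idea.

```python
import time
from typing import Optional


def read_log_with_retry(path: str,
                        timeout_s: float = 10.0,
                        interval_s: float = 0.5) -> Optional[str]:
    """Return the contents of `path`, or None if it never appears.

    Hypothetical helper: retries while the file does not exist yet and
    gives up once `timeout_s` seconds have elapsed.
    """
    # A monotonic clock is unaffected by system clock adjustments.
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with open(path, 'r', encoding='utf-8') as f:
                return f.read()
        except FileNotFoundError:
            # The log has not been written yet; stop once the deadline passes.
            if time.monotonic() >= deadline:
                return None
            time.sleep(interval_s)
```

A caller would treat `None` as "log still missing after the timeout" and decide whether to fail or continue. Note that this sketch only guards against the file not existing; it does not detect a log that exists but is still being written.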