skypilot-org / skypilot

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
https://skypilot.readthedocs.io
Apache License 2.0
6.72k stars, 496 forks

[Core] Launching and job submission fail on macOS #3859

Closed Michaelvll closed 1 month ago

Michaelvll commented 2 months ago

A few people have hit an issue where sky launch task.yaml errors out with return code 255 and a message saying that the message sent through ssh is too long. This is related to our previous optimization in #3394, with the upper bound set to 120KB in #3512.

We tried reducing the upper bound to 90KB, but it still fails; it only works once we change the threshold to 0.
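
To make the threshold behavior concrete, here is a minimal sketch. It assumes, and the thread does not confirm, that commands smaller than the threshold are passed inline over ssh and larger ones are written to a script file first. The names choose_transport and KB are hypothetical and are not SkyPilot's actual API.

```python
# Hypothetical illustration only; not SkyPilot's real implementation.
KB = 1024


def choose_transport(command: str, threshold_bytes: int) -> str:
    """Pick how to deliver a command to the remote host.

    Assumption: commands strictly smaller than the threshold are sent
    inline as an ssh argument; everything else is uploaded as a file.
    """
    size = len(command.encode("utf-8"))
    # Strict comparison, so a threshold of 0 never selects "inline".
    return "inline" if size < threshold_bytes else "file"


if __name__ == "__main__":
    small = "echo hello"
    large = "x" * (100 * KB)  # a 100KB payload

    for threshold in (120 * KB, 90 * KB, 0):
        print(threshold, choose_transport(small, threshold),
              choose_transport(large, threshold))
```

Under this assumption, a threshold of 0 routes every command through the file path, which would match the observation that only a threshold of 0 works reliably on the affected machines.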

Version & Commit info:

TimCJanke commented 2 months ago

Ran into this problem today, the error message was quite confusing tbh. Downgrading to 0.6.0 worked for me.

jucor commented 2 months ago

Ah thanks a lot @Michaelvll, that explains the many failures I'm seeing, new to skypilot 😅 @TimCJanke thanks for the workaround.