SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
When launching a cluster with a large number of nodes, a small fraction of the nodes sometimes cannot be reached over SSH. Manually stopping the affected instance in the console and relaunching fixes it. Connecting to an affected instance through the console reports:
EC2 Instance Connect is unable to connect to your instance. Ensure your instance network settings are configured correctly for EC2 Instance Connect. For more information, see EC2 Instance Connect Prerequisites at https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-connect-prerequisites.html.
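To find which nodes are affected before stopping them by hand, one option is to probe the SSH port on each node. The following is a minimal sketch, not part of SkyPilot: the function names `ssh_port_open` and `unreachable_nodes` are hypothetical, and it assumes the node IP addresses are already known. It only checks that a TCP connection to port 22 can be opened, so a node that accepts the connection but then fails SSH authentication would not be detected.

```python
import socket


def ssh_port_open(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: an open port 22 is used as a proxy for "SSH is reachable";
    this does not perform an SSH handshake or authentication.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


def unreachable_nodes(hosts, port: int = 22, timeout: float = 5.0):
    """Return the subset of hosts whose SSH port could not be reached."""
    return [h for h in hosts if not ssh_port_open(h, port, timeout)]
```

The hosts returned by `unreachable_nodes` are the candidates for the manual stop and relaunch described above.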
Version & Commit info:
sky -v: PLEASE_FILL_IN
sky -c: PLEASE_FILL_IN