skyplane-project / skyplane

🔥 Blazing fast bulk data transfers between any cloud 🔥
https://skyplane.org
Apache License 2.0
999 stars 58 forks

[bug] ECS may fail when the ECS is sold out #889

Open killerdbob opened 1 year ago

killerdbob commented 1 year ago

Describe the bug

Hi, the latest ECS quota feature may have a problem. If the ECS type is sold out in the region, the program will fail.

sarahwooders commented 1 year ago

@killerdbob thanks for reporting this! What do you mean by the region is "sold out"? Also are you referring to a specific cloud?

Thanks,

Sarah

killerdbob commented 1 year ago

Hi @sarahwooders, for example, "n2-standard-96" is "sold out" or "unavailable for some reason" in some GCP regions. The user's quota may still be large enough, though, and in that case the program will crash.

This applies to clouds in general. Skyplane assumes that "there is an unlimited resource in every region", but in reality each instance type has limited capacity in every region, or the ECS type may be unavailable for some other reason.

I found that there is an SDK that can check whether an ECS type is "sold out" in a region.

Wei
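A minimal sketch of the fallback behaviour described above, assuming the provisioning call raises a distinct error when an instance type has no capacity. `CapacityError`, `provision_with_fallback`, and `fake_provision` are hypothetical names for illustration, not part of Skyplane or any cloud SDK, and the stand-in provisioner only pretends that "n2-standard-96" is sold out.

```python
from typing import Callable, Iterable, Optional


class CapacityError(Exception):
    """Hypothetical error: the instance type is sold out or unavailable in the region."""


def provision_with_fallback(
    region: str,
    candidates: Iterable[str],
    provision: Callable[[str, str], str],
) -> Optional[str]:
    """Try each candidate instance type in order and return the first result that succeeds."""
    for instance_type in candidates:
        try:
            return provision(region, instance_type)
        except CapacityError:
            # Quota can be sufficient while capacity is not, so move on to the next type.
            continue
    # Every candidate was unavailable; the caller decides how to report this.
    return None


def fake_provision(region: str, instance_type: str) -> str:
    """Stand-in for a real cloud call (assumption): pretends n2-standard-96 is sold out."""
    if instance_type == "n2-standard-96":
        raise CapacityError(f"{instance_type} is unavailable in {region}")
    return f"{region}/{instance_type}"


result = provision_with_fallback(
    "us-central1", ["n2-standard-96", "n2-standard-32"], fake_provision
)
print(result)
```

With this shape, a sold-out type becomes a recoverable condition instead of a crash, and returning `None` leaves it to the caller to try another region or surface a clear error.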

killerdbob commented 1 year ago

You could follow up with Skypilot (https://github.com/skypilot-org/skypilot); they maintain a CSV for every cloud. It records every "available" ECS type in every "region", which prevents the user from provisioning an "unavailable" ECS type.
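A rough sketch of how such a catalog could be consulted before provisioning. The column names (`InstanceType`, `Region`) and the sample rows are made up for illustration and are not taken from Skypilot's actual files; `load_catalog` and `is_offered` are hypothetical helpers.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, Set

# Illustrative rows only (assumption); a real catalog would be read from a file.
CATALOG_CSV = """InstanceType,Region
n2-standard-96,us-east1
n2-standard-32,us-east1
n2-standard-32,us-central1
"""


def load_catalog(text: str) -> Dict[str, Set[str]]:
    """Map each region to the set of instance types the catalog lists for it."""
    offered: Dict[str, Set[str]] = defaultdict(set)
    for row in csv.DictReader(io.StringIO(text)):
        offered[row["Region"]].add(row["InstanceType"])
    return offered


def is_offered(catalog: Dict[str, Set[str]], region: str, instance_type: str) -> bool:
    """Return True only if the catalog lists the instance type in the region."""
    return instance_type in catalog.get(region, set())


catalog = load_catalog(CATALOG_CSV)
print(is_offered(catalog, "us-central1", "n2-standard-96"))
print(is_offered(catalog, "us-east1", "n2-standard-96"))
```

A static catalog like this only says where a type is offered at all; it cannot tell whether capacity is temporarily exhausted, so it would complement a runtime check rather than replace it.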

I also fixed the cover, following Skypilot, just to make Skyplane look nicer. (https://github.com/skyplane-project/skyplane/pull/888)

Are Skypilot and Skyplane the same team?