Closed dzlab closed 4 months ago
That's a great catch @dzlab! We should fix it. As a workaround, you could give the cluster a name by passing, say, `-c finetune` to `sky launch`.
@concretevitamin thanks for the hint, but unfortunately I still get the same error, which is preventing instance creation. Here is the log; you can see it is using `finetune` as the cluster name, but it's not able to create instances:
Task from YAML spec: axolotl.yaml
I 05-08 22:14:13 optimizer.py:694] == Optimizer ==
I 05-08 22:14:13 optimizer.py:705] Target: minimizing cost
I 05-08 22:14:13 optimizer.py:717] Estimated cost: $3.0 / hour
I 05-08 22:14:13 optimizer.py:717]
I 05-08 22:14:13 optimizer.py:842] Considered resources (1 node):
I 05-08 22:14:13 optimizer.py:912] ---------------------------------------------------------------------------------------------
I 05-08 22:14:13 optimizer.py:912] CLOUD INSTANCE vCPUs Mem(GB) ACCELERATORS REGION/ZONE COST ($) CHOSEN
I 05-08 22:14:13 optimizer.py:912] ---------------------------------------------------------------------------------------------
I 05-08 22:14:13 optimizer.py:912] GCP n1-highmem-8 8 52 L4:1 us-central1-a 2.95 ✔
I 05-08 22:14:13 optimizer.py:912] ---------------------------------------------------------------------------------------------
I 05-08 22:14:13 optimizer.py:912]
Launching a new cluster 'finetune'. Proceed? [Y/n]: Y
I 05-08 22:14:26 cloud_vm_ray_backend.py:4250] Creating a new cluster: 'finetune' [1x GCP(n1-highmem-8, {'L4': 1})].
I 05-08 22:14:26 cloud_vm_ray_backend.py:4250] Tip: to reuse an existing cluster, specify --cluster (-c). Run `sky status` to see existing clusters.
I 05-08 22:14:29 cloud_vm_ray_backend.py:1371] To view detailed progress: tail -n100 -f /Users/firstname.lastname/sky_logs/sky-2024-05-08-22-14-09-156922/provision.log
I 05-08 22:14:34 provisioner.py:77] Launching on GCP us-central1 (us-central1-a)
W 05-08 22:14:50 instance_utils.py:112] Got return code 'invalid' in us-central1-a: "Invalid value for field 'resource.instanceProperties.labels': ''. Label value 'firstname.lastname' violates format constraints. The value can only contain lowercase letters, numeric characters, underscores and dashes. The value can be at most 63 characters long. International characters are allowed"
W 05-08 22:14:52 cloud_vm_ray_backend.py:2036] sky.exceptions.ResourcesUnavailableError: Failed to acquire resources in us-central1-a. Try changing resource requirements or use another zone.
W 05-08 22:14:52 cloud_vm_ray_backend.py:2045]
W 05-08 22:14:52 cloud_vm_ray_backend.py:2045] Provision failed for 1x GCP(n1-highmem-8, {'L4': 1}) in us-central1-a. Trying other locations (if any).
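For readers who want to see why the label is rejected, here is a minimal sketch that checks a value against the constraint quoted in the error message (lowercase letters, numeric characters, underscores and dashes, at most 63 characters). The function name `is_valid_label_value` is hypothetical and is not part of SkyPilot or any GCP client library. The pattern is ASCII-only, which is a simplification, since the message also says international characters are allowed. Accepting an empty value is likewise an assumption.

```python
import re

# Pattern derived from the GCP error message in the log above.
# Assumptions: ASCII-only (the message also allows international
# characters), and an empty value is treated as acceptable.
_LABEL_VALUE_RE = re.compile(r"[a-z0-9_-]{0,63}")


def is_valid_label_value(value: str) -> bool:
    """Return True if `value` satisfies the simplified label-value rule."""
    return _LABEL_VALUE_RE.fullmatch(value) is not None


if __name__ == "__main__":
    # The username from the log contains a dot, so it fails the check.
    print(is_valid_label_value("firstname.lastname"))
    # The cluster name itself is fine, which is why `-c finetune` alone
    # does not make the error go away.
    print(is_valid_label_value("finetune"))
```

Under these assumptions the check points at the username label rather than the cluster name, which matches what the log reports.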
Thanks for reporting this @dzlab! It does indeed seem to be an issue with the username containing invalid characters. I just submitted a PR for this. We will try to get it merged soon.
I'm trying to run the axolotl example on GCP, but it is not able to create instances. It seems that GCP is rejecting the value for `resource.instanceProperties.labels` because it contains a `.` character, which is not valid. The error message is in the task's full log.
I'm not sure where this value is set in SkyPilot, but could we avoid this error by escaping `.` and any other invalid characters?
Version & Commit info:
`sky -v`: skypilot, version 1.0.0.dev20240507
`sky -c`: skypilot, commit 904aa5cb6b59a550d39ae87a2bbfe32e4c53b8b4
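The escaping idea raised in this report could look roughly like the sketch below. It is not SkyPilot's actual fix; the function name `sanitize_label_value` and the choice of `-` as the replacement character are assumptions made for illustration. It lowercases the value, replaces every character outside the set named in the GCP error message with a dash, and truncates to 63 characters.

```python
import re


def sanitize_label_value(value: str, max_len: int = 63) -> str:
    """Map an arbitrary string to a GCP-acceptable label value (sketch).

    Assumptions: invalid characters are replaced with '-', and the
    allowed set is ASCII lowercase letters, digits, '_' and '-'.
    """
    lowered = value.lower()
    # Replace anything that is not in the allowed set with a dash.
    cleaned = re.sub(r"[^a-z0-9_-]", "-", lowered)
    # Enforce the 63-character limit from the error message.
    return cleaned[:max_len]


if __name__ == "__main__":
    print(sanitize_label_value("firstname.lastname"))
```

One trade-off of this approach is that distinct inputs such as `a.b` and `a-b` map to the same label, so it only suits labels that are informational rather than unique identifiers.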