skypilot-org / skypilot

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12+ clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.
https://skypilot.readthedocs.io
Apache License 2.0

[Serve] Update log pattern in `_follow_replica_logs` for new UX 3.0 #4333

Closed andylizf closed 1 week ago

andylizf commented 1 week ago

Blocked by #4323

Update the log file patterns used by `_follow_replica_logs` to match the new UX 3.0 output format introduced in #4023. The old patterns matched `tail -n100 -f`, which no longer appears in the new UX output.
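For readers unfamiliar with this kind of matching, here is a minimal sketch of classifying launch-log lines by pattern. It is not the actual SkyPilot code: the constant names, `classify_line`, and the exact regexes are assumptions made for illustration. The two new-style marker strings (`Launching on ...` and `Instance is up.`) are copied from the logs recorded in this PR, and the old-style pattern only assumes that a `tail -n100 -f <path>` hint used to appear in the output.

```python
import re
from typing import Optional

# Hypothetical patterns, for illustration only (not SkyPilot's real constants).
# Old UX: assume the output contained a `tail -n100 -f <path>.log` hint.
OLD_TAIL_PATTERN = re.compile(r'tail -n100 -f (?P<path>\S+\.log)')
# New UX 3.0: marker lines as they appear in the logs recorded in this PR.
LAUNCH_START_PATTERN = re.compile(
    r'Launching on (?P<cloud>\w+) (?P<region>\S+) \((?P<zone>[^)]+)\)')
INSTANCE_UP_PATTERN = re.compile(r'Instance is up\.')


def classify_line(line: str) -> Optional[str]:
    """Return a label for a recognized marker line, or None otherwise."""
    if OLD_TAIL_PATTERN.search(line):
        return 'old_tail_hint'
    if LAUNCH_START_PATTERN.search(line):
        return 'launch_start'
    if INSTANCE_UP_PATTERN.search(line):
        return 'instance_up'
    return None


def launch_target(line: str) -> Optional[dict]:
    """Extract cloud, region, and zone from a launch-start line, if present."""
    match = LAUNCH_START_PATTERN.search(line)
    return match.groupdict() if match else None
```

With patterns like these, a follower that only looked for the old `tail -n100 -f` hint would never fire on the new output, which is the mismatch this PR addresses.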

Tested (run the relevant ones):

FYI, here are the logs I recorded:

Logs
Start streaming logs for launching process of replica 1.
I 11-11 22:14:42 storage.py:873] Storage type StoreType.GCS already exists.
I 11-11 22:14:42 replica_managers.py:84] Launching replica (id: 1) cluster new-http-1 with resources: {GCP(cpus=2+, ports=['8080'])}
SkyPilot collects usage data to improve its services. `setup` and `run` commands are not collected to ensure privacy.
Usage logging can be disabled by setting the environment variable SKYPILOT_DISABLE_USAGE_COLLECTION=1.
I 11-11 22:14:45 optimizer.py:737] Target: minimizing cost
I 11-11 22:14:45 optimizer.py:750] Estimated cost: $0.1 / hour
I 11-11 22:14:45 optimizer.py:750] 
I 11-11 22:14:45 optimizer.py:885] Considered resources (1 node):
I 11-11 22:14:45 optimizer.py:955] ----------------------------------------------------------------------------------------------
I 11-11 22:14:45 optimizer.py:955]  CLOUD   INSTANCE        vCPUs   Mem(GB)   ACCELERATORS   REGION/ZONE     COST ($)   CHOSEN   
I 11-11 22:14:45 optimizer.py:955] ----------------------------------------------------------------------------------------------
I 11-11 22:14:45 optimizer.py:955]  GCP     n2-standard-2   2       8         -              us-central1-a   0.10          ✔     
I 11-11 22:14:45 optimizer.py:955] ----------------------------------------------------------------------------------------------
I 11-11 22:14:48 cloud_vm_ray_backend.py:1505] ⚙︎ Launching on GCP us-central1 (us-central1-a).
I 11-11 22:15:28 provisioner.py:445] └── Instance is up.
D 11-11 22:14:48 provisioner.py:135] SkyPilot version: 1.0.0-dev0; commit: c8af39739ab1398a3d7bf1339bd35fa71331a95c
D 11-11 22:14:48 provisioner.py:137] 
D 11-11 22:14:48 provisioner.py:137] 
D 11-11 22:14:48 provisioner.py:137] ==================== Provisioning ====================
D 11-11 22:14:48 provisioner.py:137] 
D 11-11 22:14:48 provisioner.py:138] Provision config:
D 11-11 22:14:48 provisioner.py:138] {
D 11-11 22:14:48 provisioner.py:138]   "provider_config": {
D 11-11 22:14:48 provisioner.py:138]     "type": "external",
D 11-11 22:14:48 provisioner.py:138]     "module": "sky.provision.gcp",
D 11-11 22:14:48 provisioner.py:138]     "region": "us-central1",
D 11-11 22:14:48 provisioner.py:138]     "availability_zone": "us-central1-a",
D 11-11 22:14:48 provisioner.py:138]     "cache_stopped_nodes": true,
D 11-11 22:14:48 provisioner.py:138]     "project_id": "psychic-order-437203-r7",
D 11-11 22:14:48 provisioner.py:138]     "firewall_rule": "sky-ports-new-http-2-6eab",
D 11-11 22:14:48 provisioner.py:138]     "use_internal_ips": false,
D 11-11 22:14:48 provisioner.py:138]     "force_enable_external_ips": false,
D 11-11 22:14:48 provisioner.py:138]     "disable_launch_config_check": true,
D 11-11 22:14:48 provisioner.py:138]     "use_managed_instance_group": false
D 11-11 22:14:48 provisioner.py:138]   },
D 11-11 22:14:48 provisioner.py:138]   "authentication_config": {
D 11-11 22:14:48 provisioner.py:138]     "ssh_user": "gcpuser",
D 11-11 22:14:48 provisioner.py:138]     "ssh_private_key": "~/.ssh/sky-key"
D 11-11 22:14:48 provisioner.py:138]   },
D 11-11 22:14:48 provisioner.py:138]   "docker_config": {},
D 11-11 22:14:48 provisioner.py:138]   "node_config": {
D 11-11 22:14:48 provisioner.py:138]     "labels": {
D 11-11 22:14:48 provisioner.py:138]       "skypilot-user": "andyl",
D 11-11 22:14:48 provisioner.py:138]       "use-managed-instance-group": "0"
D 11-11 22:14:48 provisioner.py:138]     },
D 11-11 22:14:48 provisioner.py:138]     "machineType": "n2-standard-2",
D 11-11 22:14:48 provisioner.py:138]     "disks": [
D 11-11 22:14:48 provisioner.py:138]       {
D 11-11 22:14:48 provisioner.py:138]         "boot": true,
D 11-11 22:14:48 provisioner.py:138]         "autoDelete": true,
D 11-11 22:14:48 provisioner.py:138]         "type": "PERSISTENT",
D 11-11 22:14:48 provisioner.py:138]         "initializeParams": {
D 11-11 22:14:48 provisioner.py:138]           "diskSizeGb": 256,
D 11-11 22:14:48 provisioner.py:138]           "sourceImage": "projects/sky-dev-465/global/images/skypilot-gcp-cpu-ubuntu-241030",
D 11-11 22:14:48 provisioner.py:138]           "diskType": "zones/us-central1-a/diskTypes/pd-balanced"
D 11-11 22:14:48 provisioner.py:138]         }
D 11-11 22:14:48 provisioner.py:138]       }
D 11-11 22:14:48 provisioner.py:138]     ],
D 11-11 22:14:48 provisioner.py:138]     "metadata": {
D 11-11 22:14:48 provisioner.py:138]       "items": [
D 11-11 22:14:48 provisioner.py:138]         {
D 11-11 22:14:48 provisioner.py:138]           "key": "ssh-keys",
D 11-11 22:14:48 provisioner.py:138]           "value": "gcpuser:ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDwKA5nK7TavUub8BXVX3sVr81QECMjl7rKoG5Sa8lU+IyZxdlza1aM9INgYag6F83W1XFHpYZY+b2lrcVeVV+jr3MX7ynWXY+PZjlNDS2jmHOMBbkNTL2Y5DuubmhvMC52FUmg8z3VFLwVYeWM8//45nCSGBOTIe2V7B9A0TFmDGKl/xDEuKL9GvwqUHHvOTfLZ8s5HL199qXyzhRChXJaF0e5RoS4yWtVqjAu04KxLJjU5mNGTlsuWg+2yVifTQt4jT6H4bkedC8UIk60xCuu1SGLZwPVrfLbjFcHdgZxLzPEodE3DthOSGgu/fR/KcpbR8yIQLrOyw9kilKJDUf/"
D 11-11 22:14:48 provisioner.py:138]         }
D 11-11 22:14:48 provisioner.py:138]       ]
D 11-11 22:14:48 provisioner.py:138]     }
D 11-11 22:14:48 provisioner.py:138]   },
D 11-11 22:14:48 provisioner.py:138]   "count": 1,
D 11-11 22:14:48 provisioner.py:138]   "tags": {},
D 11-11 22:14:48 provisioner.py:138]   "resume_stopped_nodes": true,
D 11-11 22:14:48 provisioner.py:138]   "ports_to_open_on_launch": null
D 11-11 22:14:48 provisioner.py:138] }
D 11-11 22:14:48 config.py:117] gcp_credentials not found in cluster yaml file. Falling back to GOOGLE_APPLICATION_CREDENTIALS environment variable.
D 11-11 22:14:48 provisioner.py:135] SkyPilot version: 1.0.0-dev0; commit: c8af39739ab1398a3d7bf1339bd35fa71331a95c
D 11-11 22:14:48 provisioner.py:137] 
D 11-11 22:14:48 provisioner.py:137] 
D 11-11 22:14:48 provisioner.py:137] ==================== Provisioning ====================
D 11-11 22:14:48 provisioner.py:137] 
D 11-11 22:14:48 provisioner.py:138] Provision config:
D 11-11 22:14:48 provisioner.py:138] {
D 11-11 22:14:48 provisioner.py:138]   "provider_config": {
D 11-11 22:14:48 provisioner.py:138]     "type": "external",
D 11-11 22:14:48 provisioner.py:138]     "module": "sky.provision.gcp",
D 11-11 22:14:48 provisioner.py:138]     "region": "us-central1",
D 11-11 22:14:48 provisioner.py:138]     "availability_zone": "us-central1-a",
D 11-11 22:14:48 provisioner.py:138]     "cache_stopped_nodes": true,
D 11-11 22:14:48 provisioner.py:138]     "project_id": "psychic-order-437203-r7",
D 11-11 22:14:48 provisioner.py:138]     "firewall_rule": "sky-ports-new-http-1-6eab",
D 11-11 22:14:48 provisioner.py:138]     "use_internal_ips": false,
D 11-11 22:14:48 provisioner.py:138]     "force_enable_external_ips": false,
D 11-11 22:14:48 provisioner.py:138]     "disable_launch_config_check": true,
D 11-11 22:14:48 provisioner.py:138]     "use_managed_instance_group": false
D 11-11 22:14:48 provisioner.py:138]   },
D 11-11 22:14:48 provisioner.py:138]   "authentication_config": {
D 11-11 22:14:48 provisioner.py:138]     "ssh_user": "gcpuser",
D 11-11 22:14:48 provisioner.py:138]     "ssh_private_key": "~/.ssh/sky-key"
D 11-11 22:14:48 provisioner.py:138]   },
D 11-11 22:14:48 provisioner.py:138]   "docker_config": {},
D 11-11 22:14:48 provisioner.py:138]   "node_config": {
D 11-11 22:14:48 provisioner.py:138]     "labels": {
D 11-11 22:14:48 provisioner.py:138]       "skypilot-user": "andyl",
D 11-11 22:14:48 provisioner.py:138]       "use-managed-instance-group": "0"
D 11-11 22:14:48 provisioner.py:138]     },
D 11-11 22:14:48 provisioner.py:138]     "machineType": "n2-standard-2",
D 11-11 22:14:48 provisioner.py:138]     "disks": [
D 11-11 22:14:48 provisioner.py:138]       {
D 11-11 22:14:48 provisioner.py:138]         "boot": true,
D 11-11 22:14:48 provisioner.py:138]         "autoDelete": true,
D 11-11 22:14:48 provisioner.py:138]         "type": "PERSISTENT",
D 11-11 22:14:48 provisioner.py:138]         "initializeParams": {
D 11-11 22:14:48 provisioner.py:138]           "diskSizeGb": 256,
D 11-11 22:14:48 provisioner.py:138]           "sourceImage": "projects/sky-dev-465/global/images/skypilot-gcp-cpu-ubuntu-241030",
D 11-11 22:14:48 provisioner.py:138]           "diskType": "zones/us-central1-a/diskTypes/pd-balanced"
D 11-11 22:14:48 provisioner.py:138]         }
D 11-11 22:14:48 provisioner.py:138]       }
D 11-11 22:14:48 provisioner.py:138]     ],
D 11-11 22:14:48 provisioner.py:138]     "metadata": {
D 11-11 22:14:48 provisioner.py:138]       "items": [
D 11-11 22:14:48 provisioner.py:138]         {
D 11-11 22:14:48 provisioner.py:138]           "key": "ssh-keys",
D 11-11 22:14:48 provisioner.py:138]           "value": "gcpuser:ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDwKA5nK7TavUub8BXVX3sVr81QECMjl7rKoG5Sa8lU+IyZxdlza1aM9INgYag6F83W1XFHpYZY+b2lrcVeVV+jr3MX7ynWXY+PZjlNDS2jmHOMBbkNTL2Y5DuubmhvMC52FUmg8z3VFLwVYeWM8//45nCSGBOTIe2V7B9A0TFmDGKl/xDEuKL9GvwqUHHvOTfLZ8s5HL199qXyzhRChXJaF0e5RoS4yWtVqjAu04KxLJjU5mNGTlsuWg+2yVifTQt4jT6H4bkedC8UIk60xCuu1SGLZwPVrfLbjFcHdgZxLzPEodE3DthOSGgu/fR/KcpbR8yIQLrOyw9kilKJDUf/"
D 11-11 22:14:48 provisioner.py:138]         }
D 11-11 22:14:48 provisioner.py:138]       ]
D 11-11 22:14:48 provisioner.py:138]     }
D 11-11 22:14:48 provisioner.py:138]   },
D 11-11 22:14:48 provisioner.py:138]   "count": 1,
D 11-11 22:14:48 provisioner.py:138]   "tags": {},
D 11-11 22:14:48 provisioner.py:138]   "resume_stopped_nodes": true,
D 11-11 22:14:48 provisioner.py:138]   "ports_to_open_on_launch": null
D 11-11 22:14:48 provisioner.py:138] }
D 11-11 22:14:48 config.py:117] gcp_credentials not found in cluster yaml file. Falling back to GOOGLE_APPLICATION_CREDENTIALS environment variable.
I 11-11 22:14:51 config.py:217] _configure_iam_role: Checking permissions for skypilot-v1@psychic-order-437203-r7.iam.gserviceaccount.com...
I 11-11 22:14:51 config.py:217] _configure_iam_role: Checking permissions for skypilot-v1@psychic-order-437203-r7.iam.gserviceaccount.com...
I 11-11 22:14:51 config.py:613] get_usable_vpc: Found a usable VPC network 'default'.
I 11-11 22:14:51 config.py:613] get_usable_vpc: Found a usable VPC network 'default'.
I 11-11 22:14:53 instance.py:212] []
D 11-11 22:14:53 instance_utils.py:802] Launching GCP instances with "bulkInsert" ...
I 11-11 22:14:53 instance.py:212] []
D 11-11 22:14:53 instance_utils.py:802] Launching GCP instances with "bulkInsert" ...
D 11-11 22:14:55 instance_utils.py:886] Waiting GCP instances to be ready ...
D 11-11 22:14:55 instance_utils.py:886] Waiting GCP instances to be ready ...
D 11-11 22:14:56 instance_utils.py:431] Waiting GCP operation operation-1731363294065-626aa6e1eb230-9b4072c5-506532cc to be ready ...
D 11-11 22:14:56 instance_utils.py:431] Waiting GCP operation operation-1731363294073-626aa6e1ed244-726b04e4-a3382ac8 to be ready ...
D 11-11 22:15:15 instance_utils.py:461] wait_operations: Failed to create instances. Reason: [{'code': 'VM_MIN_COUNT_NOT_REACHED', 'message': 'Requested minimum count of 1 VMs could not be created.'}, {'code': 'QUOTA_EXCEEDED', 'message': "Quota 'SSD_TOTAL_GB' exceeded.  Limit: 500.0 in region us-central1.", 'errorDetails': [{'quotaInfo': {'metricName': 'compute.googleapis.com/ssd_total_storage', 'limitName': 'SSD-TOTAL-GB-per-project-region', 'dimensions': {'region': 'us-central1'}, 'limit': 500}}]}]
W 11-11 22:15:15 instance_utils.py:112] Got return codes 'VM_MIN_COUNT_NOT_REACHED', 'QUOTA_EXCEEDED' in us-central1-a: 'Requested minimum count of 1 VMs could not be created'; "Quota 'SSD_TOTAL_GB' exceeded.  Limit: 500.0 in region us-central1"
D 11-11 22:15:15 provisioner.py:150] Failed to provision 'new-http-2' on GCP (us-central1-a).
D 11-11 22:15:15 provisioner.py:152] bulk_provision for 'new-http-2' failed. Stacktrace:
D 11-11 22:15:15 provisioner.py:152] Traceback (most recent call last):
D 11-11 22:15:15 provisioner.py:152]   File "/home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/sky/provision/provisioner.py", line 141, in bulk_provision
D 11-11 22:15:15 provisioner.py:152]     return _bulk_provision(cloud, region, cluster_name,
D 11-11 22:15:15 provisioner.py:152]   File "/home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/sky/provision/provisioner.py", line 63, in _bulk_provision
D 11-11 22:15:15 provisioner.py:152]     provision_record = provision.run_instances(provider_name,
D 11-11 22:15:15 provisioner.py:152]   File "/home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/sky/provision/__init__.py", line 50, in _wrapper
D 11-11 22:15:15 provisioner.py:152]     return impl(*args, **kwargs)
D 11-11 22:15:15 provisioner.py:152]   File "/home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance.py", line 360, in run_instances
D 11-11 22:15:15 provisioner.py:152]     return _run_instances(region, cluster_name_on_cloud, config)
D 11-11 22:15:15 provisioner.py:152]   File "/home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/sky/provision/gcp/instance.py", line 301, in _run_instances
D 11-11 22:15:15 provisioner.py:152]     raise error
D 11-11 22:15:15 provisioner.py:152] sky.provision.common.ProvisionerError: Failed to launch instances.
D 11-11 22:15:15 provisioner.py:152] 
D 11-11 22:15:15 provisioner.py:157] Terminating the failed cluster.
D 11-11 22:15:15 instance.py:36] handlers: []
D 11-11 22:15:16 instance.py:47] handler_to_instances: defaultdict(, {})
D 11-11 22:15:16 metadata_utils.py:115] Remove metadata of cluster new-http-2-6eab.
D 11-11 22:15:18 instance_utils.py:431] Waiting GCP operation operation-1731363317093-626aa6f7e13d1-912ef3cf-1bcc858b to be ready ...
D 11-11 22:15:21 provisioner.py:135] SkyPilot version: 1.0.0-dev0; commit: c8af39739ab1398a3d7bf1339bd35fa71331a95c
D 11-11 22:15:21 provisioner.py:137] 
D 11-11 22:15:21 provisioner.py:137] 
D 11-11 22:15:21 provisioner.py:137] ==================== Provisioning ====================
D 11-11 22:15:21 provisioner.py:137] 
D 11-11 22:15:21 provisioner.py:138] Provision config:
D 11-11 22:15:21 provisioner.py:138] {
D 11-11 22:15:21 provisioner.py:138]   "provider_config": {
D 11-11 22:15:21 provisioner.py:138]     "type": "external",
D 11-11 22:15:21 provisioner.py:138]     "module": "sky.provision.gcp",
D 11-11 22:15:21 provisioner.py:138]     "region": "us-east1",
D 11-11 22:15:21 provisioner.py:138]     "availability_zone": "us-east1-b",
D 11-11 22:15:21 provisioner.py:138]     "cache_stopped_nodes": true,
D 11-11 22:15:21 provisioner.py:138]     "project_id": "psychic-order-437203-r7",
D 11-11 22:15:21 provisioner.py:138]     "firewall_rule": "sky-ports-new-http-2-6eab",
D 11-11 22:15:21 provisioner.py:138]     "use_internal_ips": false,
D 11-11 22:15:21 provisioner.py:138]     "force_enable_external_ips": false,
D 11-11 22:15:21 provisioner.py:138]     "disable_launch_config_check": true,
D 11-11 22:15:21 provisioner.py:138]     "use_managed_instance_group": false
D 11-11 22:15:21 provisioner.py:138]   },
D 11-11 22:15:21 provisioner.py:138]   "authentication_config": {
D 11-11 22:15:21 provisioner.py:138]     "ssh_user": "gcpuser",
D 11-11 22:15:21 provisioner.py:138]     "ssh_private_key": "~/.ssh/sky-key"
D 11-11 22:15:21 provisioner.py:138]   },
D 11-11 22:15:21 provisioner.py:138]   "docker_config": {},
D 11-11 22:15:21 provisioner.py:138]   "node_config": {
D 11-11 22:15:21 provisioner.py:138]     "labels": {
D 11-11 22:15:21 provisioner.py:138]       "skypilot-user": "andyl",
D 11-11 22:15:21 provisioner.py:138]       "use-managed-instance-group": "0"
D 11-11 22:15:21 provisioner.py:138]     },
D 11-11 22:15:21 provisioner.py:138]     "machineType": "n2-standard-2",
D 11-11 22:15:21 provisioner.py:138]     "disks": [
D 11-11 22:15:21 provisioner.py:138]       {
D 11-11 22:15:21 provisioner.py:138]         "boot": true,
D 11-11 22:15:21 provisioner.py:138]         "autoDelete": true,
D 11-11 22:15:21 provisioner.py:138]         "type": "PERSISTENT",
D 11-11 22:15:21 provisioner.py:138]         "initializeParams": {
D 11-11 22:15:21 provisioner.py:138]           "diskSizeGb": 256,
D 11-11 22:15:21 provisioner.py:138]           "sourceImage": "projects/sky-dev-465/global/images/skypilot-gcp-cpu-ubuntu-241030",
D 11-11 22:15:21 provisioner.py:138]           "diskType": "zones/us-east1-b/diskTypes/pd-balanced"
D 11-11 22:15:21 provisioner.py:138]         }
D 11-11 22:15:21 provisioner.py:138]       }
D 11-11 22:15:21 provisioner.py:138]     ],
D 11-11 22:15:21 provisioner.py:138]     "metadata": {
D 11-11 22:15:21 provisioner.py:138]       "items": [
D 11-11 22:15:21 provisioner.py:138]         {
D 11-11 22:15:21 provisioner.py:138]           "key": "ssh-keys",
D 11-11 22:15:21 provisioner.py:138]           "value": "gcpuser:ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDwKA5nK7TavUub8BXVX3sVr81QECMjl7rKoG5Sa8lU+IyZxdlza1aM9INgYag6F83W1XFHpYZY+b2lrcVeVV+jr3MX7ynWXY+PZjlNDS2jmHOMBbkNTL2Y5DuubmhvMC52FUmg8z3VFLwVYeWM8//45nCSGBOTIe2V7B9A0TFmDGKl/xDEuKL9GvwqUHHvOTfLZ8s5HL199qXyzhRChXJaF0e5RoS4yWtVqjAu04KxLJjU5mNGTlsuWg+2yVifTQt4jT6H4bkedC8UIk60xCuu1SGLZwPVrfLbjFcHdgZxLzPEodE3DthOSGgu/fR/KcpbR8yIQLrOyw9kilKJDUf/"
D 11-11 22:15:21 provisioner.py:138]         }
D 11-11 22:15:21 provisioner.py:138]       ]
D 11-11 22:15:21 provisioner.py:138]     }
D 11-11 22:15:21 provisioner.py:138]   },
D 11-11 22:15:21 provisioner.py:138]   "count": 1,
D 11-11 22:15:21 provisioner.py:138]   "tags": {},
D 11-11 22:15:21 provisioner.py:138]   "resume_stopped_nodes": true,
D 11-11 22:15:21 provisioner.py:138]   "ports_to_open_on_launch": null
D 11-11 22:15:21 provisioner.py:138] }
D 11-11 22:15:21 config.py:117] gcp_credentials not found in cluster yaml file. Falling back to GOOGLE_APPLICATION_CREDENTIALS environment variable.
D 11-11 22:15:22 provisioner.py:69] 
D 11-11 22:15:22 provisioner.py:69] Waiting for instances of 'new-http-1' to be ready...
D 11-11 22:15:23 provisioner.py:89] Instances of 'new-http-1' are ready after 0 retries.
D 11-11 22:15:23 provisioner.py:92] 
D 11-11 22:15:23 provisioner.py:92] Provisioning 'new-http-1' took 34.73 seconds.
I 11-11 22:15:23 config.py:217] _configure_iam_role: Checking permissions for skypilot-v1@psychic-order-437203-r7.iam.gserviceaccount.com...
D 11-11 22:15:24 provisioner.py:575] 
D 11-11 22:15:24 provisioner.py:575] 
D 11-11 22:15:24 provisioner.py:575] ==================== System Setup After Provision ====================
D 11-11 22:15:24 provisioner.py:575] 
D 11-11 22:15:24 instance.py:36] handlers: []
I 11-11 22:15:24 config.py:613] get_usable_vpc: Found a usable VPC network 'default'.
D 11-11 22:15:25 instance.py:47] handler_to_instances: defaultdict(, {: ['new-http-1-6eab-head-ameg8r3q-compute']})
D 11-11 22:15:25 instance.py:36] handlers: []
I 11-11 22:15:26 instance.py:212] []
D 11-11 22:15:26 instance_utils.py:802] Launching GCP instances with "bulkInsert" ...
D 11-11 22:15:26 instance.py:47] handler_to_instances: defaultdict(, {: ['new-http-1-6eab-head-ameg8r3q-compute']})
D 11-11 22:15:26 provisioner.py:411] Provision record:
D 11-11 22:15:26 provisioner.py:411] {
D 11-11 22:15:26 provisioner.py:411]   "provider_name": "gcp",
D 11-11 22:15:26 provisioner.py:411]   "region": "us-central1",
D 11-11 22:15:26 provisioner.py:411]   "zone": "us-central1-a",
D 11-11 22:15:26 provisioner.py:411]   "cluster_name": "new-http-1-6eab",
D 11-11 22:15:26 provisioner.py:411]   "head_instance_id": "new-http-1-6eab-head-ameg8r3q-compute",
D 11-11 22:15:26 provisioner.py:411]   "resumed_instance_ids": [],
D 11-11 22:15:26 provisioner.py:411]   "created_instance_ids": [
D 11-11 22:15:26 provisioner.py:411]     "new-http-1-6eab-head-ameg8r3q-compute"
D 11-11 22:15:26 provisioner.py:411]   ]
D 11-11 22:15:26 provisioner.py:411] }
D 11-11 22:15:26 provisioner.py:411] Cluster info:
D 11-11 22:15:26 provisioner.py:411] {
D 11-11 22:15:26 provisioner.py:411]   "instances": {
D 11-11 22:15:26 provisioner.py:411]     "new-http-1-6eab-head-ameg8r3q-compute": [
D 11-11 22:15:26 provisioner.py:411]       {
D 11-11 22:15:26 provisioner.py:411]         "instance_id": "new-http-1-6eab-head-ameg8r3q-compute",
D 11-11 22:15:26 provisioner.py:411]         "internal_ip": "10.128.0.19",
D 11-11 22:15:26 provisioner.py:411]         "external_ip": "34.28.7.77",
D 11-11 22:15:26 provisioner.py:411]         "tags": {
D 11-11 22:15:26 provisioner.py:411]           "skypilot-user": "andyl",
D 11-11 22:15:26 provisioner.py:411]           "use-managed-instance-group": "0",
D 11-11 22:15:26 provisioner.py:411]           "ray-cluster-name": "new-http-1-6eab",
D 11-11 22:15:26 provisioner.py:411]           "skypilot-cluster-name": "new-http-1-6eab",
D 11-11 22:15:26 provisioner.py:411]           "ray-node-type": "head",
D 11-11 22:15:26 provisioner.py:411]           "skypilot-head-node": "1"
D 11-11 22:15:26 provisioner.py:411]         },
D 11-11 22:15:26 provisioner.py:411]         "ssh_port": 22
D 11-11 22:15:26 provisioner.py:411]       }
D 11-11 22:15:26 provisioner.py:411]     ]
D 11-11 22:15:26 provisioner.py:411]   },
D 11-11 22:15:26 provisioner.py:411]   "head_instance_id": "new-http-1-6eab-head-ameg8r3q-compute",
D 11-11 22:15:26 provisioner.py:411]   "provider_name": "gcp",
D 11-11 22:15:26 provisioner.py:411]   "provider_config": {
D 11-11 22:15:26 provisioner.py:411]     "type": "external",
D 11-11 22:15:26 provisioner.py:411]     "module": "sky.provision.gcp",
D 11-11 22:15:26 provisioner.py:411]     "region": "us-central1",
D 11-11 22:15:26 provisioner.py:411]     "availability_zone": "us-central1-a",
D 11-11 22:15:26 provisioner.py:411]     "cache_stopped_nodes": true,
D 11-11 22:15:26 provisioner.py:411]     "project_id": "psychic-order-437203-r7",
D 11-11 22:15:26 provisioner.py:411]     "firewall_rule": "sky-ports-new-http-1-6eab",
D 11-11 22:15:26 provisioner.py:411]     "use_internal_ips": false,
D 11-11 22:15:26 provisioner.py:411]     "force_enable_external_ips": false,
D 11-11 22:15:26 provisioner.py:411]     "disable_launch_config_check": true,
D 11-11 22:15:26 provisioner.py:411]     "use_managed_instance_group": false
D 11-11 22:15:26 provisioner.py:411]   },
D 11-11 22:15:26 provisioner.py:411]   "docker_user": null,
D 11-11 22:15:26 provisioner.py:411]   "ssh_user": null,
D 11-11 22:15:26 provisioner.py:411]   "custom_ray_options": null
D 11-11 22:15:26 provisioner.py:411] }
D 11-11 22:15:26 provisioner.py:436] 
D 11-11 22:15:26 provisioner.py:436] Waiting for SSH to be available for 'new-http-1' ...
D 11-11 22:15:26 provisioner.py:326] Waiting for SSH using command: ssh -T -i '~/.ssh/sky-key' gcpuser@34.28.7.77 -p 22 -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o ConnectTimeout=10s -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o AddKeysToAgent=yes -o ExitOnForwardFailure=yes -o ServerAliveInterval=5 -o ServerAliveCountMax=3 uptime
D 11-11 22:15:27 provisioner.py:380] Retrying in 1 second...
D 11-11 22:15:28 instance_utils.py:886] Waiting GCP instances to be ready ...
D 11-11 22:15:28 provisioner.py:439] SSH Connection ready for 'new-http-1'
I 11-11 22:15:28 provisioner.py:445] └── Instance is up.
I 11-11 22:15:28 common.py:292] 
I 11-11 22:15:28 common.py:292] --------------------Start: internal_file_mounts --------------------
D 11-11 22:15:28 instance_setup.py:502] Using 4 workers for file mounts.
sending incremental file list
./
026030db-0ae9-43d1-8bf8-633fd5cec802

            400 100%    0.00kB/s    0:00:00  
            400 100%    0.00kB/s    0:00:00 (xfr#1, to-chk=15/17)
7877c1c8-e359-4fe7-989b-8774dd6a2394

              7 100%    6.84kB/s    0:00:00  
              7 100%    6.84kB/s    0:00:00 (xfr#2, to-chk=14/17)
8338f652-9425-4705-b38e-f381ead91a20

         12,288 100%   11.72MB/s    0:00:00  
         12,288 100%   11.72MB/s    0:00:00 (xfr#3, to-chk=13/17)
840ad7e4-981a-463a-8e9a-2744a7489b2c

          6,458 100%    6.16MB/s    0:00:00  
          6,458 100%    6.16MB/s    0:00:00 (xfr#4, to-chk=12/17)
901cb809-4c90-4be2-a12c-341d2df42640

             27 100%   26.37kB/s    0:00:00  
             27 100%   26.37kB/s    0:00:00 (xfr#5, to-chk=11/17)
dd73c306-3b9d-4033-a939-5746ab18fba6

              5 100%    4.88kB/s    0:00:00  
              5 100%    4.88kB/s    0:00:00 (xfr#6, to-chk=10/17)
ebfa708b-6df5-41bc-9803-38cdb2d357b7

         11,139 100%   10.62MB/s    0:00:00  
         11,139 100%   10.62MB/s    0:00:00 (xfr#7, to-chk=9/17)
fccedb1e-6e78-4a6f-94cc-5be5bcc6023d

         12,288 100%   11.72MB/s    0:00:00  
         12,288 100%   11.72MB/s    0:00:00 (xfr#8, to-chk=8/17)
23b8be71-6652-44ed-8410-8dbe0b31a78e/
23b8be71-6652-44ed-8410-8dbe0b31a78e/skypilot-1.0.0.dev0-py3-none-any.whl

         32,768   3%   31.25MB/s    0:00:00  
      1,009,784 100%  107.00MB/s    0:00:00 (xfr#9, to-chk=4/17)
a4064268-89e0-4e4e-9652-d183ccf74761/
a4064268-89e0-4e4e-9652-d183ccf74761/config_default

             75 100%    8.14kB/s    0:00:00  
             75 100%    8.14kB/s    0:00:00 (xfr#10, to-chk=3/17)
eb4d8cf0-ff88-4743-a921-2fd8cc6ef08b/
eb4d8cf0-ff88-4743-a921-2fd8cc6ef08b/zhifei.li@berkeley.edu/
eb4d8cf0-ff88-4743-a921-2fd8cc6ef08b/zhifei.li@berkeley.edu/.boto

            245 100%   23.93kB/s    0:00:00  
            245 100%   23.93kB/s    0:00:00 (xfr#11, to-chk=1/17)
eb4d8cf0-ff88-4743-a921-2fd8cc6ef08b/zhifei.li@berkeley.edu/adc.json

            317 100%   30.96kB/s    0:00:00  
            317 100%   30.96kB/s    0:00:00 (xfr#12, to-chk=0/17)

sent 1,006,860 bytes  received 271 bytes  2,014,262.00 bytes/sec
total size is 1,053,033  speedup is 1.05
I 11-11 22:15:29 common.py:296] --------------------End:   internal_file_mounts --------------------
I 11-11 22:15:29 common.py:296] 
I 11-11 22:15:29 common.py:292] 
I 11-11 22:15:29 common.py:292] --------------------Start: setup_runtime_on_cluster --------------------
D 11-11 22:15:29 metadata_utils.py:66] Need to run stage setup_runtime_on_cluster on instance new-http-1-6eab-head-ameg8r3q-compute-0: True
D 11-11 22:15:29 instance_utils.py:431] Waiting GCP operation operation-1731363327154-626aa70179953-df7a788c-717f3da6 to be ready ...
Synchronizing state of unattended-upgrades.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install disable unattended-upgrades

Usage:
 kill [options] <pid> [...]

Options:
 <pid> [...]            send signal to every <pid> listed
 -<signal>, -s, --signal <signal>
                        specify the <signal> to be sent
 -q, --queue <value>    integer value to be sent with the signal
 -l, --list=[<signal>]  list all signal names, or convert one to a name
 -L, --table            list all signal names in a nice table

 -h, --help     display this help and exit
 -V, --version  output version information and exit

For more details see kill(1).
# >>> conda initialize >>>
PATH=/home/gcpuser/miniconda3/bin:/home/gcpuser/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
Requirement already satisfied: setuptools<70 in ./skypilot-runtime/lib/python3.10/site-packages (65.5.0)
WARNING: Skipping skypilot as it is not installed.
Processing ./.sky/wheels/e5c76ef08caac752a65c61006aca332f/skypilot-1.0.0.dev0-py3-none-any.whl
Requirement already satisfied: PrettyTable>=2.0.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (3.12.0)
Requirement already satisfied: jsonschema in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (4.23.0)
Requirement already satisfied: filelock>=3.6.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (3.16.1)
Requirement already satisfied: networkx in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (3.4.2)
Requirement already satisfied: tabulate in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (0.9.0)
Requirement already satisfied: psutil in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (6.1.0)
Requirement already satisfied: colorama in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (0.4.6)
Requirement already satisfied: python-dotenv in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (1.0.1)
Requirement already satisfied: packaging in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (24.1)
Requirement already satisfied: jinja2>=3.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (3.1.4)
Requirement already satisfied: wheel in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (0.44.0)
Requirement already satisfied: cryptography in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (43.0.3)
Requirement already satisfied: typing-extensions in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (4.12.2)
Requirement already satisfied: pulp in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (2.9.0)
Requirement already satisfied: requests in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (2.32.3)
Requirement already satisfied: pendulum in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (3.0.0)
Requirement already satisfied: click>=7.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (8.1.7)
Requirement already satisfied: rich in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (13.9.3)
Requirement already satisfied: pandas>=1.3.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (2.2.3)
Requirement already satisfied: pyyaml!=5.4.*,>3.13 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (6.0.2)
Requirement already satisfied: cachetools in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (5.5.0)
Requirement already satisfied: google-api-python-client>=2.69.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (2.149.0)
Requirement already satisfied: google-cloud-storage in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (2.18.2)
Requirement already satisfied: grpcio!=1.48.0,<=1.51.3,>=1.42.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (1.51.3)
Requirement already satisfied: protobuf!=3.19.5,>=3.15.3 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (5.28.3)
Requirement already satisfied: pydantic!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<3 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (2.9.2)
Requirement already satisfied: google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0.dev0,>=1.31.5 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (2.22.0)
Requirement already satisfied: uritemplate<5,>=3.0.1 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (4.1.1)
Requirement already satisfied: httplib2<1.dev0,>=0.19.0 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (0.22.0)
Requirement already satisfied: google-auth-httplib2<1.0.0,>=0.2.0 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (0.2.0)
Requirement already satisfied: google-auth!=2.24.0,!=2.25.0,<3.0.0.dev0,>=1.32.0 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (2.35.0)
Requirement already satisfied: MarkupSafe>=2.0 in ./skypilot-runtime/lib/python3.10/site-packages (from jinja2>=3.0->skypilot==1.0.0.dev0) (3.0.2)
Requirement already satisfied: pytz>=2020.1 in ./skypilot-runtime/lib/python3.10/site-packages (from pandas>=1.3.0->skypilot==1.0.0.dev0) (2024.2)
Requirement already satisfied: numpy>=1.22.4 in ./skypilot-runtime/lib/python3.10/site-packages (from pandas>=1.3.0->skypilot==1.0.0.dev0) (2.1.2)
Requirement already satisfied: tzdata>=2022.7 in ./skypilot-runtime/lib/python3.10/site-packages (from pandas>=1.3.0->skypilot==1.0.0.dev0) (2024.2)
Requirement already satisfied: python-dateutil>=2.8.2 in ./skypilot-runtime/lib/python3.10/site-packages (from pandas>=1.3.0->skypilot==1.0.0.dev0) (2.9.0.post0)
Requirement already satisfied: wcwidth in ./skypilot-runtime/lib/python3.10/site-packages (from PrettyTable>=2.0.0->skypilot==1.0.0.dev0) (0.2.13)
Requirement already satisfied: pydantic-core==2.23.4 in ./skypilot-runtime/lib/python3.10/site-packages (from pydantic!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<3->skypilot==1.0.0.dev0) (2.23.4)
Requirement already satisfied: annotated-types>=0.6.0 in ./skypilot-runtime/lib/python3.10/site-packages (from pydantic!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<3->skypilot==1.0.0.dev0) (0.7.0)
Requirement already satisfied: cffi>=1.12 in ./skypilot-runtime/lib/python3.10/site-packages (from cryptography->skypilot==1.0.0.dev0) (1.17.1)
Requirement already satisfied: google-crc32c<2.0dev,>=1.0 in ./skypilot-runtime/lib/python3.10/site-packages (from google-cloud-storage->skypilot==1.0.0.dev0) (1.6.0)
Requirement already satisfied: google-resumable-media>=2.7.2 in ./skypilot-runtime/lib/python3.10/site-packages (from google-cloud-storage->skypilot==1.0.0.dev0) (2.7.2)
Requirement already satisfied: google-cloud-core<3.0dev,>=2.3.0 in ./skypilot-runtime/lib/python3.10/site-packages (from google-cloud-storage->skypilot==1.0.0.dev0) (2.4.1)
Requirement already satisfied: idna<4,>=2.5 in ./skypilot-runtime/lib/python3.10/site-packages (from requests->skypilot==1.0.0.dev0) (3.10)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./skypilot-runtime/lib/python3.10/site-packages (from requests->skypilot==1.0.0.dev0) (2.2.3)
Requirement already satisfied: certifi>=2017.4.17 in ./skypilot-runtime/lib/python3.10/site-packages (from requests->skypilot==1.0.0.dev0) (2024.8.30)
Requirement already satisfied: charset-normalizer<4,>=2 in ./skypilot-runtime/lib/python3.10/site-packages (from requests->skypilot==1.0.0.dev0) (3.4.0)
Requirement already satisfied: attrs>=22.2.0 in ./skypilot-runtime/lib/python3.10/site-packages (from jsonschema->skypilot==1.0.0.dev0) (24.2.0)
Requirement already satisfied: referencing>=0.28.4 in ./skypilot-runtime/lib/python3.10/site-packages (from jsonschema->skypilot==1.0.0.dev0) (0.35.1)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in ./skypilot-runtime/lib/python3.10/site-packages (from jsonschema->skypilot==1.0.0.dev0) (2024.10.1)
Requirement already satisfied: rpds-py>=0.7.1 in ./skypilot-runtime/lib/python3.10/site-packages (from jsonschema->skypilot==1.0.0.dev0) (0.20.0)
Requirement already satisfied: time-machine>=2.6.0 in ./skypilot-runtime/lib/python3.10/site-packages (from pendulum->skypilot==1.0.0.dev0) (2.16.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in ./skypilot-runtime/lib/python3.10/site-packages (from rich->skypilot==1.0.0.dev0) (2.18.0)
Requirement already satisfied: markdown-it-py>=2.2.0 in ./skypilot-runtime/lib/python3.10/site-packages (from rich->skypilot==1.0.0.dev0) (3.0.0)
Requirement already satisfied: pycparser in ./skypilot-runtime/lib/python3.10/site-packages (from cffi>=1.12->cryptography->skypilot==1.0.0.dev0) (2.22)
Requirement already satisfied: googleapis-common-protos<2.0.dev0,>=1.56.2 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0.dev0,>=1.31.5->google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (1.65.0)
Requirement already satisfied: proto-plus<2.0.0dev,>=1.22.3 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0.dev0,>=1.31.5->google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (1.25.0)
Requirement already satisfied: pyasn1-modules>=0.2.1 in ./skypilot-runtime/lib/python3.10/site-packages (from google-auth!=2.24.0,!=2.25.0,<3.0.0.dev0,>=1.32.0->google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (0.4.1)
Requirement already satisfied: rsa<5,>=3.1.4 in ./skypilot-runtime/lib/python3.10/site-packages (from google-auth!=2.24.0,!=2.25.0,<3.0.0.dev0,>=1.32.0->google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (4.9)
Requirement already satisfied: pyparsing!=3.0.0,!=3.0.1,!=3.0.2,!=3.0.3,<4,>=2.4.2 in ./skypilot-runtime/lib/python3.10/site-packages (from httplib2<1.dev0,>=0.19.0->google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (3.2.0)
Requirement already satisfied: mdurl~=0.1 in ./skypilot-runtime/lib/python3.10/site-packages (from markdown-it-py>=2.2.0->rich->skypilot==1.0.0.dev0) (0.1.2)
Requirement already satisfied: six>=1.5 in ./skypilot-runtime/lib/python3.10/site-packages (from python-dateutil>=2.8.2->pandas>=1.3.0->skypilot==1.0.0.dev0) (1.16.0)
Requirement already satisfied: pyasn1<0.7.0,>=0.4.6 in ./skypilot-runtime/lib/python3.10/site-packages (from pyasn1-modules>=0.2.1->google-auth!=2.24.0,!=2.25.0,<3.0.0.dev0,>=1.32.0->google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (0.6.1)
Installing collected packages: skypilot
Successfully installed skypilot-1.0.0.dev0
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/_private/log_monitor.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/_private/log_monitor.py-v2.9.3.orig)
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/_private/worker.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/_private/worker.py-v2.9.3.orig)
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/dashboard/modules/job/cli.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/dashboard/modules/job/cli.py-v2.9.3.orig)
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/autoscaler.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/autoscaler.py-v2.9.3.orig)
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/command_runner.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/command_runner.py-v2.9.3.orig)
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/resource_demand_scheduler.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/resource_demand_scheduler.py-v2.9.3.orig)
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/updater.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/updater.py-v2.9.3.orig)
DefaultTasksMax=infinity
I 11-11 22:15:39 common.py:296] --------------------End:   setup_runtime_on_cluster --------------------
I 11-11 22:15:39 common.py:296] 
D 11-11 22:15:39 provisioner.py:518] Starting Ray on the entire cluster.
I 11-11 22:15:39 common.py:292] 
I 11-11 22:15:39 common.py:292] --------------------Start: start_ray_on_head_node --------------------
I 11-11 22:15:39 instance_setup.py:308] Running command on head node: $([ -s ~/.sky/python_path ] && cat ~/.sky/python_path 2> /dev/null || which python3) $([ -s ~/.sky/ray_path ] && cat ~/.sky/ray_path 2> /dev/null || which ray) stop; unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; RAY_SCHEDULER_EVENTS=0 RAY_DEDUP_LOGS=0 RAY_worker_maximum_startup_concurrency=$(( 3 * $(nproc --all) )) $([ -s ~/.sky/python_path ] && cat ~/.sky/python_path 2> /dev/null || which python3) $([ -s ~/.sky/ray_path ] && cat ~/.sky/ray_path 2> /dev/null || which ray) start --head --disable-usage-stats --port=6380 --dashboard-port=8266 --min-worker-port 11002 --object-manager-port=8076 --temp-dir=/tmp/ray_skypilot || exit 1;which prlimit && for id in $(pgrep -f raylet/raylet); do sudo prlimit --nofile=1048576:1048576 --pid=$id || true; done;$([ -s ~/.sky/python_path ] && cat ~/.sky/python_path 2> /dev/null || which python3) -c 'import json, os; json.dump({"ray_port":6380, "ray_dashboard_port":8266}, open(os.path.expanduser("~/.sky/ray_port.json"), "w", encoding="utf-8"))';while `RAY_ADDRESS=127.0.0.1:6380 $([ -s ~/.sky/python_path ] && cat ~/.sky/python_path 2> /dev/null || which python3) $([ -s ~/.sky/ray_path ] && cat ~/.sky/ray_path 2> /dev/null || which ray) status | grep -q "No cluster status."`; do sleep 0.5; echo "Waiting ray cluster to be initialized"; done;
2024-11-11 22:15:41,319 INFO scripts.py:1163 -- Did not find any active Ray processes.
D 11-11 22:15:43 instance_utils.py:431] Waiting GCP operation operation-1731363342206-626aa70fd460b-0904320b-7b2694c0 to be ready ...
2024-11-11 22:15:42,298 INFO usage_lib.py:423 -- Usage stats collection is disabled.
2024-11-11 22:15:42,298 INFO scripts.py:744 -- Local node IP: 10.128.0.19
2024-11-11 22:15:46,388 SUCC scripts.py:781 -- --------------------
2024-11-11 22:15:46,388 SUCC scripts.py:782 -- Ray runtime started.
2024-11-11 22:15:46,388 SUCC scripts.py:783 -- --------------------
2024-11-11 22:15:46,389 INFO scripts.py:785 -- Next steps
2024-11-11 22:15:46,389 INFO scripts.py:788 -- To add another node to this Ray cluster, run
2024-11-11 22:15:46,389 INFO scripts.py:791 --   ray start --address='10.128.0.19:6380'
2024-11-11 22:15:46,389 INFO scripts.py:800 -- To connect to this Ray cluster:
2024-11-11 22:15:46,389 INFO scripts.py:802 -- import ray
2024-11-11 22:15:46,389 INFO scripts.py:803 -- ray.init()
2024-11-11 22:15:46,389 INFO scripts.py:815 -- To submit a Ray job using the Ray Jobs CLI:
2024-11-11 22:15:46,389 INFO scripts.py:816 --   RAY_ADDRESS='http://127.0.0.1:8266' ray job submit --working-dir . -- python my_script.py
2024-11-11 22:15:46,389 INFO scripts.py:825 -- See https://docs.ray.io/en/latest/cluster/running-applications/job-submission/index.html 
2024-11-11 22:15:46,389 INFO scripts.py:829 -- for more information on submitting Ray jobs to the Ray cluster.
2024-11-11 22:15:46,389 INFO scripts.py:834 -- To terminate the Ray runtime, run
2024-11-11 22:15:46,389 INFO scripts.py:835 --   ray stop
2024-11-11 22:15:46,389 INFO scripts.py:838 -- To view the status of the cluster, use
2024-11-11 22:15:46,389 INFO scripts.py:839 --   ray status
2024-11-11 22:15:46,389 INFO scripts.py:843 -- To monitor and debug Ray, view the dashboard at 
2024-11-11 22:15:46,389 INFO scripts.py:844 --   127.0.0.1:8266
2024-11-11 22:15:46,389 INFO scripts.py:851 -- If connection to the dashboard fails, check your firewall settings and network configuration.
/usr/bin/prlimit
D 11-11 22:15:46 provisioner.py:69] 
D 11-11 22:15:46 provisioner.py:69] Waiting for instances of 'new-http-2' to be ready...
D 11-11 22:15:47 provisioner.py:89] Instances of 'new-http-2' are ready after 0 retries.
D 11-11 22:15:47 provisioner.py:92] 
D 11-11 22:15:47 provisioner.py:92] Provisioning 'new-http-2' took 26.70 seconds.
D 11-11 22:15:48 provisioner.py:575] 
D 11-11 22:15:48 provisioner.py:575] 
D 11-11 22:15:48 provisioner.py:575] ==================== System Setup After Provision ====================
D 11-11 22:15:48 provisioner.py:575] 
D 11-11 22:15:48 instance.py:36] handlers: []
Waiting ray cluster to be initialized
D 11-11 22:15:49 instance.py:47] handler_to_instances: defaultdict(, {: ['new-http-2-6eab-head-2l0pmb4r-compute']})
I 11-11 22:15:49 common.py:296] --------------------End:   start_ray_on_head_node --------------------
I 11-11 22:15:49 common.py:296] 
I 11-11 22:15:49 common.py:292] 
I 11-11 22:15:49 common.py:292] --------------------Start: start_skylet_on_head_node --------------------
I 11-11 22:15:49 instance_setup.py:434] Running command on head node: source ~/skypilot-runtime/bin/activate; $([ -s ~/.sky/python_path ] && cat ~/.sky/python_path 2> /dev/null || which python3) -m sky.skylet.attempt_skylet;
D 11-11 22:15:50 instance.py:36] handlers: []
Skylet is not running. Starting (version 8)...
I 11-11 22:15:51 common.py:296] --------------------End:   start_skylet_on_head_node --------------------
I 11-11 22:15:51 common.py:296] 
D 11-11 22:15:51 instance.py:47] handler_to_instances: defaultdict(, {: ['new-http-2-6eab-head-2l0pmb4r-compute']})
D 11-11 22:15:51 provisioner.py:411] Provision record:
D 11-11 22:15:51 provisioner.py:411] {
D 11-11 22:15:51 provisioner.py:411]   "provider_name": "gcp",
D 11-11 22:15:51 provisioner.py:411]   "region": "us-east1",
D 11-11 22:15:51 provisioner.py:411]   "zone": "us-east1-b",
D 11-11 22:15:51 provisioner.py:411]   "cluster_name": "new-http-2-6eab",
D 11-11 22:15:51 provisioner.py:411]   "head_instance_id": "new-http-2-6eab-head-2l0pmb4r-compute",
D 11-11 22:15:51 provisioner.py:411]   "resumed_instance_ids": [],
D 11-11 22:15:51 provisioner.py:411]   "created_instance_ids": [
D 11-11 22:15:51 provisioner.py:411]     "new-http-2-6eab-head-2l0pmb4r-compute"
D 11-11 22:15:51 provisioner.py:411]   ]
D 11-11 22:15:51 provisioner.py:411] }
D 11-11 22:15:51 provisioner.py:411] Cluster info:
D 11-11 22:15:51 provisioner.py:411] {
D 11-11 22:15:51 provisioner.py:411]   "instances": {
D 11-11 22:15:51 provisioner.py:411]     "new-http-2-6eab-head-2l0pmb4r-compute": [
D 11-11 22:15:51 provisioner.py:411]       {
D 11-11 22:15:51 provisioner.py:411]         "instance_id": "new-http-2-6eab-head-2l0pmb4r-compute",
D 11-11 22:15:51 provisioner.py:411]         "internal_ip": "10.142.0.44",
D 11-11 22:15:51 provisioner.py:411]         "external_ip": "35.243.165.54",
D 11-11 22:15:51 provisioner.py:411]         "tags": {
D 11-11 22:15:51 provisioner.py:411]           "skypilot-user": "andyl",
D 11-11 22:15:51 provisioner.py:411]           "use-managed-instance-group": "0",
D 11-11 22:15:51 provisioner.py:411]           "ray-cluster-name": "new-http-2-6eab",
D 11-11 22:15:51 provisioner.py:411]           "skypilot-cluster-name": "new-http-2-6eab",
D 11-11 22:15:51 provisioner.py:411]           "ray-node-type": "head",
D 11-11 22:15:51 provisioner.py:411]           "skypilot-head-node": "1"
D 11-11 22:15:51 provisioner.py:411]         },
D 11-11 22:15:51 provisioner.py:411]         "ssh_port": 22
D 11-11 22:15:51 provisioner.py:411]       }
D 11-11 22:15:51 provisioner.py:411]     ]
D 11-11 22:15:51 provisioner.py:411]   },
D 11-11 22:15:51 provisioner.py:411]   "head_instance_id": "new-http-2-6eab-head-2l0pmb4r-compute",
D 11-11 22:15:51 provisioner.py:411]   "provider_name": "gcp",
D 11-11 22:15:51 provisioner.py:411]   "provider_config": {
D 11-11 22:15:51 provisioner.py:411]     "type": "external",
D 11-11 22:15:51 provisioner.py:411]     "module": "sky.provision.gcp",
D 11-11 22:15:51 provisioner.py:411]     "region": "us-east1",
D 11-11 22:15:51 provisioner.py:411]     "availability_zone": "us-east1-b",
D 11-11 22:15:51 provisioner.py:411]     "cache_stopped_nodes": true,
D 11-11 22:15:51 provisioner.py:411]     "project_id": "psychic-order-437203-r7",
D 11-11 22:15:51 provisioner.py:411]     "firewall_rule": "sky-ports-new-http-2-6eab",
D 11-11 22:15:51 provisioner.py:411]     "use_internal_ips": false,
D 11-11 22:15:51 provisioner.py:411]     "force_enable_external_ips": false,
D 11-11 22:15:51 provisioner.py:411]     "disable_launch_config_check": true,
D 11-11 22:15:51 provisioner.py:411]     "use_managed_instance_group": false
D 11-11 22:15:51 provisioner.py:411]   },
D 11-11 22:15:51 provisioner.py:411]   "docker_user": null,
D 11-11 22:15:51 provisioner.py:411]   "ssh_user": null,
D 11-11 22:15:51 provisioner.py:411]   "custom_ray_options": null
D 11-11 22:15:51 provisioner.py:411] }
D 11-11 22:15:51 provisioner.py:436] 
D 11-11 22:15:51 provisioner.py:436] Waiting for SSH to be available for 'new-http-2' ...
D 11-11 22:15:51 provisioner.py:326] Waiting for SSH using command: ssh -T -i '~/.ssh/sky-key' gcpuser@35.243.165.54 -p 22 -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o ConnectTimeout=10s -o UserKnownHostsFile=/dev/null -o IdentitiesOnly=yes -o AddKeysToAgent=yes -o ExitOnForwardFailure=yes -o ServerAliveInterval=5 -o ServerAliveCountMax=3 uptime
D 11-11 22:15:52 provisioner.py:380] Retrying in 1 second...
D 11-11 22:15:53 provisioner.py:439] SSH Connection ready for 'new-http-2'
I 11-11 22:15:53 provisioner.py:445] └── Instance is up.
I 11-11 22:15:53 common.py:292] 
I 11-11 22:15:53 common.py:292] --------------------Start: internal_file_mounts --------------------
D 11-11 22:15:53 instance_setup.py:502] Using 4 workers for file mounts.
sending incremental file list
./
07e7e894-7884-4083-9dda-d53e2ebe6dbd

         11,139 100%    0.00kB/s    0:00:00  
         11,139 100%    0.00kB/s    0:00:00 (xfr#1, to-chk=15/17)
268a14fd-6555-488f-a650-353965ea636a

          6,449 100%    6.15MB/s    0:00:00  
          6,449 100%    6.15MB/s    0:00:00 (xfr#2, to-chk=14/17)
767d47cc-670e-4f4b-9b40-aa64fb541687

              7 100%    6.84kB/s    0:00:00  
              7 100%    6.84kB/s    0:00:00 (xfr#3, to-chk=13/17)
90481bac-72ba-4712-b362-61f0b870ce5e

            400 100%  390.62kB/s    0:00:00  
            400 100%  390.62kB/s    0:00:00 (xfr#4, to-chk=12/17)
a17f4ef3-8c72-484f-aa58-50455a47f9ef

         12,288 100%   11.72MB/s    0:00:00  
         12,288 100%   11.72MB/s    0:00:00 (xfr#5, to-chk=11/17)
c201e5ed-4ee9-4fcb-b97a-e98f8e27495e

         12,288 100%   11.72MB/s    0:00:00  
         12,288 100%   11.72MB/s    0:00:00 (xfr#6, to-chk=10/17)
c583e893-6cba-4301-a36a-e31063021927

             27 100%   26.37kB/s    0:00:00  
             27 100%   26.37kB/s    0:00:00 (xfr#7, to-chk=9/17)
cfea925f-5169-467c-9dab-41dd90d56e05

              5 100%    4.88kB/s    0:00:00  
              5 100%    4.88kB/s    0:00:00 (xfr#8, to-chk=8/17)
5140c384-a352-4dc1-843d-1bf542eab747/
5140c384-a352-4dc1-843d-1bf542eab747/config_default

             75 100%   73.24kB/s    0:00:00  
             75 100%   73.24kB/s    0:00:00 (xfr#9, to-chk=4/17)
54cadb5f-2756-4c49-8d75-f54e278a90ea/
54cadb5f-2756-4c49-8d75-f54e278a90ea/skypilot-1.0.0.dev0-py3-none-any.whl

         32,768   3%   31.25MB/s    0:00:00  
      1,009,784 100%  120.38MB/s    0:00:00 (xfr#10, to-chk=3/17)
ac9cb0b3-fa7c-4eab-8083-e57608799455/
ac9cb0b3-fa7c-4eab-8083-e57608799455/zhifei.li@berkeley.edu/
ac9cb0b3-fa7c-4eab-8083-e57608799455/zhifei.li@berkeley.edu/.boto

            245 100%   29.91kB/s    0:00:00  
            245 100%   29.91kB/s    0:00:00 (xfr#11, to-chk=1/17)
ac9cb0b3-fa7c-4eab-8083-e57608799455/zhifei.li@berkeley.edu/adc.json

            317 100%   38.70kB/s    0:00:00  
            317 100%   38.70kB/s    0:00:00 (xfr#12, to-chk=0/17)

sent 1,006,850 bytes  received 275 bytes  2,014,250.00 bytes/sec
total size is 1,053,024  speedup is 1.05
I 11-11 22:15:54 common.py:296] --------------------End:   internal_file_mounts --------------------
I 11-11 22:15:54 common.py:296] 
I 11-11 22:15:54 common.py:292] 
I 11-11 22:15:54 common.py:292] --------------------Start: setup_runtime_on_cluster --------------------
D 11-11 22:15:54 metadata_utils.py:66] Need to run stage setup_runtime_on_cluster on instance new-http-2-6eab-head-2l0pmb4r-compute-0: True
Synchronizing state of unattended-upgrades.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install disable unattended-upgrades

Usage:
 kill [options] <pid> [...]

Options:
 <pid> [...]            send signal to every <pid> listed
 -<signal>, -s, --signal <signal>
                        specify the <signal> to be sent
 -q, --queue <value>    integer value to be sent with the signal
 -l, --list=[<signal>]  list all signal names, or convert one to a name
 -L, --table            list all signal names in a nice table

 -h, --help     display this help and exit
 -V, --version  output version information and exit

For more details see kill(1).
# >>> conda initialize >>>
PATH=/home/gcpuser/miniconda3/bin:/home/gcpuser/miniconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
Requirement already satisfied: setuptools<70 in ./skypilot-runtime/lib/python3.10/site-packages (65.5.0)
WARNING: Skipping skypilot as it is not installed.
Processing ./.sky/wheels/e5c76ef08caac752a65c61006aca332f/skypilot-1.0.0.dev0-py3-none-any.whl
Requirement already satisfied: jinja2>=3.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (3.1.4)
Requirement already satisfied: click>=7.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (8.1.7)
Requirement already satisfied: cryptography in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (43.0.3)
Requirement already satisfied: cachetools in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (5.5.0)
Requirement already satisfied: pendulum in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (3.0.0)
Requirement already satisfied: psutil in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (6.1.0)
Requirement already satisfied: PrettyTable>=2.0.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (3.12.0)
Requirement already satisfied: pandas>=1.3.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (2.2.3)
Requirement already satisfied: requests in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (2.32.3)
Requirement already satisfied: tabulate in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (0.9.0)
Requirement already satisfied: typing-extensions in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (4.12.2)
Requirement already satisfied: wheel in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (0.44.0)
Requirement already satisfied: networkx in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (3.4.2)
Requirement already satisfied: pyyaml!=5.4.*,>3.13 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (6.0.2)
Requirement already satisfied: jsonschema in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (4.23.0)
Requirement already satisfied: python-dotenv in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (1.0.1)
Requirement already satisfied: pulp in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (2.9.0)
Requirement already satisfied: colorama in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (0.4.6)
Requirement already satisfied: rich in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (13.9.3)
Requirement already satisfied: filelock>=3.6.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (3.16.1)
Requirement already satisfied: packaging in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (24.1)
Requirement already satisfied: pydantic!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<3 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (2.9.2)
Requirement already satisfied: protobuf!=3.19.5,>=3.15.3 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (5.28.3)
Requirement already satisfied: grpcio!=1.48.0,<=1.51.3,>=1.42.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (1.51.3)
Requirement already satisfied: google-cloud-storage in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (2.18.2)
Requirement already satisfied: google-api-python-client>=2.69.0 in ./skypilot-runtime/lib/python3.10/site-packages (from skypilot==1.0.0.dev0) (2.149.0)
Requirement already satisfied: google-auth-httplib2<1.0.0,>=0.2.0 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (0.2.0)
Requirement already satisfied: google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0.dev0,>=1.31.5 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (2.22.0)
Requirement already satisfied: uritemplate<5,>=3.0.1 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (4.1.1)
Requirement already satisfied: httplib2<1.dev0,>=0.19.0 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (0.22.0)
Requirement already satisfied: google-auth!=2.24.0,!=2.25.0,<3.0.0.dev0,>=1.32.0 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (2.35.0)
Requirement already satisfied: MarkupSafe>=2.0 in ./skypilot-runtime/lib/python3.10/site-packages (from jinja2>=3.0->skypilot==1.0.0.dev0) (3.0.2)
Requirement already satisfied: tzdata>=2022.7 in ./skypilot-runtime/lib/python3.10/site-packages (from pandas>=1.3.0->skypilot==1.0.0.dev0) (2024.2)
Requirement already satisfied: numpy>=1.22.4 in ./skypilot-runtime/lib/python3.10/site-packages (from pandas>=1.3.0->skypilot==1.0.0.dev0) (2.1.2)
Requirement already satisfied: pytz>=2020.1 in ./skypilot-runtime/lib/python3.10/site-packages (from pandas>=1.3.0->skypilot==1.0.0.dev0) (2024.2)
Requirement already satisfied: python-dateutil>=2.8.2 in ./skypilot-runtime/lib/python3.10/site-packages (from pandas>=1.3.0->skypilot==1.0.0.dev0) (2.9.0.post0)
Requirement already satisfied: wcwidth in ./skypilot-runtime/lib/python3.10/site-packages (from PrettyTable>=2.0.0->skypilot==1.0.0.dev0) (0.2.13)
Requirement already satisfied: annotated-types>=0.6.0 in ./skypilot-runtime/lib/python3.10/site-packages (from pydantic!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<3->skypilot==1.0.0.dev0) (0.7.0)
Requirement already satisfied: pydantic-core==2.23.4 in ./skypilot-runtime/lib/python3.10/site-packages (from pydantic!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.*,!=2.4.*,<3->skypilot==1.0.0.dev0) (2.23.4)
Requirement already satisfied: cffi>=1.12 in ./skypilot-runtime/lib/python3.10/site-packages (from cryptography->skypilot==1.0.0.dev0) (1.17.1)
Requirement already satisfied: google-crc32c<2.0dev,>=1.0 in ./skypilot-runtime/lib/python3.10/site-packages (from google-cloud-storage->skypilot==1.0.0.dev0) (1.6.0)
Requirement already satisfied: google-resumable-media>=2.7.2 in ./skypilot-runtime/lib/python3.10/site-packages (from google-cloud-storage->skypilot==1.0.0.dev0) (2.7.2)
Requirement already satisfied: google-cloud-core<3.0dev,>=2.3.0 in ./skypilot-runtime/lib/python3.10/site-packages (from google-cloud-storage->skypilot==1.0.0.dev0) (2.4.1)
Requirement already satisfied: idna<4,>=2.5 in ./skypilot-runtime/lib/python3.10/site-packages (from requests->skypilot==1.0.0.dev0) (3.10)
Requirement already satisfied: certifi>=2017.4.17 in ./skypilot-runtime/lib/python3.10/site-packages (from requests->skypilot==1.0.0.dev0) (2024.8.30)
Requirement already satisfied: urllib3<3,>=1.21.1 in ./skypilot-runtime/lib/python3.10/site-packages (from requests->skypilot==1.0.0.dev0) (2.2.3)
Requirement already satisfied: charset-normalizer<4,>=2 in ./skypilot-runtime/lib/python3.10/site-packages (from requests->skypilot==1.0.0.dev0) (3.4.0)
Requirement already satisfied: rpds-py>=0.7.1 in ./skypilot-runtime/lib/python3.10/site-packages (from jsonschema->skypilot==1.0.0.dev0) (0.20.0)
Requirement already satisfied: jsonschema-specifications>=2023.03.6 in ./skypilot-runtime/lib/python3.10/site-packages (from jsonschema->skypilot==1.0.0.dev0) (2024.10.1)
Requirement already satisfied: attrs>=22.2.0 in ./skypilot-runtime/lib/python3.10/site-packages (from jsonschema->skypilot==1.0.0.dev0) (24.2.0)
Requirement already satisfied: referencing>=0.28.4 in ./skypilot-runtime/lib/python3.10/site-packages (from jsonschema->skypilot==1.0.0.dev0) (0.35.1)
Requirement already satisfied: time-machine>=2.6.0 in ./skypilot-runtime/lib/python3.10/site-packages (from pendulum->skypilot==1.0.0.dev0) (2.16.0)
Requirement already satisfied: markdown-it-py>=2.2.0 in ./skypilot-runtime/lib/python3.10/site-packages (from rich->skypilot==1.0.0.dev0) (3.0.0)
Requirement already satisfied: pygments<3.0.0,>=2.13.0 in ./skypilot-runtime/lib/python3.10/site-packages (from rich->skypilot==1.0.0.dev0) (2.18.0)
Requirement already satisfied: pycparser in ./skypilot-runtime/lib/python3.10/site-packages (from cffi>=1.12->cryptography->skypilot==1.0.0.dev0) (2.22)
Requirement already satisfied: proto-plus<2.0.0dev,>=1.22.3 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0.dev0,>=1.31.5->google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (1.25.0)
Requirement already satisfied: googleapis-common-protos<2.0.dev0,>=1.56.2 in ./skypilot-runtime/lib/python3.10/site-packages (from google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0.dev0,>=1.31.5->google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (1.65.0)
Requirement already satisfied: pyasn1-modules>=0.2.1 in ./skypilot-runtime/lib/python3.10/site-packages (from google-auth!=2.24.0,!=2.25.0,<3.0.0.dev0,>=1.32.0->google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (0.4.1)
Requirement already satisfied: rsa<5,>=3.1.4 in ./skypilot-runtime/lib/python3.10/site-packages (from google-auth!=2.24.0,!=2.25.0,<3.0.0.dev0,>=1.32.0->google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (4.9)
Requirement already satisfied: pyparsing!=3.0.0,!=3.0.1,!=3.0.2,!=3.0.3,<4,>=2.4.2 in ./skypilot-runtime/lib/python3.10/site-packages (from httplib2<1.dev0,>=0.19.0->google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (3.2.0)
Requirement already satisfied: mdurl~=0.1 in ./skypilot-runtime/lib/python3.10/site-packages (from markdown-it-py>=2.2.0->rich->skypilot==1.0.0.dev0) (0.1.2)
Requirement already satisfied: six>=1.5 in ./skypilot-runtime/lib/python3.10/site-packages (from python-dateutil>=2.8.2->pandas>=1.3.0->skypilot==1.0.0.dev0) (1.16.0)
Requirement already satisfied: pyasn1<0.7.0,>=0.4.6 in ./skypilot-runtime/lib/python3.10/site-packages (from pyasn1-modules>=0.2.1->google-auth!=2.24.0,!=2.25.0,<3.0.0.dev0,>=1.32.0->google-api-python-client>=2.69.0->skypilot==1.0.0.dev0) (0.6.1)
Installing collected packages: skypilot
Successfully installed skypilot-1.0.0.dev0
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/_private/log_monitor.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/_private/log_monitor.py-v2.9.3.orig)
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/_private/worker.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/_private/worker.py-v2.9.3.orig)
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/dashboard/modules/job/cli.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/dashboard/modules/job/cli.py-v2.9.3.orig)
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/autoscaler.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/autoscaler.py-v2.9.3.orig)
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/command_runner.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/command_runner.py-v2.9.3.orig)
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/resource_demand_scheduler.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/resource_demand_scheduler.py-v2.9.3.orig)
patching file /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/updater.py (read from /home/gcpuser/skypilot-runtime/lib/python3.10/site-packages/ray/autoscaler/_private/updater.py-v2.9.3.orig)
DefaultTasksMax=infinity
I 11-11 22:16:03 common.py:296] --------------------End:   setup_runtime_on_cluster --------------------
I 11-11 22:16:03 common.py:296] 
D 11-11 22:16:03 provisioner.py:518] Starting Ray on the entire cluster.
I 11-11 22:16:03 common.py:292] 
I 11-11 22:16:03 common.py:292] --------------------Start: start_ray_on_head_node --------------------
I 11-11 22:16:03 instance_setup.py:308] Running command on head node: $([ -s ~/.sky/python_path ] && cat ~/.sky/python_path 2> /dev/null || which python3) $([ -s ~/.sky/ray_path ] && cat ~/.sky/ray_path 2> /dev/null || which ray) stop; unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY; RAY_SCHEDULER_EVENTS=0 RAY_DEDUP_LOGS=0 RAY_worker_maximum_startup_concurrency=$(( 3 * $(nproc --all) )) $([ -s ~/.sky/python_path ] && cat ~/.sky/python_path 2> /dev/null || which python3) $([ -s ~/.sky/ray_path ] && cat ~/.sky/ray_path 2> /dev/null || which ray) start --head --disable-usage-stats --port=6380 --dashboard-port=8266 --min-worker-port 11002 --object-manager-port=8076 --temp-dir=/tmp/ray_skypilot || exit 1;which prlimit && for id in $(pgrep -f raylet/raylet); do sudo prlimit --nofile=1048576:1048576 --pid=$id || true; done;$([ -s ~/.sky/python_path ] && cat ~/.sky/python_path 2> /dev/null || which python3) -c 'import json, os; json.dump({"ray_port":6380, "ray_dashboard_port":8266}, open(os.path.expanduser("~/.sky/ray_port.json"), "w", encoding="utf-8"))';while `RAY_ADDRESS=127.0.0.1:6380 $([ -s ~/.sky/python_path ] && cat ~/.sky/python_path 2> /dev/null || which python3) $([ -s ~/.sky/ray_path ] && cat ~/.sky/ray_path 2> /dev/null || which ray) status | grep -q "No cluster status."`; do sleep 0.5; echo "Waiting ray cluster to be initialized"; done;
2024-11-11 22:16:04,487 INFO scripts.py:1163 -- Did not find any active Ray processes.
2024-11-11 22:16:05,243 INFO usage_lib.py:423 -- Usage stats collection is disabled.
2024-11-11 22:16:05,244 INFO scripts.py:744 -- Local node IP: 10.142.0.44
2024-11-11 22:16:08,987 SUCC scripts.py:781 -- --------------------
2024-11-11 22:16:08,987 SUCC scripts.py:782 -- Ray runtime started.
2024-11-11 22:16:08,987 SUCC scripts.py:783 -- --------------------
2024-11-11 22:16:08,988 INFO scripts.py:785 -- Next steps
2024-11-11 22:16:08,988 INFO scripts.py:788 -- To add another node to this Ray cluster, run
2024-11-11 22:16:08,988 INFO scripts.py:791 --   ray start --address='10.142.0.44:6380'
2024-11-11 22:16:08,988 INFO scripts.py:800 -- To connect to this Ray cluster:
2024-11-11 22:16:08,988 INFO scripts.py:802 -- import ray
2024-11-11 22:16:08,988 INFO scripts.py:803 -- ray.init()
2024-11-11 22:16:08,988 INFO scripts.py:815 -- To submit a Ray job using the Ray Jobs CLI:
2024-11-11 22:16:08,988 INFO scripts.py:816 --   RAY_ADDRESS='http://127.0.0.1:8266' ray job submit --working-dir . -- python my_script.py
2024-11-11 22:16:08,988 INFO scripts.py:825 -- See https://docs.ray.io/en/latest/cluster/running-applications/job-submission/index.html 
2024-11-11 22:16:08,988 INFO scripts.py:829 -- for more information on submitting Ray jobs to the Ray cluster.
2024-11-11 22:16:08,988 INFO scripts.py:834 -- To terminate the Ray runtime, run
2024-11-11 22:16:08,988 INFO scripts.py:835 --   ray stop
2024-11-11 22:16:08,988 INFO scripts.py:838 -- To view the status of the cluster, use
2024-11-11 22:16:08,988 INFO scripts.py:839 --   ray status
2024-11-11 22:16:08,988 INFO scripts.py:843 -- To monitor and debug Ray, view the dashboard at 
2024-11-11 22:16:08,988 INFO scripts.py:844 --   127.0.0.1:8266
2024-11-11 22:16:08,988 INFO scripts.py:851 -- If connection to the dashboard fails, check your firewall settings and network configuration.
/usr/bin/prlimit
Waiting ray cluster to be initialized
Waiting ray cluster to be initialized
I 11-11 22:16:13 common.py:296] --------------------End:   start_ray_on_head_node --------------------
I 11-11 22:16:13 common.py:296] 
I 11-11 22:16:13 common.py:292] 
I 11-11 22:16:13 common.py:292] --------------------Start: start_skylet_on_head_node --------------------
I 11-11 22:16:13 instance_setup.py:434] Running command on head node: source ~/skypilot-runtime/bin/activate; $([ -s ~/.sky/python_path ] && cat ~/.sky/python_path 2> /dev/null || which python3) -m sky.skylet.attempt_skylet;
Skylet is not running. Starting (version 8)...
I 11-11 22:16:13 common.py:296] --------------------End:   start_skylet_on_head_node --------------------
I 11-11 22:16:13 common.py:296] 
I 11-11 22:16:02 execution.py:301] ⚙︎ Mounting files.
I 11-11 22:16:06 backend_utils.py:1222]   Syncing (to 1 node): gs://skypilot-workdir-andyl-cb79acd2 -> ~/sky_workdir
I 11-11 22:16:16 cloud_vm_ray_backend.py:3360] ⚙︎ Job submitted, ID: 1
I 11-11 22:16:16 cloud_vm_ray_backend.py:3396] 
I 11-11 22:16:16 cloud_vm_ray_backend.py:3396] Job ID: 1
I 11-11 22:16:16 cloud_vm_ray_backend.py:3396] 📋 Useful Commands
I 11-11 22:16:16 cloud_vm_ray_backend.py:3396] ├── To cancel the job:       sky cancel new-http-1 1
I 11-11 22:16:16 cloud_vm_ray_backend.py:3396] ├── To stream job logs:      sky logs new-http-1 1
I 11-11 22:16:16 cloud_vm_ray_backend.py:3396] └── To view job queue:       sky queue new-http-1
I 11-11 22:16:16 cloud_vm_ray_backend.py:3489] 
I 11-11 22:16:16 cloud_vm_ray_backend.py:3489] Cluster name: new-http-1
I 11-11 22:16:16 cloud_vm_ray_backend.py:3489] ├── To log into the head VM: ssh new-http-1
I 11-11 22:16:16 cloud_vm_ray_backend.py:3489] ├── To submit a job:     sky exec new-http-1 yaml_file
I 11-11 22:16:16 cloud_vm_ray_backend.py:3489] ├── To stop the cluster: sky stop new-http-1
I 11-11 22:16:16 cloud_vm_ray_backend.py:3489] └── To teardown the cluster: sky down new-http-1

I 11-11 22:16:16 replica_managers.py:104] Replica cluster new-http-1 launched.
Start streaming logs for task job of replica 1...
Job ID not provided. Streaming the logs of the latest job.
├── Waiting for task resources on 1 node.
└── Job started. Streaming logs... (Ctrl-C to exit log streaming; job will not be killed)
(new-http, pid=2602) serving at port 8080
(new-http, pid=2602) 34.72.251.229 - - [11/Nov/2024 22:16:20] "GET /health HTTP/1.1" 200 -
andylizf commented 1 week ago

@cblmemo PTAL, thanks!
