Closed patrickmetzner closed 1 year ago
It's possible that this is related to another issue regarding re-connecting to Custom VMs that have been stopped and started. There are similar observations there regarding the status of k_default.
Thanks - we're looking into this. Duplicate of #3714
In the meantime you can try running our Docker container in a VM of your choosing (option 1 here): https://research.google.com/colaboratory/local-runtimes.html
@cperry-goog thank you for the update. Appreciate the team looking into these issues! Any guidance for recovering data from machines that are experiencing these issues?
Also, I tried to follow the directions you pointed to here but was unsuccessful in connecting Colab to my problem VM. I also repeated the instructions with a fresh Colab VM that I confirmed to work using the "Connect to a custom GCE VM" option, but couldn't get the "Connect to a local runtime" to work with that one either.
A possible typo in the instructions surrounding local VMs. The command in the "Connecting to a runtime on another machine" section uses port 8888 on the remote, while I think the Docker image connects to port 8080. Unfortunately, I tried switching the suggested ports around, but still couldn't get things to work.
Connecting via Docker should be done with the "Connect to a local runtime" flow. What error do you encounter when you try that?
In a terminal window I've tried running:
Step 1.
(local) gcloud compute ssh --zone "us-west1-b" "colab-5-vm" --project "my-project" -- -L 8888:localhost:8888
Step 2.
(remote) docker run -p 127.0.0.1:9000:8888 us-docker.pkg.dev/colab-images/public/runtime
which generates the following link: http://127.0.0.1:9000/?token=some0token0here
Step 3.
(local, Chrome browser) On Colab, in the "backend URL" field, I type: http://localhost:8888/?token=some0token0here
Alternatively, I've replaced localhost:8888 with all combinations of {127.0.0.1, localhost}:{8888,9000}.
In the terminal I see "channel 4: open failed: connect failed: Connection refused", and in Colab I see "Unable to connect to the runtime". Apologies if I'm misunderstanding how to organize the ports, and thank you for your help!
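One plausible reading of the "Connection refused" message is that the SSH tunnel targets a VM port on which nothing is listening. The sketch below compares the two port specs from Steps 1 and 2; `check_ports` is a hypothetical helper written for illustration, not part of gcloud or Docker, and it only inspects the strings rather than probing the network.

```shell
#!/usr/bin/env bash
# Sketch: check that an ssh -L forward and a docker -p publish line up.
#   forward spec: LOCAL_PORT:HOST:REMOTE_PORT        (ssh -L)
#   publish spec: [BIND_IP:]HOST_PORT:CONTAINER_PORT (docker run -p)
# check_ports is a hypothetical helper, not a gcloud or Docker command.

check_ports() {
  local forward="$1" publish="$2"
  local remote_port="${forward##*:}"   # last field of the -L spec
  local rest="${publish%:*}"           # drop the container port
  local host_port="${rest##*:}"        # VM port that docker publishes on
  if [ "$remote_port" = "$host_port" ]; then
    echo "ok: tunnel reaches VM port $host_port"
  else
    echo "mismatch: tunnel targets VM port $remote_port, docker listens on $host_port"
  fi
}

# The combination from Steps 1 and 2 above
check_ports "8888:localhost:8888" "127.0.0.1:9000:8888"
# A combination where both sides use port 9000 on the VM
check_ports "9000:localhost:9000" "127.0.0.1:9000:8080"
```

With the specs from Steps 1 and 2, the tunnel forwards to VM port 8888 while the container is published on VM port 9000, which would be consistent with the refused connection.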
A possible typo in the instructions surrounding local VMs. [...] gcloud compute ssh --zone "us-west1-b" "colab-5-vm" --project "my-project" -- -L 8888:localhost:8888
Port 8888 is given as an example in the doc ("For example, to forward port 8888 on your local machine to port 8888 on your Google Compute Engine instance").
When using the docker command (corrected from what you used, below), the port on your VM will be 9000, so use that. Using it also as the local listening port (i.e., ending up with -L 9000:localhost:9000) should allow you to use the URL obtained in your Step 2 in Step 3. We could make this all a bit clearer, perhaps by using port 9000 on the VM (and on the local end when port forwarding) for both options (Docker, Jupyter).
(remote) docker run -p 127.0.0.1:9000:8888 us-docker.pkg.dev/colab-images/public/runtime which generates the following link, http://127.0.0.1:9000/?token=some0token0here.
The doc tells you to use container port 8080, not 8888:
docker run -p 127.0.0.1:9000:8080 us-docker.pkg.dev/colab-images/public/runtime
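Putting the advice in this reply together, the three steps might look like the following. This is a minimal sketch that only prints the commands instead of running them. The zone, VM name, project, and token are the placeholder values used in this thread, and the helper function names are hypothetical.

```shell
#!/usr/bin/env bash
# Placeholder values copied from the thread; replace with your own.
ZONE="us-west1-b"
VM="colab-5-vm"
PROJECT="my-project"
VM_PORT=9000          # port docker publishes on the VM, reused as the local port
CONTAINER_PORT=8080   # container port, per the reply above
IMAGE="us-docker.pkg.dev/colab-images/public/runtime"

# Step 1 (local): forward local port 9000 to port 9000 on the VM.
tunnel_cmd() {
  echo "gcloud compute ssh --zone \"$ZONE\" \"$VM\" --project \"$PROJECT\" -- -L $VM_PORT:localhost:$VM_PORT"
}

# Step 2 (remote): publish container port 8080 on VM port 9000.
docker_cmd() {
  echo "docker run -p 127.0.0.1:$VM_PORT:$CONTAINER_PORT $IMAGE"
}

# Step 3 (local browser): backend URL; $1 is the token printed by the container.
backend_url() {
  echo "http://127.0.0.1:$VM_PORT/?token=$1"
}

tunnel_cmd
docker_cmd
backend_url "some0token0here"
```

Because the same port number is used on both ends of the tunnel, the link printed by the container in Step 2 can be pasted unchanged into Colab's "backend URL" field.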
Problem connecting Colab notebook to Google Compute Engine VM instance using the "Connect to a custom GCE VM" option
Current behavior: After creating a VM instance with GCE, I am unable to connect a Colab Notebook to it. Looking at https://console.cloud.google.com/compute/instances it seems like the VM starts without problems, and I can also access it via SSH, but when I try to connect a Colab Notebook to it I receive the following message instantly:
Unable to connect to the runtime.
I have been experiencing this problem since May 26th, 2023.
Note: I noticed this same problem on November 4th, 2022, and after 2-3 days everything went back to normal without any action on my part.
Expected behavior: I have been using the same machine image to create VM instances for over 6 months, and I have always been able to connect Colab Notebooks to them by using the "Connect to a custom GCE VM" option. This is the expected behavior.
Browser being used: Google Chrome
Additional context: Running docker ps via SSH I get the following output:
Running sudo systemctl status k_default via SSH I get the following output:
If I run docker rm k_default -f followed by sudo systemctl start k_default, the container starts to run and I can successfully run nvidia-smi inside it, but that is still not enough to enable the connection with the Colab Notebooks.
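For anyone wanting to repeat the restart sequence described above, it can be sketched as a small script. The `run` wrapper and the `DRY_RUN` switch are hypothetical additions for illustration. With the default `DRY_RUN=1`, the script only prints each command. Using `docker exec` to run nvidia-smi inside the container, and passing `--no-pager` to systemctl, are assumptions about how the checks would be scripted.

```shell
#!/usr/bin/env bash
# Sketch of the restart sequence from the issue description.
# DRY_RUN=1 (the default here) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

restart_k_default() {
  run docker rm k_default -f                        # remove the stuck container
  run sudo systemctl start k_default                # let systemd recreate it
  run sudo systemctl status k_default --no-pager    # confirm the unit state
  run docker ps                                     # confirm the container is listed
  run docker exec k_default nvidia-smi              # assumption: GPU check inside the container
}

restart_k_default
```

Run it on the VM over SSH with `DRY_RUN=0` to execute the commands. As noted above, a successful restart did not restore the Colab connection in this case.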