googlecolab / colabtools

Python libraries for Google Colaboratory
Apache License 2.0

google.colab.auth is unsupported in this environment for custom GCE VM runtime #2533

Open wcwong opened 2 years ago

wcwong commented 2 years ago

After deploying a custom GCE VM runtime as per the instructions at https://research.google.com/colaboratory/marketplace.html and connecting, when trying to use the following code

from google.colab import auth
auth.authenticate_user()

I get the following error

NotImplementedError                       Traceback (most recent call last)
<ipython-input-1-1f759c1655bd> in <module>()
      1 from google.colab import auth
----> 2 auth.authenticate_user()

/usr/local/lib/python3.7/dist-packages/google/colab/auth.py in authenticate_user(clear_output)
    144   """
    145   if _os.path.exists('/var/colab/mp'):
--> 146     raise NotImplementedError(__name__ + ' is unsupported in this environment.')
    147   if _check_adc():
    148     return

NotImplementedError: google.colab.auth is unsupported in this environment.

My expectation was that the GCE VM deployed from the marketplace would have the same software environment as the standard runtime but also give me the ability to specify the compute/memory/GPU resources that are available to my GCP project. As such, I was not expecting to need to make code changes to the notebook for it to work on the marketplace GCE VM.
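
For a notebook that has to run on both runtimes in the meantime, one option is to catch the `NotImplementedError` from the traceback above instead of letting it abort the cell. This is only a minimal sketch, not part of `google.colab`: the helper name `try_authenticate` is hypothetical, and it takes the auth function as an argument so it can be exercised outside Colab.

```python
def try_authenticate(authenticate):
    """Call `authenticate` and report whether it succeeded.

    Returns False instead of raising when the runtime rejects the call with
    NotImplementedError, as the custom GCE VM runtime does in the traceback.
    """
    try:
        authenticate()
    except NotImplementedError as exc:
        # The notebook can fall back to another storage option here.
        print(f"Skipping Colab auth: {exc}")
        return False
    return True
```

In a notebook this would be called as `try_authenticate(auth.authenticate_user)` after `from google.colab import auth`; the import itself succeeds in the traceback, only the call fails.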

cperry-goog commented 2 years ago

This is on our radar, apologies for the friction. We don't support auth.authenticate_user() today for a few reasons; we're tracking a fix at b/207007587.

metehanpinarli commented 2 years ago

How do I connect to Colab with a private GCE server?

flyosity commented 2 years ago

@cperry-goog Any updates on this? I launched an A100 instance with the Google Colab VM specifically to run the notebook I was using on my paid Colab Pro+ account on beefier hardware, but I can't connect to Drive, so it's useless.

cmtg commented 2 years ago

Is there a workaround? It would be really handy if results from a Colab Notebook could be saved to Drive.

blois commented 2 years ago

Keep in mind that a custom GCE VM will be accessible to all users who have access to VMs within that project. Because of this you need to be careful about putting credentials on the VM: they will be accessible to everyone with access to that VM.

Because the Colab service cannot guarantee the VM is only accessible to a single user we are not allowed to provide credentials to it.

An alternative is to use something such as https://github.com/astrada/google-drive-ocamlfuse

wcwong commented 2 years ago

@blois - I'm a little surprised that anyone who has access to the project has access to notebook data by default. My assumption was that the environment ran in its own container with each different user connecting being given their own containers and their own container local storage. Isn't that how it works in the hosted environment?

I guess that's mostly the crux of my confusion. If there are sufficient environmental protections in place for the hosted environment, why isn't a project security boundary considered equivalent? How is this different than any other project level security boundaries in GCP?

Specifically, doesn't the workaround you describe also put the credentials on the VM? And with FUSE can't anyone in the project, by default, ssh to the VM, then sudo su to the user and have access to the FUSE drive? So this doesn't materially change the security posture?

alexandrnikitin commented 2 years ago

Because the Colab service cannot guarantee the VM is only accessible to a single user we are not allowed to provide credentials to it.

It's not necessarily a Gmail login. It can be a service account (the one the VM has access to). Why not support it for better UX?

thanawatn commented 2 years ago

This is on our radar, apologies for the friction. We don't support auth.authenticate_user() today for a few reasons; we're tracking a fix at b/207007587.

You could add a policy that requires users to accept an agreement, and then allow google.colab.auth for those who use a custom GCE VM runtime. It would also be nice if it were available in Google Cloud Functions.

EV3RETH commented 2 years ago

Has this been resolved? Colab Pro+ only ever gives me P100s, so I upgraded to an A100 with GCE VMs, but now I can't access my Google Drive files.

reinisindans commented 2 years ago

I was using the ocamlfuse solution to access my Drive, but that has just stopped working too. I have to look for an alternative solution, again. I hope this issue gets addressed; using Drive for data storage was quite convenient for smaller personal and research projects.

jmilagroso commented 2 years ago

How do I connect to Colab with a private GCE server?

https://research.google.com/colaboratory/marketplace.html

RayH1975 commented 2 years ago

@cperry-goog Any updates on this? I launched an A100 instance with the Google Colab VM specifically to run the notebook I was using on my paid Colab Pro+ account on beefier hardware, but I can't connect to Drive, so it's useless.

itsuzef commented 2 years ago

Any updates on this? Trying to connect a custom GCE VM, but it is an unsupported environment

blois commented 2 years ago

https://github.com/googlecolab/colabtools/issues/2533#issuecomment-1018080844 is still the current status.

nicoleitte commented 2 years ago

If we're connecting to the custom GCE VM through a locally-hosted runtime (via port-forwarding), there's no way to install ocamlfuse, since terminal functionality is disabled.

djgish485 commented 2 years ago

What's the point of using Colab if we can't use beefier hardware? Any recommendations for alternative services?

xmalina-aibuild commented 2 years ago

It's September. This still hasn't been resolved? Very disappointed... We just upgraded for the same reasons and got caught by this bug.

seb-tc commented 2 years ago

Hey everyone, I'm just as confused and annoyed at the lack of Google Drive integration with GCE. I hope we find a fix soon.

alexandrnikitin commented 2 years ago

Keep in mind that a custom GCE VM will be accessible to all users who have access to VMs within that project. Because of this you need to be careful about putting credentials on the VM: they will be accessible to everyone with access to that VM.

Because the Colab service cannot guarantee the VM is only accessible to a single user we are not allowed to provide credentials to it.

I mentioned it already in the thread and will do it again. If the only concern is that Google Drive creds/tokens will be accessible to everyone who has access to that VM then we can use a service account.

  1. The VM gets a service account associated with it
  2. One goes to Google Drive and explicitly shares "Colab Notebooks" and any other folder with the service account
  3. google.colab.auth() knows to authenticate and access Drive using the service account

DasDominus commented 2 years ago

It would be nice if there were a bypass/opt-out. Not everyone cares about data privacy that much. Our lab, for example, has all of our data in a shared space (within our lab, of course). Essentially, anyone who has access to the VM and the Google account should also have access to the data/Drive.

I mean, google-drive-ocamlfuse works, but I'd expect this to work out of the box.

At least show a prompt when trying to mount, etc.

tomasyany commented 2 years ago

Any update on this issue?

doshik commented 1 year ago

Scrolled through this thread hoping for a solution and was met with disappointment..

ruidi-huang commented 1 year ago

Disappointment in 2023...

alexandrnikitin commented 1 year ago

@cperry-goog @blois Any updates on the issue? What is the status of b/207007587?

corngk commented 1 year ago

After a year and two months of waiting, any update on this issue?

iamjakob commented 1 year ago

Any updates? Paid for a custom GCE VM and immediately regretted it.

caioflexa commented 1 year ago

Any updates?

mauricio-repetto commented 1 year ago

I came here because I'm facing the same issue... unbelievable that there are no updates on this yet.

kurshakuz commented 1 year ago

@cperry-goog are there any updates on that issue?

pjspol commented 1 year ago

Same issue. Hoping for an update!

chriscast88 commented 1 year ago

Hi all! I wanted to share the solution that has been working for me since it seems that this has been an ongoing issue for a lot of people.

I've been using google-drive-ocamlfuse to mount my gDrive on a custom GCE VM. The process is a bit involved and not the most elegant, but it works.

First you'll need to create a new project and OAuth credentials via the API Console. The key here is that we'll need to set it up for Headless Usage since Google Colab doesn't have a web browser.

Follow the Headless Usage steps in ocamlfuse's documentation to set this up; this should give you API access to your Drive, with a client ID and secret key.

Once you have your client ID and secret key set up, you can install ocamlfuse with the following commands

!sudo add-apt-repository ppa:alessandro-strada/ppa
!sudo apt-get update
!sudo apt-get install google-drive-ocamlfuse

and then you should be able to authorize access to your Drive with this

!google-drive-ocamlfuse -headless -label me -id ##yourClientID##.apps.googleusercontent.com -secret ###yoursecret##### 

which should then show you something similar to this

   Please, open the following URL in a web browser: https://accounts.google.com/o/oauth2/auth?client_id=##yourClientID##.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive&response_type=code&access_type=offline&approval_prompt=force

Opening that URL takes you to a credentials page, where you can copy the verification code and paste it at this prompt

   Please enter the verification code: 

And that's basically it! You should be able to mount your drive with the code below

 !mkdir -p /content/drive/MyDrive
 !google-drive-ocamlfuse /content/drive/MyDrive
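
Since these commands put the client ID and secret directly into the notebook, here is a minimal sketch of building the same headless command from environment variables instead. The variable names `GDFUSE_CLIENT_ID` and `GDFUSE_CLIENT_SECRET` and the helper name are hypothetical, chosen only for this example; the flags are the ones used in the command above. This only keeps the secret out of the saved notebook: the credentials still end up on the VM, so the access warning earlier in the thread still applies.

```python
import os
import shlex


def headless_auth_command(env=None):
    """Build the google-drive-ocamlfuse headless auth command line.

    GDFUSE_CLIENT_ID and GDFUSE_CLIENT_SECRET are hypothetical variable
    names for this sketch. A missing variable raises KeyError.
    """
    env = os.environ if env is None else env
    args = [
        "google-drive-ocamlfuse",
        "-headless",
        "-id", env["GDFUSE_CLIENT_ID"],
        "-secret", env["GDFUSE_CLIENT_SECRET"],
    ]
    # Quote each argument so the string is safe to hand to a shell.
    return " ".join(shlex.quote(arg) for arg in args)
```

In a notebook cell this could be run as `cmd = headless_auth_command()` followed by `!{cmd}`, after setting the two variables (for example with `getpass`).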

The only catch is that this gets cumbersome when I have a single notebook that I want to run on either a hosted runtime or a GCE VM. So I wrote the code below, which determines whether it's running on a GCE VM, installs ocamlfuse if needed, and mounts the drive either the old-fashioned way or with ocamlfuse. I pretty much have this code block in all of my notebooks now. Hope this helps! Just make sure to replace the client ID and secret key placeholders with your own.

# Mount Google Drive
import re
import shutil

version = !cat /proc/version

if re.search("gce", version[0]):
    print("Session is connected to a custom GCE VM, running ocamlfuse")
    # google-drive-ocamlfuse is an apt package, so look for its binary on PATH
    # (pip freeze never lists it, so that check would always reinstall)
    if shutil.which('google-drive-ocamlfuse'):
        print("ocamlfuse is already installed, mounting...")
    else:
        # If not installed, install it from the PPA (-y skips the prompts)
        print("ocamlfuse is not installed, installing...")
        !sudo add-apt-repository -y ppa:alessandro-strada/ppa
        !sudo apt-get update
        !sudo apt-get install -y google-drive-ocamlfuse
    # Is anything already mounted? Let's jiggle the handle
    # Caution: if the umount fails while Drive is still mounted, the rm -rf
    # on the mount point below would delete files in Drive itself
    !umount /content/drive/MyDrive
    !rm -rf ~/.gdfuse/default
    !rm -rf /content/drive/MyDrive
    !mkdir -p /content/drive/MyDrive

    # Authorize (headless), then mount with ocamlfuse
    !google-drive-ocamlfuse -headless -id REPLACE_CLIENT_ID_HERE.apps.googleusercontent.com -secret REPLACE_SECRET_KEY_HERE
    !google-drive-ocamlfuse /content/drive/MyDrive

else:
    print("Session is connected to a hosted runtime, running Google Auth")
    from google.colab import drive
    drive.mount('/content/drive')
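
The detection step in the script above can also be pulled out into a plain-Python helper, which is easier to reuse and test. This is a sketch under two assumptions taken from this thread rather than from any documented contract: that `/proc/version` contains "gce" on the custom VM, and that the marker path `/var/colab/mp` checked in the original traceback exists there. The helper name `is_custom_gce_vm` is hypothetical.

```python
import os


def is_custom_gce_vm(version_text=None, marker_path="/var/colab/mp"):
    """Guess whether the notebook runs on a custom GCE VM runtime.

    Both signals are heuristics observed in this thread, not a stable API.
    """
    # Marker path that google.colab.auth checks in the traceback.
    if os.path.exists(marker_path):
        return True
    if version_text is None:
        try:
            with open("/proc/version") as version_file:
                version_text = version_file.read()
        except OSError:
            version_text = ""
    # Same substring test as the re.search("gce", ...) call in the script.
    return "gce" in version_text
```

With it, the top-level branch of the script becomes `if is_custom_gce_vm(): ... else: ...`.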

Great-Bucket commented 1 year ago

Hi chriscast88,

Thanks for posting this. After following your guide, I ran this code:

!mkdir -p /content/drive/MyDrive
!google-drive-ocamlfuse /content/drive/MyDrive

But got this error:

/usr/bin/xdg-open: 869: www-browser: not found
/usr/bin/xdg-open: 869: links2: not found
/usr/bin/xdg-open: 869: elinks: not found
/usr/bin/xdg-open: 869: links: not found
/usr/bin/xdg-open: 869: lynx: not found
/usr/bin/xdg-open: 869: w3m: not found
xdg-open: no method available for opening 'https://accounts.google.com/o/oauth2/auth?client_id=XXXXXXXXXXXX.apps.googleusercontent.com&redirect_uri=httpsXXXXXXFgd-ocaml-auth.appspot.com%2Foauth2callback&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive&response_type=code&access_type=offline&approval_prompt=force&state=XXXXXXXXXXXXXXXXXXXXX'
/bin/sh: 1: firefox: not found
/bin/sh: 1: google-chrome: not found
/bin/sh: 1: chromium-browser: not found
/bin/sh: 1: open: not found
Cannot retrieve auth tokens.
Failure("Error opening URL:https://accounts.google.com/o/oauth2/auth?client_id=XXXXXXXXXXXX.apps.googleusercontent.com&redirect_uri=httpsXXXXXXXXFgd-ocaml-auth.appspot.com%2Foauth2callback&scope=httpsXXXXXXXXXXXwww.googleapis.com%2Fauth%2Fdrive&response_type=code&access_type=offline&approval_prompt=force&state=9XXXXXXXXXXXXXXXXXXX"

If anyone has any advice, I'd appreciate it.

Thanks!

divyanshsinghvi commented 1 year ago

!google-drive-ocamlfuse -headless -id REPLACE_CLIENT_ID_HERE.apps.googleusercontent.com -secret REPLACE_SECRET_KEY_HERE
!google-drive-ocamlfuse /content/drive/MyDrive

Try this?

SishaarRao commented 1 year ago

For my use case, using a Google Storage Bucket as the backing datastore was an equivalent option to Google Drive. It's very straightforward to connect to a bucket with the following code (utilizing gcsfuse)

### MOUNT GOOGLE STORAGE BUCKET
from google.colab import auth
auth.authenticate_user()

!echo "deb https://packages.cloud.google.com/apt gcsfuse-bionic main" > /etc/apt/sources.list.d/gcsfuse.list
!curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
!apt -qq update
!apt -qq install gcsfuse

!mkdir -p mounted-bucket
!gcsfuse --implicit-dirs audio-model-data mounted-bucket
BASE_PATH = "/content/mounted-bucket"

drkousek commented 11 months ago

Still no fix? Thanks @chriscast88 for the solution; it worked flawlessly, even for shared drives, with a little bit of tweaking.

deanp70 commented 8 months ago

Posting with permission from @cperry-goog - we're collaborating with the Colab team to provide DagsHub Storage as an alternative to GDrive that is more scalable and built for use with large datasets. It's an S3-compatible bucket that has much simpler access controls, and can be mounted easily.

It might help avoid the issues above - here's a link to an example notebook to try it out

We're looking for community feedback, so I'd love to get your input if it helps with the issue at hand.

(If you're curious, DagsHub is a platform for ML teams which is why we think Colab should have a storage solution suitable for ML workloads)

nonlin commented 5 months ago

For my use case, using a Google Storage Bucket as the backing datastore was an equivalent option to Google Drive. It's very straightforward to connect to a bucket with the following code (utilizing gcsfuse)

### MOUNT GOOGLE STORAGE BUCKET
from google.colab import auth
auth.authenticate_user()

!echo "deb https://packages.cloud.google.com/apt gcsfuse-bionic main" > /etc/apt/sources.list.d/gcsfuse.list
!curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
!apt -qq update
!apt -qq install gcsfuse

!mkdir -p mounted-bucket
!gcsfuse --implicit-dirs audio-model-data mounted-bucket
BASE_PATH = "/content/mounted-bucket"

Umm, doesn't this run into the same issue, namely "google.colab is unsupported in this environment"?

nishchay-veer commented 2 months ago

How can I change my google colab compute engine?

luckandrew commented 1 month ago

auth.authenticate_user()

Still a problem after 2 years...This took time and $

Jahetthana commented 4 weeks ago

https://research.google.com/colaboratory/faq.html#disallowed-activities

pranavhh commented 5 days ago

@cperry-goog, Any updates on this?