googlecolab / colabtools

Python libraries for Google Colaboratory
Apache License 2.0

Can I host my own collaboratory instance? #266

Open alexflint opened 5 years ago

alexflint commented 5 years ago

I'd like to host my own instance of Colaboratory as an on-prem service for my company. Is this supported or possible?

I understand that Colaboratory already provides the ability to connect to a locally running Python kernel. This is not really what I want. Instead, I want to provide a single pool of Python kernels that engineers at my company can jointly use. These kernels need to be located within our network because my use case requires access to some internal services that are not accessible from outside our VPN.

Is this possible?

jefflarkin commented 5 years ago

I'm looking for the same thing. I need to be able to connect to something other than a google-hosted runtime and a "local" runtime (ssh forwarding isn't an option). I really like colab's interface and ability to save to gdrive, but I need a way to self-host so that I can point to a different runtime. Seems like it should be possible, but I've not figured it out yet.

colaboratory-team commented 5 years ago

Right now, the only path to supporting a custom backend is to use local runtimes.

Please do leave upvotes and descriptions of your use-cases. These are helpful for planning future work. We'll keep this issue updated.

jefflarkin commented 5 years ago

@colaboratory-team Use case: I need to provide a runtime that includes licensed software (therefore, unable to be installed on the hosted runtime) to a group of students (who cannot SSH port-forward to the runtime machine). I can make this work with standard jupyter, but find the colab interface simpler to use and love the ability to save off the results to gdrive or github.

crypdick commented 5 years ago

@colaboratory-team I am working with a team of data scientists with confidential medical data. The data owner required us to use security-hardened machines and uploading any of our data to the cloud is out of the question.

colaboratory-team commented 5 years ago

Makes sense. Thanks for the replies.

For future travelers: if you notice that a previously described use-case matches your own, please upvote it. If you have a new one, please leave a reply. We do use GitHub among other signals to prioritize future work.

zachmoshe commented 5 years ago

Just to add to that: in my use case we must run the backend in-house because of data issues. Working locally isn't an option either. We're currently trying to run the backend on our servers and connect to it as a local backend (through a tunnel), but:

  1. It's not very convenient. Everyone needs to run his own tunnel to the server.
  2. Ideally I'd like to have 2 (or more) backends running on the server, show the user a list (like in the hosted mode) and let him choose between them.
  3. I'd like to set the backend hostname somehow, so it will automatically go to 'colab.mydomain.com' rather than 'localhost' (basically creating a "custom hosted mode").

Other than that, it's an awesome tool and a huge improvement over the regular iPython notebook server.

jefflarkin commented 5 years ago

@colaboratory-team It's been nearly a year since your last reply on this issue. Is there any update to this request?

georghildebrand commented 5 years ago

I would also be very interested in using Colab on-prem or on other cloud providers (e.g. AWS).

valvesss commented 4 years ago

Any update on this, guys?

bterwijn commented 4 years ago

You can already do this with a SSH port forwarding trick on Linux/Mac (tested with Ubuntu 18.04):

https://gist.github.com/molpopgen/3267efe08a0a4c23835249a955db37a2#file-remotejupyter-md

1) On the local machine, forward the port with: "ssh -N -f -L 8888:127.0.0.1:8888 username@your.server.org"
2) On your server, run "jupyter notebook --no-browser ... ... ... " (see the Colaboratory instructions for the additional "..." arguments)
3) In the browser on the local machine, open the URL that gets printed at the end of step 2
4) Now "Connect to local runtime" in Colaboratory connects to the server and not to the local machine
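As a minimal sketch of step 1 only, the tunnel command can be assembled in Python so that the user, host, and ports are easy to swap. The helper name `build_tunnel_command` is hypothetical and the defaults are just the placeholder values from the command above; the elided Jupyter arguments from step 2 are deliberately not reproduced here.

```python
import shlex


def build_tunnel_command(user, host, local_port=8888, remote_port=8888):
    """Return the argv list for the SSH port forward described in step 1.

    Hypothetical helper for illustration; it only builds the command.
    """
    forward = f"{local_port}:127.0.0.1:{remote_port}"
    # -N: run no remote command, -f: go to the background,
    # -L: forward a local port to the given address on the server side
    return ["ssh", "-N", "-f", "-L", forward, f"{user}@{host}"]


command = build_tunnel_command("username", "your.server.org")
print(shlex.join(command))
# To actually open the tunnel, something like
# subprocess.run(command, check=True) could be used (requires an ssh client).
```

Building the command as a list rather than a single string avoids shell quoting problems if it is later passed to `subprocess.run`.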

jefflarkin commented 4 years ago

@bterwijn That's not always a viable option. If you're trying to lead a class, for instance, you can't assume that all of the students will have the capability of setting up port forwarding from their local machine to your hosted instances. Most of the classes I lead the participants are restricted from installing software on their machine at all, which is why using something like colab via a web browser is really attractive.

bterwijn commented 4 years ago

@jefflarkin I understand, but if it is possible without ssh security then everyone can potentially remotely take over your server. The feature you propose can become a big security threat so I don't think they will provide it soon. On Linux the ssh client is pretty standard, that is all your students need if you are prepared to give them ssh access. If then problems arise it is your fault and not Colaboratory's.

I think the ssh trick can probably help out others (e.g. working on a small laptop but computing on a remote server with big GPU). Good luck.

ziatdinovmax commented 4 years ago

@jefflarkin Re: teaching using cloud computing, I recommend checking out the “education” option in GitPod: https://www.gitpod.io/education/

jefflarkin commented 4 years ago

> @jefflarkin I understand, but if it is possible without ssh security then everyone can potentially remotely take over your server. The feature you propose can become a big security threat so I don't think they will provide it soon.

@bterwijn That's a very narrow view of security. The server can certainly be secured without the need to use SSH tunneling. For one, it may be a private infrastructure with no public exposure. For another, firewalls exist. The SSH tunnel approach that Colaboratory currently provides is pretty much a hack and using tunnels I could hack it further to tunnel through to any server I want, but that's not a generic solution that I can expose to novice users. We currently use Jupyter and have its server well-enough secured for our needs, but Colaboratory has a lot of nice enhancements to standard Jupyter that we would love to provide to our users.

> On Linux the ssh client is pretty standard, that is all your students need if you are prepared to give them ssh access.

First, you're assuming they're running Linux. I already said that most of the people I deal with have no choice in what OS they run or what software is installed on their machine, so I cannot assume that they have an SSH client or that they are able to establish a tunnel to my server. Second, you're assuming that giving them access requires them to have SSH access to the server. The existence of colab.google.com demonstrates that this is not the case. The appeal of colaboratory is that you can assume that the user requires nothing more than a web browser. Google's hosted colab is an amazing tool, but when you're dealing with licensed or export-controlled software, it's not an option. We'd like to provide our users the same experience but in an environment where we can install the requisite software on the server and provide an environment that meets any additional security needs that the application or user may have.

> If then problems arise it is your fault and not Colaboratory's.

That much I agree with, which is why I want them to make it possible to host the server on our infrastructure and use their improvements to the UI/UX. The security is not a concern, it's a solvable problem (or else Google wouldn't so freely give access to their own colab servers). I take the responsibility of my own users' and infrastructure's security.

jefflarkin commented 4 years ago

> @jefflarkin Re: teaching using cloud computing, I recommend checking out the “education” option in GitPod: https://www.gitpod.io/education/

Thanks for the tip. I'll file that one away.

sp7412 commented 4 years ago

Is it possible to host a Colaboratory instance on an air-gapped Linux server?

EricCousineau-TRI commented 4 years ago

@ziatdinovmax I was curious and checked out the link you posted. It looks great (e.g. provisioning / forking containers with easy remote connections and UIs), but it doesn't seem to offer Jupyter Notebook integration out of the box.

Can I (maybe lazily) ask you to connect the dots for me?

EhsanKia commented 3 years ago

What are the current limitations/blockers for allowing the UI to connect to any remote hosted jupyter server by URL, rather than just one on localhost? Is it a security/https issue? Is it a websocket limitation?

amarantolaw commented 3 years ago

upvote

My company requires all analytics to be done on-premise. It would be great if Colab could be deployed as a container on-premise.

cperry-goog commented 2 years ago

One possible answer to this is our new Marketplace launch: https://console.cloud.google.com/marketplace/product/colab-marketplace-image-public/colab

You're able to configure a VM in GCP Marketplace and connect a Colab instance to it.

xihajun commented 2 years ago

> One possible answer to this is our new Marketplace launch: https://console.cloud.google.com/marketplace/product/colab-marketplace-image-public/colab
>
> You're able to configure a VM in GCP Marketplace and connect a Colab instance to it.

Does that work on GCP only?

lucasew commented 2 years ago

To put everyone in a virtual network and share a Jupyter instance, you could use a mesh network like Tailscale or ZeroTier.

xihajun commented 2 years ago

> To put all the people in a virtual network to share a jupyter instance you all could use a mesh network like tailscale or zerotier

Thank you 😊 I will check them out

magnusbarata commented 1 year ago

VS Code has a similar feature, which makes me wonder why Google Colab doesn't already have it. Jupyter notebook support in VS Code is quite basic, and I prefer the rich interface provided by Colab. Moreover, Colab has the advantage of easier sharing and setup.

Really looking forward to having this feature implemented. Also, IMO it is the host's responsibility to keep the Jupyter server secure.

anaganisk commented 1 year ago

Doesn't https://github.com/jupyterhub/jupyterhub solve the use case of many commentators here?

lucasew commented 1 year ago

> Doesn't https://github.com/jupyterhub/jupyterhub solve the use case of many commentators here?

Yes, but in this case it's like comparing Google Docs with Word or LibreOffice.

Jupyter alone doesn't sync the notebooks between people.

anaganisk commented 1 year ago

Isn't git or any other VCS supposed to solve the problem of notebook sync between people? @lucasew

Dannynis commented 1 year ago

> Isn't git or any other VCS supposed to solve the problem of notebook sync between people? @lucasew

Unfortunately, humanity still hasn't managed to create a VCS that is suitable for notebooks...

A standalone version of Colab could solve so much. Much anticipated.

anaganisk commented 1 year ago

@Dannynis I see your point. Just in case this helps: https://nextjournal.com Enterprise is available for self-hosting.

With a quick Google search I also found some other tools that improve the diffing situation with Jupyter. It's not completely un-doable.
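For illustration only (this is not necessarily what the tools mentioned above do), one common way to make notebook diffs smaller is to clear code-cell outputs before committing. The sketch below assumes a notebook already loaded as an nbformat 4 style dict; `strip_outputs` and the `example` notebook are hypothetical names made up for this example.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs cleared.

    Hypothetical helper; assumes the nbformat 4 layout where each code
    cell carries "outputs" and "execution_count" keys.
    """
    cleaned = json.loads(json.dumps(notebook))  # deep copy via JSON round-trip
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Tiny made-up notebook used only to exercise the helper.
example = {
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": 3,
            "source": ["print('hi')"],
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
            ],
        },
    ],
}

stripped = strip_outputs(example)
print(json.dumps(stripped, indent=1, sort_keys=True))
```

With outputs and execution counts removed, a diff between two commits shows only changes to the cell sources and metadata.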

fcnjd commented 5 months ago

@colaboratory-team My use case: I am blind and therefore use a screen reader with a Braille display. I'm currently studying computer science, and one of the subjects is Machine Learning. Regarding accessibility, the Google Colab frontend is much better than JupyterLab: all buttons are labelled, notifications are broadcast when cells are saved, and shortcuts like shift+enter work as expected. However, the wifi at our university is not always stable. Therefore, I have to decide whether I want to work on a good frontend or have the notebook reliably executed and saved. Self-hosting would make this much easier, since I could then be independent of our wifi. I'm pretty sure that I'm not the only blind student in this situation, so I would be happy to see this in the future.

voycey commented 1 month ago

I would just like to throw some weight behind this as a long-time user of Colab. It would be awesome to be able to import the Google Colab tools into JupyterLab for the times when I can't use Colab and Runners to do what I need. One of my current clients has a very restricted environment and would love to have a developer experience similar to what Colab provides.

This repo, https://github.com/googlecolab/colabtools, seems to have the majority of the tools. Is there some documentation on how we can import these into JupyterLab to get them working? I appreciate that the repo explicitly calls out that it's not meant for private use.