conda-incubator / condacolab

Install Conda and friends on Google Colab, easily
MIT License

Feature: add option to store and reload installation from Google Drive #8

Closed: alexmalins closed this 3 years ago

alexmalins commented 3 years ago

Description

This PR adds functionality allowing Colab conda installations to be saved to tarballs on Google Drive. Everything can then be reloaded from Google Drive. This streamlines the process of setting up the environment with your necessary packages on Colab.
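To make the idea concrete, here is a minimal sketch of the save/reload round trip using only the standard library. It is not the code from this PR: the function names `save_prefix` and `restore_prefix` are hypothetical, and the paths in the demo are temporary stand-ins for the conda prefix and a mounted Google Drive folder.

```python
import tarfile
import tempfile
from pathlib import Path


def save_prefix(prefix: Path, archive: Path) -> Path:
    """Pack the directory `prefix` into a gzip-compressed tarball (hypothetical helper)."""
    archive.parent.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "w:gz") as tar:
        # arcname="." stores member paths relative to the prefix root.
        tar.add(prefix, arcname=".")
    return archive


def restore_prefix(archive: Path, prefix: Path) -> Path:
    """Unpack a tarball written by `save_prefix` into `prefix` (hypothetical helper)."""
    prefix.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(prefix)
    return prefix


if __name__ == "__main__":
    # Temporary directories stand in for the real prefix and the Drive mount.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        fake_prefix = root / "prefix"
        (fake_prefix / "bin").mkdir(parents=True)
        (fake_prefix / "bin" / "tool").write_text("hello")

        archive = save_prefix(fake_prefix, root / "drive" / "env.tar.gz")
        restored = restore_prefix(archive, root / "restored")
        print((restored / "bin" / "tool").read_text())
```

In a real session the archive should be restored to the same path it was saved from, since conda packages commonly embed the installation prefix in their files.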

I uploaded an example notebook demonstrating the new feature here.

Inspiration for this PR came from this thread on the development list for the PyNE project.

jaimergp commented 3 years ago

Hey Alex, thanks for your PRs (both of them!). Just letting you know that I did see them and I am super excited about the new features. I am unfortunately too busy these weeks, but I should have some more time in the second half of the month!

alexmalins commented 3 years ago

Understood, thanks for the update!

jaimergp commented 3 years ago

> Is there a reason that I'm missing that makes restarting the kernel necessary?

Yes, PREFIX/lib is not part of LD_LIBRARY_PATH, so we restart the kernel to force the interpreter process to load the libraries from that location.
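As a rough illustration of that check, the sketch below tests whether `PREFIX/lib` appears in `LD_LIBRARY_PATH`. The helper name `lib_dir_on_loader_path` is hypothetical and `/usr/local` is only an assumed prefix. As far as I understand, the dynamic loader reads this variable when a process starts, so editing `os.environ` inside an already-running kernel does not change where that process looks for shared libraries.

```python
import os
from typing import Mapping, Optional


def lib_dir_on_loader_path(prefix: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if <prefix>/lib is listed in LD_LIBRARY_PATH (hypothetical helper)."""
    env = os.environ if env is None else env
    lib_dir = os.path.normpath(os.path.join(prefix, "lib"))
    # LD_LIBRARY_PATH is a colon-separated list of directories on Linux.
    entries = env.get("LD_LIBRARY_PATH", "").split(":")
    return any(entry and os.path.normpath(entry) == lib_dir for entry in entries)


if __name__ == "__main__":
    # "/usr/local" is an assumed prefix for this illustration.
    print(lib_dir_on_loader_path("/usr/local"))
```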


With respect to this PR, I am inclined to reject it, unfortunately. There are better ways to cache an environment reproducibly while avoiding some pitfalls (binary relocation, updates, etc.) and saving some space. I can help set up the documentation for that, but the idea is to use constructor to build a Miniconda-like installer and store it somewhere†. The installer must provide Python 3.7, but other than that it's pretty straightforward.

† Right now the installer function assumes online locations, but we could change it so it takes a path in Colab itself, if that helps.
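One way the footnote's change could look is a small dispatch step that decides whether the installer location is a URL to download or a file already on the Colab filesystem. This is only a sketch of the idea, not existing condacolab code, and `classify_installer_source` is a hypothetical name.

```python
from urllib.parse import urlparse


def classify_installer_source(source: str) -> str:
    """Return "url" for http(s) locations and "path" for local files (hypothetical helper)."""
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return "url"
    if scheme in ("", "file"):
        # No scheme (or file://) is treated as a path on the local filesystem.
        return "path"
    raise ValueError(f"Unsupported installer location: {source!r}")


if __name__ == "__main__":
    # Both locations below are made-up examples.
    print(classify_installer_source("https://example.com/installer.sh"))
    print(classify_installer_source("/content/drive/MyDrive/installer.sh"))
```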

alexmalins commented 3 years ago

Thanks. The main goal of this PR was enabling quicker set up of the environment, so people can get coding as fast as possible on loading Colab.

It would be good instead to have a PR with documentation of the constructor solution. Or maybe an example folder in the repo containing a construct.yaml file (e.g. for openmm) and simple instructions for how to use constructor to generate a custom installer, how to host it (e.g. on GitHub or Drive), and how to then use it with CondaColab. This would be helpful for users who are not yet aware of this solution. (I wasn't! I tested it and it works great for PyNE, especially when hosting the installer via a URL, which saves the time spent authorizing access to Google Drive.)

alexmalins commented 3 years ago

E.g. this is the minimal construct.yaml I used for PyNE: https://gist.github.com/alexmalins/0df7ceeafb3c456b5dd9b78a2d701d71
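For readers who have not seen one, the sketch below writes out the general shape of such a file. It does not reproduce the gist: the installer name, version, channel, and package list are placeholders, and `render_construct_yaml` is a hypothetical helper. The keys `name`, `version`, `channels`, and `specs` are standard constructor settings.

```python
from typing import Sequence


def render_construct_yaml(
    name: str, version: str, channels: Sequence[str], specs: Sequence[str]
) -> str:
    """Render a minimal construct.yaml as text (hypothetical helper)."""
    # The version is quoted so YAML keeps values like "0.1" as strings.
    lines = [f"name: {name}", f'version: "{version}"', "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("specs:")
    lines += [f"  - {spec}" for spec in specs]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values: Python is pinned to 3.7 as discussed in this thread.
    print(
        render_construct_yaml(
            name="my-colab-env",
            version="0.1",
            channels=["conda-forge"],
            specs=["python=3.7", "openmm"],
        )
    )
```

Saving the output as construct.yaml in an otherwise empty folder and running `constructor .` there should produce the installer. Once it is hosted somewhere, passing its URL to `condacolab.install_from_url(...)` in the notebook would be the next step.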

jaimergp commented 3 years ago

Yes, we can provide a template with the constraints Colab imposes and document it in the README. Shall we start a new PR for that?

alexmalins commented 3 years ago

Yep I think that makes sense 👍