bitcoinops / taproot-workshop

Taproot & Schnorr Python Library & Documentation.
MIT License

Google Colab as alternative to local Jupyter Notebook #145

Closed by 0xB10C 4 years ago

0xB10C commented 4 years ago

Compiling the Optech Taproot V0.1.4 release and setting up a local Jupyter Notebook can be a big hurdle when trying out the Taproot Workshop on a local system. To lower this hurdle, a cloud-hosted Jupyter Notebook that provides a clean environment and a one-click setup would be ideal.

Google Colab offers such a service for free (free as in "you have to log in with your Google account"). It provides root access to per-notebook VMs. The setup (get the custom bitcoind, set up SOURCE_DIRECTORY, ...) can be included in a setup Jupyter Notebook cell running shell commands.

The custom bitcoind binary could be provided in the GitHub release of the Optech Taproot branch and downloaded by the setup cell. The same goes for bitcoin/test/config.ini, which is required to run the TestFramework.

I've built a quick and dirty proof-of-concept for the 0.1-test-notebook.ipynb with the help of @jachiang: colab-PoC-0-1-test-notebook.ipynb (source without Google login is in this gist).
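A minimal sketch of what such a setup cell might do, written in Python rather than shell. The download URL is a hypothetical placeholder (no URL is given in this thread), `install_bitcoind` is an illustrative name, and placing the binary at `<source dir>/src/bitcoind` is an assumption based on the usual Bitcoin Core build layout:

```python
import os
import stat
import urllib.request
from pathlib import Path

# Hypothetical placeholder: the thread does not say where the binary is hosted.
BITCOIND_URL = "https://example.invalid/bitcoind-taproot"


def install_bitcoind(source_dir, url=BITCOIND_URL, fetch=urllib.request.urlretrieve):
    """Download a prebuilt bitcoind into <source_dir>/src and make it executable.

    `source_dir` is the directory that SOURCE_DIRECTORY would point at.
    `fetch` takes (url, filename) and can be swapped out for offline testing.
    """
    src = Path(source_dir) / "src"
    src.mkdir(parents=True, exist_ok=True)
    target = src / "bitcoind"
    fetch(url, str(target))
    # Add the executable bits for user, group and others.
    mode = target.stat().st_mode
    target.chmod(mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return target
```

In a notebook this would be a single call at the top, for example `install_bitcoind("/content/bitcoin")`, where the path is again only an example.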

Next step would be checking if the other notebooks work similarly well.

jachiang commented 4 years ago

Concept ACK! Big thank you @0xB10C for setting this up, I am pretty excited about this.

A couple notes from playing around with your PoC briefly:

Suggestions as previously mentioned:

What do you think @jnewbery @bitschmidty?

0xB10C commented 4 years ago

The setup is surprisingly fast at around 90 seconds. So it's tolerable even if notebooks do not share VM environments and need to set up bitcoind + dependencies individually.

I think setup time could even be reduced to a few seconds by shipping the release with a statically built bitcoind.

elichai commented 4 years ago

Concept ACK. Does everyone share the same VM? Can people harm other people's environment? Do they share the same .bitcoin datadir?

0xB10C commented 4 years ago

Does everyone share the same VM? Can people harm other people's environment? Do they share the same .bitcoin datadir?

No. Google spins up / gives you one VM per notebook per Google account AFAIK for a max of 12h. It just lets you share your Jupyter Notebook (the *.ipynb file), not the actual VM.

jnewbery commented 4 years ago

Concept ACK. I haven't looked at colab yet, but if everything you're saying is true then this could be a really low-friction way to get people started on the notebooks. I'll play around with it this week.

bitschmidty commented 4 years ago

I like the idea of decreasing friction on the notebooks.

It seems like Binder and JupyterHub are part of the Jupyter ecosystem and may(?) be alternatives.

Is anyone familiar with either of these?

0xB10C commented 4 years ago

Binder seems interesting as well. A user would have all notebooks in one place and the environment could be built from a config file. I found their FAQ quite useful for a quick overview.

However, both Colab and Binder sessions end after 12h and storage is ephemeral. I think that could be one of the main problems going forward... Users would lose all their progress if they don't save it manually.

bitschmidty commented 4 years ago

Users would lose all their progress if they don't save it manually.

Are there any sort of options for saving?

0xB10C commented 4 years ago

Don't see a good option for saving in Binder right now besides manually downloading when finished and then re-uploading. From their FAQ:

Binder is meant for interactive and ephemeral interactive coding [...].

Colab offers three different options for saving:

I've tested save to Gist and save to Google Drive. Both work well, even with Ctrl+S. The next time you open Colab, it shows you your most recent notebook. Saving (committing & pushing) to GitHub is also possible, but I didn't find it very user friendly.

Additionally, there is the possibility to mount your personal Google Drive via Python in a notebook cell, but then you need to open up weird OAuth links and copy and paste access tokens.

from google.colab import drive
drive.mount('/content/gdrive')
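If the same notebook should also run locally, the mount could be guarded so it only happens inside Colab. This is just a sketch: `mount_drive_if_colab` is an illustrative helper name, and it assumes the `google.colab` package is importable only in a Colab runtime:

```python
def mount_drive_if_colab(mount_point="/content/gdrive"):
    """Mount Google Drive when running in Colab; do nothing elsewhere.

    Returns the mount point when Drive was mounted, otherwise None.
    """
    try:
        # Assumption: this import only succeeds inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        return None
    drive.mount(mount_point)
    return mount_point
```

On a local Jupyter installation the call simply returns `None`, so the cell can stay in the notebook without breaking it.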

Going with Colab and a reminder for users to save the progress in either a Gist or on Google Drive would be one of the better solutions.

bitschmidty commented 4 years ago

Thanks for investigating @0xB10C, this is really a neat idea.

I was just testing this myself: I reopened your colab-PoC-0-1-test-notebook.ipynb in Colab after previously closing it without explicitly saving, and it appears to have saved my state at least to some degree.

ACK on Colab seeming to have the least friction at this point.

0xB10C commented 4 years ago

@jachiang (and of course others), can you think of anything else that should be tested with Colab?

0xB10C commented 4 years ago

@jnewbery did you have time to play around with Colab?

Two open questions where I think a decision from Optech is needed:

  1. Who builds and publishes the statically linked bitcoind-taproot releases needed for Colab?
  2. Should the current notebooks be modified for Colab support (i.e. adding the setup cell at the top)? Or should the Colab notebooks be in a separate branch/fork, which would need to be maintained/rebased? Other ideas?

jnewbery commented 4 years ago

@0xB10C thanks for prodding me, and sorry for not getting to this sooner. I've played around with it and it's great. We should definitely offer this as an option. To answer your questions:

  1. I'm open to suggestions. This is something we presumably only need to do once for every version (i.e. once now and then once again when we update our bitcoind branch to use sipa's new branch with 32 byte pubkeys, etc). I think it's fine for us to publish these built binaries in the release since they'll only be run in Google's Colab and not on people's own machines.
  2. Can the Colab setup code be extracted into a separate util file and then called from one line in the notebook? If we did that it'd be very easy to maintain a branch that had a single commit adding that line to the top of every notebook. As we update the master branch we can just rebase that single commit on master and there should be very few merge issues.

0xB10C commented 4 years ago

  1. Sounds good! I wrote down some notes when creating the PoC, but feel free to ignore them if there are better ways.

    • A statically linked binary will speed up the setup in Colab, since we don't need to install dependencies. I followed achow101's instructions from this bitcointalk post for a similar project.
    • Reducing the size of the binary by stripping and compressing is probably worth it.
    • The test framework needs the <bitcoin dir>/test/config.ini file with the environment variables and components. I guess it can be created/written to by the colab setup script.
  2. Approach ACK. Probably the easiest to maintain, while not incorporating anything into master that doesn't really need to be there. I'd like to contribute the Colab setup (shell) script.
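As a rough illustration of the config.ini note above, a setup script could generate the file with Python's `configparser`. The section names follow the "environment variables and components" wording; the individual keys (`SRCDIR`, `BUILDDIR`, `EXEEXT`, `ENABLE_BITCOIND`, `ENABLE_WALLET`) are assumptions modelled on Bitcoin Core's `test/config.ini.in` and should be checked against the branch actually used. `write_test_config` is an illustrative name:

```python
import configparser
from pathlib import Path


def write_test_config(bitcoin_dir):
    """Write <bitcoin_dir>/test/config.ini and return its path."""
    bitcoin_dir = Path(bitcoin_dir)
    # interpolation=None so a '%' in a path is written literally.
    config = configparser.ConfigParser(interpolation=None)
    config.optionxform = str  # keep key names upper-case
    # Assumed keys: verify against the test framework in use.
    config["environment"] = {
        "SRCDIR": str(bitcoin_dir),
        "BUILDDIR": str(bitcoin_dir),
        "EXEEXT": "",
    }
    config["components"] = {
        "ENABLE_BITCOIND": "true",
        "ENABLE_WALLET": "true",
    }
    path = bitcoin_dir / "test" / "config.ini"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf8") as f:
        config.write(f)
    return path
```

Generating the file at setup time avoids shipping a config.ini with hard-coded paths, since the VM's directory layout is only known once the notebook runs.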

jnewbery commented 4 years ago

@0xB10C thanks. If you're able to build the statically-linked binary, I'm happy to add it to one of the optech releases so it can be fetched by colab.

jnewbery commented 4 years ago

Code is merged into the Colab branch. Thanks @0xB10C !

I think we need to update the README.md to add instructions on how to run this in Colab and then we can close this issue.

0xB10C commented 4 years ago

For the record: the conclusion was to rebase the Colab branch onto master as needed, for example after bigger changes or additions.