RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu

Convert pydrake tutorials from Binder to Deepnote #13683

Closed. RussTedrake closed this issue 2 years ago.

RussTedrake commented 4 years ago

Edit: Original issue title was "Convert pydrake tutorials from Binder to Colab", but we've since decided that Deepnote is better.

The original choice of Binder was because it allowed us to provision the machines with drake via a Docker instance, where Colab does not. But the drake docker instance is large enough that the Binder setup takes just as long as setting up Colab in the first cell of a notebook. We also thought there were a number of things that Colab would not support, such as interactive visualization, but I've worked around many of those (#12645). Colab has also proven to be a much more stable/reliable resource.

But most of all, Colab's connection to Google Drive changes the game. Now that it works well, I find myself routinely starting my work in a Colab notebook, running it from any device, and (as the name suggests) very easily sharing it with collaborators.

This requires some modifications to the preamble of our notebooks. Here is an example: https://colab.research.google.com/drive/1rhqV8WMo6pNyzOV3TjAth4R9hSlZx64t

@EricCousineau-TRI , @hongkai-dai , or others, any thoughts? (especially on the preamble)?

RussTedrake commented 4 years ago

Note: I am ok supporting both Binder and Colab, but think Colab should be the first priority (sometimes one visualization approach might be a little better in colab vs binder; let's favor colab always). I am also ok if we simply remove Binder from the READMEs and do not advertise/actively support it.

EricCousineau-TRI commented 4 years ago

I'm not a huge fan of embedding the preamble in each tutorial, but yeah, I can understand that our usage of public Binder is suboptimal compared to what Google Colab can offer (I think for free, by default?), that the cost/benefit of Colab's setup time beats what Binder actually gives us, and most definitely that the connection with Google Drive is uber pervasive.

As an alternative to the preamble, is it possible for us to easily specify a Kernel image for Colab to use?

If we were to stick with the preamble, it would be nice to somehow indicate, in each of the tutorials perhaps, when the runtime needs to be upgraded. Also, mayhaps there's a way to more easily express / robustify the current logic? e.g. we can go from:

%%capture
try:
  import pydrake
except:
  !curl my_url > my_script.py
  from my_script import my_setup
  my_setup(...)

to instead be something like:

%%capture
from urllib.request import urlopen
with urlopen("https://raw.githubusercontent.com/RussTedrake/underactuated/master/scripts/setup/jupyter_setup.py") as f:
    exec(f.read())
setup_drake_if_needed()

In this case, the setup_drake_if_needed could be an entirely encapsulated function (to avoid polluting globals). It could look like this:

def setup_drake_if_needed(min_date=None, max_date=None):
  try:
    import pydrake
    # Ensure we have the proper version of Drake based on `min_date, max_date`. Fail fast if it's too old.
    return
  except ImportError as e:
    # Ensure it's only `pydrake` that's the problem, not a dependency.
    ...
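
To make the elided part concrete, a minimal sketch of that filtering (with _install_drake as a purely hypothetical placeholder for whatever actually downloads and unpacks Drake) might be:

def setup_drake_if_needed(min_date=None, max_date=None):
    try:
        import pydrake  # noqa: F401
        # TODO: compare the installed build date against min_date/max_date
        # and fail fast with a clear message if it is too old.
        return
    except ImportError as e:
        if e.name != "pydrake":
            # A dependency of pydrake is broken; reinstalling Drake would not
            # help, so surface the original error.
            raise
        _install_drake()  # hypothetical helper: fetch and unpack the binaries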

Thoughts?

RussTedrake commented 4 years ago

Yes to the "for free" question.

In my mind, an ideal goal for the preamble should be

try:
  import pydrake
except ImportError:
  !apt install drake

(although I think we'll still need to set the PYTHONPATH)
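
For illustration, assuming the package lands under /opt/drake the way the binary tarballs do, that extra step would be roughly:

import sys

# Point Python at the installed bindings; the exact prefix and python minor
# version depend on how the package is laid out.
sys.path.append(f"/opt/drake/lib/python3.{sys.version_info.minor}/site-packages")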

I don't think we can provision colab, but I'd love to be wrong.
I've also been looking for ways that we can hide/minimize/autorun the first cell, or things like that.

EricCousineau-TRI commented 4 years ago

Yes to the "for free" question.

:+1:

In my mind, an ideal goal for the preamble should be [....]

My only issue with the try: import X; except ... is that it exposes the symbol a bit too early... I would rather it be a fast check_and_setup() so that people are not encouraged to inject add'l module imports there, and instead separate their imports from that stanza. But it's a small nit, so I'm fine with it as-is for now.
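
For concreteness, a rough sketch of what I mean (check_and_setup is a made-up name, and importlib is used so nothing gets bound in the notebook's globals):

%%capture
from importlib.util import find_spec

def check_and_setup():
    # Probe for pydrake without importing it into the caller's namespace.
    if find_spec("pydrake") is None:
        ...  # download and run the setup script here
    # The notebook then does its own `import pydrake` in a later cell.

check_and_setup()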

I don't think we can provision colab, but I'd love to be wrong. I've been also looking for ways that we can hide/minimize/autorun the first cell, or things like that.

I'm wondering if there's a public forum where we could post your experience of the pros/cons of Binder and Colab w.r.t. provisioning, visualization, etc. We can thank 'em for their awesome work, let 'em be aware of possible issues w.r.t. our workflow, see if anyone has comments, etc.

https://discourse.jupyter.org/c/binder/12 may be a good place for a Binder summary, if you or your TA's haven't already posted there or elsewhere?

Also, possibly posting our questions to a Colab-oriented forum? e.g. https://stackoverflow.com/questions/tagged/google-colaboratory

If I sat down on Friday and stepped through each of the pain points you mentioned, would you be able to review my post(s)?

peteflorence commented 4 years ago

As asked by @RussTedrake, here are my thoughts --

I do think using Colab is a great fit for all of these use-cases: (i) first-time users of Drake in general, (ii) students in particular, and (iii) even for expert users, quickly-shareable projects or easy-to-start one-off projects.

My personal view is that the current 3D visualization solution Russ already has working is awesome. (You just need to have a separate window with the visualizer in it, but it all streams from the Colab server and works. The only thing that doesn't currently work is embedding that inside a Colab cell, but I actually like it better separate.)

I also think the model that is getting developed here (everything you need in a Colab, including 3D visualization) would be useful/inspiration for a bunch of other projects out there as well.

RussTedrake commented 4 years ago

@EricCousineau-TRI -- possibly relevant: https://github.com/RussTedrake/underactuated/issues/247

If @jamiesnape can land #1183, and we think it's "fail fast" enough, then the entire preamble could be just

!pip install drake

jwnimmer-tri commented 4 years ago

Is the reason you suggest pip here because it's just a one-liner, instead of

!add-apt-repository -y ppa:foo/bar
!apt-get update
!apt-get install python3-pydrake

?

The apt work is a short hop away from being complete. The pip work has a meaningful risk of extending past the summer, even if we think it's going to be easy.

RussTedrake commented 4 years ago

Yes. But it’s even more than that... https://github.com/RussTedrake/underactuated/blob/master/scripts/setup/jupyter_setup.py#L6

We would either need to handle mac/ubuntu, or otherwise make sure that someone on mac doesn't get a failure when they run locally. And we would need the lines to sys.path.append. Altogether, it goes from something that feels welcoming and consumable to something quite a bit heavier.
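
To give a sense of the weight, a rough sketch of that heavier preamble (not the actual jupyter_setup.py; the apt package name and install prefix are placeholders):

import platform
import subprocess
import sys
from importlib.util import find_spec

def setup_drake():
    if find_spec("pydrake") is not None:
        return  # already installed locally (mac or ubuntu); nothing to do
    if platform.system() != "Linux":
        raise RuntimeError("Please install drake locally; see https://drake.mit.edu.")
    # Placeholder package name; adjust for however drake ends up being packaged.
    subprocess.run(["apt-get", "install", "-y", "drake"], check=True)
    sys.path.append(
        f"/opt/drake/lib/python3.{sys.version_info.minor}/site-packages")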

EricCousineau-TRI commented 4 years ago

!pip install drake

On this count, I think that's great for Colab hosted instances, etc. But for me, running locally, I'd hate to have this installed in my ~/.local user site-packages just b/c of the dependency hell that would arise. apt install python3-drake sounds much better, since it would more purposefully (I think) encapsulate itself.

On the count of handling apt vs. brew dispatch, I think we'd still have to put some platform-specific logic in there...


Just as a separate note, I'm trying to see if it's at all possible to run a Google Colab-like UI locally to simplify editing local notebooks - e.g. for a PR where we want to rely on Google Colab metadata, like collapsing input cells: https://stackoverflow.com/a/50842328/7829525

RussTedrake commented 4 years ago

The decision during review of #13697 (and afterwards on Slack) was that we should move drake/tutorials out of drake and into a new repo, e.g. drake-tutorials. The motivation is that these tutorials should be runnable/sharable by anybody on colab, without assuming that they are living in the drake source tree, and should have a durable reference to an associated drake binary release. Here are a few key ideas from that thread:

RussTedrake commented 4 years ago

Breadcrumb: We also need to test drake/tools/install/colab/setup_drake_colab.py in pre-merge CI, as discussed in https://reviewable.io/reviews/RobotLocomotion/drake/13697#-MCR4XhbEwraQHsbipoU

jamiesnape commented 3 years ago

Can I suggest we push this forward (particularly the repo split)? It is going to have to be in tandem with #1183 to an extent, but the split repo part can happen beforehand without that blocking.

jamiesnape commented 3 years ago

Also, we could then decide whether supporting Mac for the tutorials is especially worth the effort.

RussTedrake commented 3 years ago

I think we could definitely go "colab only" (no mac) for these. It would presumably not be hard for someone to run them on mac, but we could skip any overhead in guaranteeing that with CI.

Should I start a repo? drake-tutorials?

EricCousineau-TRI commented 3 years ago

The only question here is: will the Colab notebooks be put under CI for quickly checking for issues (e.g. deprecation)? My read from htmlbook is that yes, it'll have CI: https://github.com/RussTedrake/htmlbook/blob/26cb5cd9184f6cba1ed42b750ad81ed6497a6769/tools/jupyter/defs.bzl#L36-L44

Just wanna confirm!

jamiesnape commented 3 years ago

They will be under more CI than Drake, even. That is one of the main reasons for splitting them out.

jamiesnape commented 3 years ago

https://github.com/RobotLocomotion/drake-tutorials. I guess we'll get some infrastructure in place first.

RussTedrake commented 3 years ago

Update: I am taking deepnote (a colab alternative) for a spin in class this fall. Based on my initial experience, I think the workflow there would be perfect for our drake tutorials. (They would be versioned off our focal dockerfile, and we can schedule tasks to run as an effective CI directly in deepnote).

I will return to this after I better understand the possible pain points.

jwnimmer-tri commented 3 years ago

@RussTedrake once you have a readout, please let us know what you think about Deepnote. We could work on re-locating the tutorials before the end of the year.

RussTedrake commented 3 years ago

The feedback from users has been overwhelmingly positive. I think we should definitely re-locate them.

From a developer/maintenance perspective, I currently peg the docker SHA and manually copy up the notebooks if/when they change. (The deepnote folks are working on an API, and I've made this notebook-updating use case clear to them.) The main repo containing the notebooks runs a GitHub Actions workflow against the latest dockerfile on CI, so that I know if/when anything needs updating.

RussTedrake commented 2 years ago

I've got a project set up now: https://deepnote.com/project/Tutorials-K0_FCa7yQX2kDWBx3-2RmQ/%2Fdynamical_systems.ipynb

My proposal is that we, the drake maintainers, update this repo manually. Binder always pulled from the robotlocomotion/drake:latest docker instance. But with deepnote, I can imagine people copying the project to start their own work... I'd even encourage it. I think it is better and more robust for us to use a specific named nightly docker instance.

To update the tutorials, a drake developer would follow the (trivial) step in: https://deepnote.com/project/Tutorials-K0_FCa7yQX2kDWBx3-2RmQ/%2F.for_maintainers.ipynb

There is a chance we could have the interaction more seamless in the future: https://community.deepnote.com/c/ask-anything/possible-to-launch-a-notebook-directly-from-github

I am going to open a few PRs to touch up the tutorials to make things run more smoothly on deepnote/colab, and will update the tutorials/README.md and public links to tutorials once we are happy. Comments welcome.

RussTedrake commented 2 years ago

After #16279, I still need to:

jwnimmer-tri commented 2 years ago

When I open https://deepnote.com/project/Tutorials-K0_FCa7yQX2kDWBx3-2RmQ/%2Fdynamical_systems.ipynb to try out the tutorial, and then I click "Run notebook", the page asks me to create an account and sign in with either Google or GitHub.

Which authentication method should we be telling Drake Developers to use? Does it matter? That should be part of our developer documentation.

Similarly, I suppose new users interested in learning about Drake for the first time will immediately hit this roadblock as well. Will making an account be an impediment to their use? I had not realized / recalled that we would be putting an auth pop-up in the middle of our first-time user UX now. Should our introductory materials atop the tutorials explain that this step will be required?

My proposal is that we, the drake maintainers, update this repo manually.... To update the tutorials, a drake developer would follow the (trivial) step in ...

There are two topics in play here:

(1) Which revision of pydrake will be used? It seems like choice-of-pydrake is currently coming in via the Docker image base we choose, e.g., focal-20220106. Is there a benefit to using the Drake Docker specifically, versus using a vanilla Docker base and then doing !pip install pydrake==0.36.0 atop the notebook? By placing the pydrake version into a notebook cell, the pydrake version would be explicitly matched with the notebook, making for easier forking -- all of the information is in one place.
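
For illustration, that preamble cell could be as small as the following (the version number is arbitrary, and whether the wheel ends up published as drake or pydrake is part of the #1183 work):

try:
    import pydrake
except ImportError:
    !pip install drake==0.36.0  # re-pin at each stable release
    import pydrake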

I see now that https://docs.deepnote.com/environment/custom-environments has some advice on this matter. Are you trying to avoid the "hardware restart will nuke pydrake" failure mode? Does that interrupt users often? If that's the case, then using a fixed Environment makes sense, and we can stick with Drake Docker. In the future it might make sense to switch back to a deepnote Docker with a simple requirements.txt (pydrake==0.36.0) in the Environment, instead of our full Drake Docker image (which is quite bloated), but that is not urgent.

In any case, I think that manually re-pinning the notebooks to refer to a newer pydrake will be a workable approach, no matter which mechanism we use to capture the pin.

Soon (i.e., concurrently with switching to Deepnote as our preference), we should update the https://drake.mit.edu/release_playbook.html to ensure that the tutorials' pin is never older than the most recent stable release.

(2) How the notebook's source code is copied from drake.git into Deepnote. Currently, the tutorials/README instructs the PR author to manually copy the file themselves. This manual step will not scale, and it's important to me that we make this step more robust, prior to (or concurrently with) switching to Deepnote on our website. We might need to script this (via some kind of Deepnote uploading API), but there are other factors to consider.

Above, we'd said:

The decision during review of #13697 (and afterwords on slack) was that we should move drake/tutorials out of drake and into a new repo, e.g. drake-tutorials.

We should discuss that idea in concert with this part (2) updating process.

Another tactic in play would be to ship the tutorials in the pip wheel (and in Docker), so that they are versioned along with the releases. I'm not sure how easy it is to execute them from a wheel, but it would at least get the contents archived somewhere that is easy to refer to.

hongkai-dai commented 2 years ago

Echoing Jeremy's question, currently tutorials/README says

Once your PR has landed, use this maintainers notebook to update Deepnote. Drake developers should request "Edit" access through the Deepnote interface if they do not have it. Note: If your updates depend on changes to Drake outside of the tutorials directory, then you will have to wait for the updated nightly binaries to update Deepnote.

What happens when I work on the dev branch with a tutorial? Should I manually copy my notebook to that maintainer's notebook, and see how it renders?

RussTedrake commented 2 years ago

@jwnimmer-tri -- I asked the platform reviewers very specifically about putting OAuth in front of first time users. The feeling at the time was that it was not ideal, but ok. I use Google, but I don't think it matters (personal preference). Note that this is true of using Colab as well.

(1) I've been using the tagged docker release method for class, and it has worked well. Pros for tagged docker releases:

Pros for e.g. !pip install drake==0.36.0 as the first cell:

(2) Here is the deepnote upload API: https://deepnote.com/@deepnote/Execute-notebooks-using-the-Deepnote-API-1KI0vDWFRY-9uqp8wSrE_A . It's still in beta, but we can request access for the drake project. As I said above, that may be sufficient if we choose the pip route. If we keep the tagged docker, then it's two steps: a script to upload and a manual update of the docker instance.

One reason to favor manually copying the file to Deepnote is that it helps ensure the file actually got tested on Deepnote. Even if we strictly enforce that PR authors provide a link to a deepnote instance, there could still be copy/paste differences during the PR process. If we upload (via a script) many notebooks at once, the chance that the uploader carefully checks them all decreases.

Regarding moving things to drake-tutorials... I disagree that this decision is necessarily coupled with this issue. The deepnote project can be forked directly, and will be durably tagged with the drake release. We don't need drake-tutorials to accomplish that. There may be other reasons to have drake-tutorials (like having examples with drake + pytorch, without including pytorch as a dependency for drake), but I don't want to conflate that with this issue.

@hongkai-dai -- I think you can duplicate the project to get your own version, make your changes there, and then open the PR. We can make that more clear in the README.md if need be.

jwnimmer-tri commented 2 years ago

For the record -- my goal is to finish this work by Monday 2022-02-28 at the latest.

jwnimmer-tri commented 2 years ago

The discussions upthread imply that we would still advertise Colab as a supported venue for tutorials (e.g., the checklist item Add !pip install drake pyngrok preamble for colab).

However, I don't think we should do that. As part of this Deepnote transition, I think we should officially withdraw the tutorials from Colab. Due to https://github.com/googlecolab/colabtools/issues/1880, Drake will drop support for Colab in less than two months, when we retire our 18.04 support.

I'll open a PR with this proposal, so we can discuss it there => #16568.

jwnimmer-tri commented 2 years ago

I don't think we should use ngrok for anything. Running notebook displays through someone else's servers is painful and will always break, sooner or later (usually sooner). In my judgement, it's a non-starter.


I think to solve the MeshCat networking on Deepnote's hosting, the following might work:

(0) Toggle the Deepnote environment setting to open up port 8080.

(1) Do a meshcat = Meshcat() as usual, choosing the first free port on the machine per the C++ port scanning (e.g., it might be 7003 if there were several notebooks already running).

(2) Launch an nginx http and ws proxy daemon, configured to listen on http://0.0.0.0:8080/7003/... and forward that to http://127.0.0.1:7003/.... (The ws forwarding would be the same.)

(3) Tell the user to open http://DEEPNOTE_PROJECT_ID.deepnoteproject.com/7003/ to see MeshCat. They will hit the nginx proxy which will forward it to our notebook's server socket. The javascript logic on the landing page will need a small change to keep the 7003 intact, connecting to ws://{location.host}/7003 instead of throwing away the location path entirely.

We could probably launch the nginx daemon from pure Python as part of StartMeshcat(). It would also be easy to bake it into the Deepnote Environment as an Ubuntu service (with monitoring and logging), if we liked that better.
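
To make step (2) concrete, here is a rough sketch of launching such a proxy from pure Python (the nginx config is heavily simplified and the ports/paths are placeholders; the real logic would live inside StartMeshcat()):

import subprocess
import textwrap

def launch_meshcat_proxy(meshcat_port, public_port=8080):
    # Minimal nginx config: forward /<meshcat_port>/... on the public port to
    # the local MeshCat server, for both plain HTTP and WebSocket upgrades.
    config = textwrap.dedent(f"""\
        error_log /tmp/meshcat_proxy_error.log;
        pid /tmp/meshcat_proxy.pid;
        events {{}}
        http {{
          server {{
            listen 0.0.0.0:{public_port};
            location /{meshcat_port}/ {{
              proxy_pass http://127.0.0.1:{meshcat_port}/;
              proxy_http_version 1.1;
              proxy_set_header Upgrade $http_upgrade;
              proxy_set_header Connection "upgrade";
            }}
          }}
        }}
        """)
    config_path = f"/tmp/meshcat_proxy_{meshcat_port}.conf"
    with open(config_path, "w") as f:
        f.write(config)
    # 'daemon off;' keeps nginx in the foreground so the notebook owns it.
    return subprocess.Popen(["nginx", "-c", config_path, "-g", "daemon off;"])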

WDYT?


I also have a prototype where I have a notebook open the MeshCat HTML + JS in a new tab and feed it messages from the notebook -- a lot like you all explored with the #12645 thing, just in a new tab this time instead of an iframe. I have it running in stock Jupyter without any extra servers or sockets, using the Jupyter kernel comms. However, Deepnote does not seem to expose that feature anywhere. I imagine that with the multi-user setup, it's more complicated than a single-user notebook. If we can't get the nginx daemon working, we could ask Deepnote for help getting the kernel Comm up and running, and then we don't need any open ports at all, we'd piggyback on the existing connection.
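
For reference, the kernel-side half of that prototype is roughly this shape (the target name and payload are illustrative; the browser side would register a matching comm target and relay each message to the viewer tab):

from ipykernel.comm import Comm

# Open a comm channel to the front end and push serialized MeshCat commands
# over it; no extra servers or open ports are involved.
comm = Comm(target_name="meshcat")
comm.send(data={"note": "hello from the kernel"})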

jwnimmer-tri commented 2 years ago

(2) Launch an nginx http and ws proxy daemon, configured to listen on http://0.0.0.0:8080/7003/... and forward that to http://127.0.0.1:7003/.... (The ws forwarding would be the same.)

I have this working locally (git branch is here). I'll try it out on Deepnote now.

=> #16639

jwnimmer-tri commented 2 years ago

#16639 should be close to its final form now. Still need to figure out how to manually test it thoroughly.

RussTedrake commented 2 years ago

This sounds awesome. Just to confirm... this does not and can not address Colab, correct? (you got my hopes up when I saw you wrote "I'll try it on Colab now", but I see that it's been edited Colab=>Deepnote.). FTR -- I have a thread on ngrok alternatives here: https://github.com/RussTedrake/manipulation/issues/146 .

jwnimmer-tri commented 2 years ago

In other news, I've dug into the Deepnote permission options, in terms of pinning and upgrading the environment (to provide a specific version of Drake, and our dependencies). Rico helped me learn and explore some options. Here's what I've learned:

The only pinning technology we cannot use is the one currently being used -- where the "Environment Docker image" specifies a DockerHub tag. Even with Edit access to the Tutorials project, a Drake maintainer cannot change this pin. This is what it looks like for me:

[screenshot of the Environment's Docker image setting]

However, all of the other options work:

(1) Set the Environment to use ~/work/Dockerfile -- a project Editor can change the contents of Dockerfile and re-run it. We can still have it based on drake-focal-yyyymmdd if we like (by saying that atop the file), or we can use a stock Ubuntu 20 image and rely on pip to install pydrake and its dependencies.

(2) Add the stock init.ipynb notebook and use requirements.txt -- a project Editor can change the contents of requirements.txt and re-run it. We could pin the version of Drake here.

I'll write up more pro/con about our pinning and deployment technology later, but I wanted to get this investigation into the record first.

jwnimmer-tri commented 2 years ago

This does not and can not address Colab, correct?

Right, the currently-proposed configuration (#16639) of the nginx daemon would not help with Colab. It multiplexes all of the meshcat servers onto a single server port 8080, so that they all have an inbound route from the public internet.

The version of this tactic for Colab would be doing that and also auto-configuring the ngrok outbound connection from nginx to the outside world, as part of the daemon. Then StartMeshcat could interrogate the daemon for what the ngrok URL was and display that to the user. Overall, it would not necessarily be more ergonomic than the current ngrok glue code, though it might more easily allow for using multiple Meshcat() objects per notebook.
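
Sketched with pyngrok (already mentioned in the checklist above), that auto-configuration might boil down to something like this, with the port numbers mirroring the Deepnote example:

from pyngrok import ngrok

# One outbound tunnel to the multiplexing proxy port; every Meshcat()
# instance shares it via its /<port>/ prefix instead of consuming a separate
# tunnel from the free-tier quota.
tunnel = ngrok.connect(8080, "http")
print(f"MeshCat (first instance): {tunnel.public_url}/7003/")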

RussTedrake commented 2 years ago

Re: colab. Good point. It might still be an improvement over the current ngrok code, since there is a limit to the number of ports we can tunnel on the free ngrok account.

RussTedrake commented 2 years ago

Re: deepnote, you should definitely be able to update the docker instance. I just looked at the settings, and see that by default you were added as "edit" access instead of "full access". Moreover, you were added to the project, but not the team. I've upgraded your status on the project and sent you an invite for the team.

jwnimmer-tri commented 2 years ago

I just looked at the settings, and see that by default you were added as "edit" access instead of "full access".

The "full access" permissions appears to allow me to Delete the entire Tutorials project. Is that a level of access we want to give out widely?

Whatever access level we choose for updating tutorials does need to be given out widely, e.g., to most people who have commit access to merge Drake PRs, who are regular contributors, etc. It's not valid long-term for a select few of us to become a deployment bottleneck for changes to tutorials. It seemed to me like "full access" was too much power to give out for that use case.

I'm fine if a few of us have "full access" to help deal with emergencies (in the same way that we have Jenkins admins and GitHub admins), but for pushing out updates my take is that "edit" access should be the only prerequisite.

jwnimmer-tri commented 2 years ago

An update on my remaining plans here:

jwnimmer-tri commented 2 years ago

Rico and I did a little experiment today.

I made a test project:

I ran the project to populate the cell outputs, then shut down my machine.

Rico was able to open the project and see the outputs. He could not run it, but Deepnote provided a pretty clear "Duplicate" button to do so.

After that was clicked, Rico had his own copy of the entire project (all notebooks), and it preserved all of the Environment settings -- local Dockerfile, "allow incoming connections", etc, and he could run everything successfully.

That seems like a great workflow for our Tutorials.

For deployment, we chatted and we think publishing to Deepnote only upon stable releases will be fine. We'll do manual review of the notebook outputs to make sure there are no exceptions, and that all of the math markup rendered correctly. The process can be semi-automated and the manual steps will go into the release playbook.

For users who want latest-nightly tutorials, they can run those locally using either the nightly tgz or (soon) the nightly wheel builds (pip install https://drake-..../latest.whl).

RussTedrake commented 2 years ago

FYI -- I discussed this workflow with the Deepnote developers quickly, and currently understand that we need to manually "build" the docker image in the Deepnote console whenever we update the Dockerfile. I have a discussion started with those developers to make sure I understand the implications ("once per project? once per user?"). This will have to be part of the manual steps in the playbook.

jwnimmer-tri commented 2 years ago

Most of the work here is done. The tutorials are up and running well.

There are a couple of less critical maintenance tests now filed as follow-up issues (#16969, #16970).