UBC-DSCI / introduction-to-datascience-python

Open Source Textbook for DSCI100: Introduction to Data Science in Python
https://python.datasciencebook.ca
Other
12 stars 9 forks source link

Include ipynb or PR deploy previews for easier PR review of charts #6

Closed joelostblom closed 1 year ago

joelostblom commented 2 years ago

To review PRs efficiently, I think it would be great to see the charts in the PR itself. The most common approach I use to this is to commit the ipynb with outputs, but using deploy previews would also work (I am open to other suggestions of course).

  1. The first option means that we would work in the ipynb and use md or py as the paried format (I don't think we need to work in three formats)? Or is there a benefit with including py file on top of those two for the PR reviews (we can always automatically generate them later)?

    We could change the formats to this:

    formats = "ipynb,md///md:myst"
    
    # If we wanted py also, we could do this but there will be many files in each PR
    formats = "ipynb,md///md:myst,py///py:percent"

    The md///md:myst above puts the md files in a separate subfolder so that source looks less busy. I guess it would be formats: ipynb,md///md:myst in the yaml header but we might need to experiment with the number of slashes there. I think it is a good idea with a folder for the py files if we stick with the current myst + py combo as well

    We could use alt.renderers.enable('mimetype') to include Altair charts as images. This is the simplest to render charts in an ipynb file on GitHub, but for the charts to show up in Jupyter book we need to turn off autobuilding of the book and commit the ipynbs with output.

  2. Use automatic deploy previews of the book in each PR (we should probably do this anyways, not just for the charts). With this approach, we only need to handle the special case of using the data_server backend when working with large data (if this is part of the book). For notebooks where we are planning to use this backend, we could include alt.renderers.enable('svg') to show svg representation of the charts which will work in Jupyter Book even when there is no live Python kernel.

I prefer the second approach, since it requires less extra stuff than solution 1. What do you all think?

joelostblom commented 2 years ago

Sorry for the many edits. I updated my comment above a few times to improve and more clearly express the solution I suggest.

joelostblom commented 2 years ago

I thought of an easier way to do this, so I updated my comment above with that suggestion.

trevorcampbell commented 1 year ago

@joelostblom what is involved in an "automatic deploy preview" ?

My suggestion is that we just check out the PR'd branch, run build_html.sh, and view it in our own browser.

But if there is something nicer than that (with visual diffs, for example) let me know.

joelostblom commented 1 year ago

The nice part with automatic deploy previews is that we would not need to build anything to review the PRs, there will be a link to a live jupyter book site with the latest changes. I set up an example in https://github.com/UBC-DSCI/introduction-to-datascience-python/pull/44. There are no visual diffs, just the convenience of not building locally each time we are reviewing.

joelostblom commented 1 year ago

We decided against having these and building the book manually on each PR instead via docker.