Open choldgraf opened 4 years ago
Another thought on the generated notebooks - @amueller mentioned that authors may be hesitant to include MyST-specific markdown in Jupyter Notebooks they expect their readers to download and run, because the Jupyter interfaces don't support MyST markdown.
So, I wonder if another feature here could be that MyST-NB also uses Sphinx to output "regular markdown" in the "downloadable" notebook for input notebooks / MyST-notebook files (e.g. that have regular markdown links instead of `{ref}` in there).
That could be tricky to do, but might be a way to satisfy this condition before we get support for myst-markdown in the jupyter interfaces themselves.
> e.g. that have regular markdown links instead of `{ref}`
I’m not sure what you mean by this, can’t you already use regular links?
Yeah, that was a bad example. I mean things like admonitions, figure or equation directives, etc
yeah doing a figure means that if someone looks at the notebook they will not see anything, which is not great.
A figure doesn't have a "regular" markdown equivalent though, that's the main reason for using directives; to extend markdown.
You can use an image `![alt](image/path.png)`, if you want to have it show up in markdown, but then obviously you can't have a caption.
The emphasis here IMO should be to provide equivalent Markdown syntax extensions, using markdown-it/markdown-it-py, to allow you to write in syntax that Jupyter will support in the first place (rather than doing any post-conversion). For example https://github.com/executablebooks/MyST-NB/issues/126#issuecomment-622935821 will allow you to write equations without specifically using the math directive.
Similarly for admonitions, I want to write an extension to allow for use of fenced divs for admonitions:
:::{note}
My note
:::
Yeah I agree that the best long-term solution here is to support this syntax in Jupyter interfaces via something like a MyST plugin (in jupyterlab / notebook / vscode / etc).
I get the vague impression that jupyterlab 3.0 is going to change the extension bundling machinery so that a user-selectable markdown parser would be easier to deploy/implement?
https://github.com/jupyterlab/jupyterlab/pull/8385
There's also https://github.com/jupyterlab/jupyterlab/issues/272 where they're discussing using `markdown-it` as the markdown parser in jupyterlab. If that lands, then it would be much easier to build MyST functionality on top of that parser, since `markdown-it-py` has much of the same structure.
@chrisjsewell I'm not sure I follow. Sure, there is no direct equivalent in markdown, though you could create html that's equivalent. For numbering and referencing that would need to be either post-processing or somehow needs to be supported by jupyter lab.
I think the goal I have in mind is pretty straight-forward: I want a jupyter notebook that has the content that I wrote but that can also be executed. Right now, I can produce content via jupyter notebook as an editor, but there is no way to view the content as a jupyter notebook. I.e. there is no way for the user to execute the code while seeing the figures.
Having an extension would certainly make jupyter notebooks a better editor for writing jupyter book content, but I don't think it's a feasible solution for the consumer side: it's a giant barrier to entry to ask someone to install an add-on so they can read your notebook [unless Anaconda has installed this add-on by default].
@amueller I think I see your point-of-view 😬 but I think it would be best if you could provide a minimal example of a notebook, that you think is currently "unreadable", that we can talk around, and perhaps an example of what you think the notebook should look like
I wouldn't say it's unreadable, but some parts are missing. Figures are missing, notes are missing, sidebars are missing - unless the reader clicks into a markdown cell and finds some directive that's not supported and reads the content. Though for a figure she still won't see it unless she edits the markdown.
Maybe a minimal example is a notebook with a figure and a note. If you open that in Jupyter, it will show two empty markdown cells, aka a white page. What it would ideally show is a figure and a note.
Some things are not easily possible in jupyter I think, like sidebars. But I'd prefer to have a sidebar rendered inside the text rather than have it completely hidden from the reader.
> If you open that in Jupyter, it will show two empty markdown cells, aka a white page.
Are you sure about that?
````md
```{figure} https://miro.medium.com/max/512/1*d69DKqFDwBZn_23mizMWcQ.png
This is my caption
```

```{note}
This is a note, but it won't be *formatted*
```
````

<img width="759" alt="image" src="https://user-images.githubusercontent.com/2997570/82617214-4cf2d000-9bc7-11ea-90d8-fe1863a34980.png">
Then what I meant by adding extensions to markdown-it-py, is that you could then write something like this, which renders in the notebook (with no add-ons) but would still be parsed correctly by MyST (given you activate the extensions).
```md
![](https://miro.medium.com/max/512/1*d69DKqFDwBZn_23mizMWcQ.png)
!This is my caption
:::{note}
This is a note, and it will be *formatted*
:::
```
Yeah - I think that basically the only options are:
To me, 2 is a cleaner and longer-term solution (maybe also simpler as well?). This is just a limitation of the fact that Jupyter only supports CommonMark, which doesn't have support for any of the fancier formatting we're talking about (which is why people tend to hack the same results with raw HTML)
@chrisjsewell Hm you're right, it is not formatted but it's there. Somehow I thought I was missing content, but I guess that was only figures. I'll see if there was something else missing.
@choldgraf I think @chrisjsewell had something in between in mind (for now) which basically renders reasonably ok in Jupyter.
I would totally agree that 2 is the cleaner and nicer long-term solution. We'll see how my book evolves. But while @chrisjsewell's solution would be better than the current situation, I don't find it entirely satisfying. I'll be putting hundreds of hours into formatting these pages, I can put another couple hours into a CI job that replaces the myst markdown with some html.
I'm not saying this is a solution that should be supported by jupyter-book, as it is a bit ugly and adds more abstractions and moving pieces, I'm just saying, as someone writing a book, I'd rather have the extra work than have ugly formatting in my book.
> there is no way for the user to execute the code while seeing the figures.
BTW what you're talking about is also reminiscent of https://jupyterbook.org/interactive/launchbuttons.html?highlight=thebelab#live-interactive-pages-with-thebelab. I'm certainly not saying your use case doesn't have merit, but surely the point of creating a HTML book is that people read that, rather than downloading all the individual notebooks, having to open them via Jupyter, and then reading those?
> but surely the point of creating a HTML book is that people read that, rather than downloading all the individual notebooks, having to open them via Jupyter, and then reading those?
@chrisjsewell I guess that's the disconnect. To me both are equally important. I want a book that is available as executable jupyter notebooks and as rendered website [and as printed book probably]. I might even be tempted to say the executable notebooks are more important than the website. If that's not the goal of jupyter-book, then that's of course fine, but it's certainly my goal. And I don't think of it as 'creating an HTML book'. I think of it as writing a book, and wanting to provide as many convenient ways for people to consume the materials as possible.
Good point - I think there will always be trade-offs, but I think in general we should try to push for a top-quality experience in each of: the content files themselves, the rendered HTML, and the rendered PDF. In the current phase, I think we are probably prioritizing them in the order of `HTML > PDF > ipynb`, but I think this will shift back-and-forth over time.
Yeah agreed, there's certainly trade-offs. Fixing the PDFs will be technically somewhat simpler in my experience (I went through all of this when I wrote my last book, which is entirely in jupyter and was converted to asciidoc). It "just" means fixing the latex that's generated. Though actually there are some issues there as well if you're using pandoc (are you?), because the internal representation of pandoc is somewhat restricted: IIRC, pandoc can't do cell spans in tables, so you can't directly use it to create latex that does. Also pandoc doesn't convert raw html that's inside markdown. You can probably see all the pandoc issues I opened 4 years ago still ;)
No, we use https://github.com/executablebooks/markdown-it-py
That's for parsing the markdown, not for generating the latex, though, right? Oh, is it sphinx generating the latex? I guess that has its own engine that's not pandoc. I know very little about that.
Yes markdown-it-py parses to its representation of tokens, then myst-parser converts these to the docutils node tree used by sphinx, which has output specific builders.
One thing that's worked fairly well for us to bridge the notebook/rst personalities of an md:myst file is to make sure that every figure is isolated as a jupyter cell using the jupytext cell delimiters, with a simple cell metadata tag like 'fig'. So turning the md:myst file into a notebook that doesn't scare students just requires a script that uses jupytext.read to get it into the nbformat tree, an operation that transforms the figure cells, and jupytext.write to write the denatured notebook, sync and execute.
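A minimal sketch of the "transform the figure cells" step described above, assuming the notebook is already loaded as a plain nbformat-style dict (for example after `jupytext.read`). The function name `denature_figure_cells`, the regular expression, and the handling of `:option:` lines are illustrative assumptions, not the script actually used in that course:

```python
import re

FENCE = "`" * 3  # built this way so the example nests cleanly inside Markdown

# Matches a fenced {figure} directive: opening fence, source path, body, closing fence.
FIGURE_RE = re.compile(
    re.escape(FENCE)
    + r"\{figure\}[ \t]+(?P<src>\S+)\n(?P<body>.*?)"
    + re.escape(FENCE),
    re.DOTALL,
)


def _figure_to_image(match):
    """Turn one matched {figure} directive into a plain Markdown image."""
    # Assumption: lines starting with ':' are directive options (e.g. ':width:')
    # and are dropped; everything else is treated as the caption.
    caption_lines = [
        line.strip()
        for line in match.group("body").splitlines()
        if line.strip() and not line.strip().startswith(":")
    ]
    caption = " ".join(caption_lines)
    image = f"![{caption}]({match.group('src')})"
    return f"{image}\n\n*{caption}*" if caption else image


def denature_figure_cells(notebook, tag="fig"):
    """Rewrite {figure} directives in Markdown cells carrying the given tag."""
    for cell in notebook.get("cells", []):
        tags = cell.get("metadata", {}).get("tags", [])
        if cell.get("cell_type") != "markdown" or tag not in tags:
            continue
        source = cell["source"]
        if isinstance(source, list):  # nbformat allows a list of lines
            source = "".join(source)
        cell["source"] = FIGURE_RE.sub(_figure_to_image, source)
    return notebook
```

Restricting the rewrite to tagged cells keeps the transformation predictable: untagged cells pass through unchanged, so the regular expression only ever sees cells the author marked as figures.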
another issue in a similar vein, just for another datapoint: https://github.com/executablebooks/jupyter-book/issues/629
we've started to get a few questions from people saying they are confused because the MyST syntax doesn't display in Jupyter environments (e.g., in the issue above it is the `.. figure` directive...)
I think admonitions and `image` / `figure` directives are the main ones to prioritize, in terms of extensions for better "round-tripping"; maybe we want to spin that off into a separate issue.
For the latter, perhaps direct parsing of HTML `img` tags into the doctree might be feasible (using beautifulsoup to actually extract the tag options)
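To make the idea concrete, here is a rough sketch of extracting `img` attributes so they could be mapped onto `image` / `figure` directive options. It swaps in the standard library's `html.parser` for beautifulsoup to stay dependency-free, and the names `ImgTagCollector` and `img_to_directive_options`, as well as the choice of which attributes to keep, are assumptions for illustration rather than anything in myst-parser:

```python
from html.parser import HTMLParser


class ImgTagCollector(HTMLParser):
    """Collect the attributes of every <img> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <img ... /> via the default
        # handle_startendtag implementation.
        if tag == "img":
            self.images.append(dict(attrs))


def img_to_directive_options(html_text):
    """Return one {'uri': ..., 'options': {...}} record per <img> tag."""
    collector = ImgTagCollector()
    collector.feed(html_text)
    collector.close()
    records = []
    for attrs in collector.images:
        # Assumption: only these attributes have an obvious directive option.
        options = {
            key: attrs[key]
            for key in ("alt", "width", "height", "class")
            if attrs.get(key)
        }
        records.append({"uri": attrs.get("src", ""), "options": options})
    return records
```

For example, `img_to_directive_options('<img src="fig.png" alt="A plot" width="300px">')` yields a single record whose `uri` is `fig.png` and whose options hold the `alt` and `width` values.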
Yes, this would be ideal for our teaching. A typical course setup will have a textbook or lab manual written in jupyterbook with cross-referencing, equation numbers, figure captions etc., and a set of student labs, which they will work on in jupyter. As long as the figures can be sized correctly in the notebook, things like cross-references are a minor detail -- students can just click over to the html/pdf to see the fully rendered text.
Yeh cross-referencing is probably not easily possible, because by default in sphinx they are also cross-document
Totally agree with what is said here. So @phaustin your workflow is having a master md:myst and generating a book and a notebook from it with the notebook getting some extra polish to render nicely? That's basically the workflow I had imagined only my source would have been a notebook with myst, which should be very similar.
@chrisjsewell so round-tripping sounds like doing the conversion, not having directives that work in both environments as in https://github.com/executablebooks/MyST-NB/issues/148#issuecomment-631799021 ?
My setup is very similar to @phaustin, and having students install an add-on can be quite a big barrier.
@amueller -- yes, our holy grail is a single myst:md master, with derived versions that have provenance via scripts and metadata giving topic, level of difficulty, whether a cell is a question or a solution, answer key letter etc. So for a quiz, we can write the solution we'll eventually post, strip the cells with the answers, construct the answer key, print a pdf for an in-class exam, or convert to canvas (our lms) qti xml for an online quiz.
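The "strip the cells with the answers" step could look roughly like the sketch below. It assumes cells are marked through the standard `tags` list in cell metadata and that answers carry a `solution` tag; both the tag name and the function name `strip_tagged_cells` are hypothetical:

```python
import copy


def strip_tagged_cells(notebook, tag="solution"):
    """Return a copy of the notebook without cells carrying the given tag."""
    stripped = copy.deepcopy(notebook)  # leave the master notebook untouched
    stripped["cells"] = [
        cell
        for cell in stripped.get("cells", [])
        if tag not in cell.get("metadata", {}).get("tags", [])
    ]
    return stripped
```

Working on a deep copy means the same master can feed several derived versions (quiz, answer key, posted solution) in one script run.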
> so round-tripping sounds like doing the conversion, not having directives that work in both environments as
Well I just mean that myst, on parsing, would read an HTML `img` tag as an `image` or `figure` directive. You would have to write your source documentation using HTML images (rather than the directives), if you wanted the downloadable notebooks to be that way, but then this avoids having to do any one-way post-processing of notebooks.
Ah, ok. But then I still can't do cross-referencing, right?
I think @phaustin wants cross-referencing in the source document (or at least in one of the versions of the document) - at least that's what I want. Or do you mean you'd write html with some extra syntax that could then be read by myst to create the references?
What I want and what I understood @phaustin to want is: a) Have an html & pdf export that has cross-references and all the niceties jupyter-book currently has. b) Have a jupyter notebook (either as source or as export) that renders figures and notes reasonably well and doesn't scare students / readers with weird syntax.
Bonus: c) Have it written in a version-controllable form (i.e. myst:md).
I'm not sure how your solution achieves a).
For us, the image/figure swap plus perhaps a howto on filtering myst markdown would be about all we would need to get good-enough jupyter notebooks. If you did get markdown-it-py into jupyter as an extension, we would definitely use that on our large first year courses that are running on jupyterhub in the cloud. If it was possible to install a single jupyterlab extension via a conda environment.yml file then I don't see any problem using the extension in smaller classes where the students are using their own laptops.
> Ah, ok. But then I still can't do cross-referencing, right?
No that would be non-trivial, so I don't think would be a short/medium term goal
> Have a jupyter notebook (either as source or as export) that renders figures and notes reasonably well
I think this is a reasonable short-medium term goal
> and doesn't scare students / readers with weird syntax.
Well that depends on how much of the "sphinx" functionality you want to use. Essentially roles and directives are the primitives of the MyST "language", then any other syntax are alternatives to these; to improve usability/readability. Naturally it would be unfeasible to provide an alternative syntax for every possible role and directive, but we can look to provide them for the most widely used ones.
@phaustin can you elaborate on
> For us, the image/figure swap plus perhaps a howto on filtering myst markdown would be about all we would need to get good-enough jupyter notebooks.
I'm not sure I understand what you mean. I thought you already had custom processing to do that?
yes, but I'd be happy to exchange those regular expressions for unambiguous information from the parser. (this is strictly wish-list though, at the moment the only processing we do is to comment/uncomment the markdown/html image versions in a figure cell).
I put together a POC script to try option 1. from https://github.com/executablebooks/MyST-NB/issues/148#issuecomment-632407608 "Find ways to inject raw HTML into generated notebooks".
Our main use case is for admonitions : we want to keep using admonitions in JupyterBook and we want them to look decent in Jupyter notebook interfaces. The reason is that people follow the notebooks along when we give the course.
The way it looks can be seen here: https://github.com/INRIA/scikit-learn-mooc/pull/152#issuecomment-748096323
The script doing the conversion from `py:percent` notebooks using MyST admonitions to `ipynb` files with rendered HTML admonitions is here:
https://github.com/INRIA/scikit-learn-mooc/blob/master/build_tools/convert-python-script-to-notebook.py
There is probably a lot of room for improvements, so suggestions more than welcome! I am guessing that there are some limitations too, for example nesting admonitions is probably not going to work.
The basic idea behind it:
`HTML("<style>put_your_css_here</style>")` or `custom.css`.

It feels like I am doing MyST-markdown to CommonMark conversion, so what would a cleaner strategy look like? Would writing a `CommonMarkRenderer` class make any sense?
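For readers who want a feel for this kind of conversion, here is a deliberately small regex-based sketch. It is not the linked script: the function name `admonitions_to_html`, the list of supported admonition kinds, and the mapping to Bootstrap `alert-*` classes are all assumptions, and nested admonitions are not handled:

```python
import html
import re

FENCE = "`" * 3  # built this way so the example nests cleanly inside Markdown

# Matches a fenced MyST admonition such as {note} or {tip} (no nesting).
ADMONITION_RE = re.compile(
    re.escape(FENCE)
    + r"\{(?P<kind>note|tip|warning|caution|important)\}[ \t]*\n"
    + r"(?P<body>.*?)\n"
    + re.escape(FENCE),
    re.DOTALL,
)

# Assumed mapping from admonition kind to a Bootstrap alert class.
ALERT_CLASSES = {
    "note": "alert-info",
    "tip": "alert-success",
    "warning": "alert-warning",
    "caution": "alert-warning",
    "important": "alert-danger",
}


def admonitions_to_html(markdown_text):
    """Replace fenced admonitions with HTML that plain Jupyter can display."""

    def render(match):
        kind = match.group("kind")
        # Escape the body so stray '<' or '&' cannot break the HTML block.
        body = html.escape(match.group("body").strip(), quote=False)
        return (
            f'<div class="admonition {kind} alert {ALERT_CLASSES[kind]}">\n'
            f'<p class="title" style="font-weight: bold;">{kind.capitalize()}</p>\n'
            f"{body}\n"
            "</div>"
        )

    return ADMONITION_RE.sub(render, markdown_text)
```

A token-based renderer built on the parser would be more robust than a regular expression; this version only illustrates the input and output shapes involved.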
I believe that @mmcky and @AakashGfude are working on a MyST->ipynb converter that outputs commonmark markdown: https://github.com/QuantEcon/sphinx-tojupyter
perhaps that'd be useful?
medium-long term I am very hopeful we can get some support for MyST markdown (some of it anyway) inside of Jupyter interfaces (e.g. via work that @rowanc1 is doing or building off of the JupyterLab markdown-it extension that @agoose77 has worked on).
Nice, thanks a lot for the pointers, I'll try to take a look at them!
Meh, I think this feels a little bit like "going round the houses". You could just have myst-parser identify HTML admonition, the same way it does for HTML images: https://github.com/executablebooks/MyST-Parser/blob/master/myst_parser/parse_html.py
> Our main use case is for admonitions : we want to keep using admonitions in JupyterBook and we want them to look decent in Jupyter notebook interfaces.
MyST-Parser now has an extension to read HTML admonitions: https://github.com/executablebooks/MyST-Parser/pull/288 (https://myst-parser.readthedocs.io/en/latest/using/syntax-optional.html#html-admonitions)
Thanks, I may be missing something, but I don't really see how this helps having admonition looking decent in Jupyter notebook interfaces :thinking:.
I tried using a HTML admonition with the development Myst-Parser.
<div class="admonition note" name="html-admonition">
<p class="title">This is the **title**</p>
HTML admonition
</div>
The generated HTML does look good:
but this is how it looks in the classic Jupyter notebook interface:
To give an idea what my current conversion script does (https://github.com/executablebooks/MyST-NB/issues/148#issuecomment-748188786)
https://inria.github.io/scikit-learn-mooc/python_scripts/02_numerical_pipeline_hands_on.html
> don't really see how this helps having admonition looking decent in Jupyter notebook interfaces
You can easily just add extra classes and/or inline styles:
<div class="admonition tip alert alert-warning">
<p class="title" style="font-weight: bold;">Tip</p>
parameter allows to get a deterministic results even if we
use some random process (i.e. data shuffling).
</div>
in jupyter lab:
<div class="admonition" style="background: lightgreen; padding: 10px">
<p class="title" style="; padding: 10px; font-weight: bold; border-color: green; border-style: solid">Tip</p>
parameter allows to get a deterministic results even if we
use some random process (i.e. data shuffling).
</div>
Ah good point thanks!
I guess inline styles are probably the best way to go, as they are deterministic (i.e. don't depend on the available CSS), then when it is converted in sphinx, the `style` attribute will just be "thrown away", and it will be styled consistent with the sphinx theme you are using.
I guess a limitation is that if you use markdown inside the HTML admonition, it will not render very nicely in Jupyter notebook interfaces.
<div class="admonition alert alert-warning">
<p class="title" style="font-weight: bold;">Tip</p>
`random_state` is **very important**
</div>
All in all, personally now that I have my hacky `.py` -> `.ipynb` conversion script with simple admonition support, I think I will stick to it (maybe sunk cost fallacy :wink:). The main advantages are:
The main disadvantage would be that it is a stand-alone hacky script and that its longer-term maintenance is less than clear.
For others though, HTML admonition may be exactly what they need.
@lesteve you can partially mitigate this by adding a newline above the Markdown:
Ah nice I did not think of trying that, thanks!
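The blank-line trick can be sketched as a small helper that builds the HTML admonition with an empty line before the Markdown body. In CommonMark, a raw HTML block that starts with a `<div>` tag ends at the first blank line, so the text after it is parsed as Markdown again. The function name `html_admonition` and its default classes are assumptions for illustration:

```python
def html_admonition(title, markdown_body, css_class="alert alert-warning"):
    """Build an HTML admonition whose body is still rendered as Markdown."""
    return (
        f'<div class="admonition {css_class}">\n'
        f'<p class="title" style="font-weight: bold;">{title}</p>\n'
        "\n"  # blank line: ends the raw HTML block so the body is parsed as Markdown
        f"{markdown_body.strip()}\n"
        "</div>"
    )
```

Calling `html_admonition("Tip", "`random_state` is **very important**")` produces the same structure as the example above, with one empty line between the title paragraph and the body.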
FYI, `nbsphinx` parses `<div>` elements with `alert-info` and `alert-warning`: see https://nbsphinx.readthedocs.io/en/0.8.1/markdown-cells.html#Info/Warning-Boxes.
This even works with LaTeX/PDF output.
A newline should still be used before the content, as mentioned above (and as mentioned in the `nbsphinx` docs).
There are still problems with `nbconvert`, though: https://github.com/jupyter/nbconvert/issues/1125
And there is some room for improvement regarding the CSS that's used in JupyterLab and the Classic Notebook.
Currently, when notebooks are created for a page, they end up in `jupyter_execute`, and are somehow able to be downloaded with the `download-jupyter` role.
I am trying to figure out the right way to expose download links for all notebooks so that themes can add the ability to download them. E.g. for this dropdown menu:
It seems that `download-jupyter:` creates a one-off hash for the notebook that wishes to be downloaded. Does it make sense to do this for all notebook content? Is there a better way that I could do this?