Convert from notebook with sphinx directives to notebook with markdown/html

pelson commented 3 years ago

Sorry for the completely novice question; this is based on a few short minutes of experience with nbsphinx, but I couldn't see what I was looking for in the docs/examples.

I've long wanted to be able to document stuff using rST and sphinx concepts in notebooks, things like resolving cross-references and even inclusion of docstrings, etc. Just like when documenting with sphinx, I think it is reasonable that this involves an explicit transformation to inject the information into the resulting output (in this case, that would be another notebook).

From a technical perspective we can see sphinx as a two-part transformation: first it converts special rST syntax to standard rST, using the context it is given about the documentation project, and then it transforms the rST to the desired output format.

From a brief look at the code, nbsphinx does the following:

convert the whole notebook to rST
let sphinx transform the generated rST as a normal document

In order to achieve the notebook -> notebook conversion though, you'd want to:

convert rST cells to rST documents
let sphinx transform the cell (perhaps to html)
inject the transformed content back into a notebook
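The three steps above can be sketched with the standard library alone, treating the notebook as the plain JSON structure it is on disk. This is a minimal sketch, not something nbsphinx provides: it assumes the rST lives in raw cells, and `transform_cells` and `placeholder_render` are hypothetical names. The `render` argument stands in for whatever would actually call Sphinx or docutils.

```python
import copy


def transform_cells(notebook, render, cell_types=("raw",)):
    """Return a copy of `notebook` with the selected cells rendered.

    `render` is a placeholder callable (source text -> rendered text);
    it stands in for a Sphinx/docutils call and is not an nbsphinx API.
    """
    result = copy.deepcopy(notebook)
    for cell in result["cells"]:
        if cell["cell_type"] not in cell_types:
            continue
        # The notebook format allows "source" to be a string or a list of lines.
        source = cell["source"]
        if isinstance(source, list):
            source = "".join(source)
        # Inject the rendered text back as a Markdown cell.
        cell["cell_type"] = "markdown"
        cell["source"] = render(source)
    return result


def placeholder_render(rst):
    # Stand-in only: a real implementation would resolve roles and directives.
    return "<p>" + rst.strip() + "</p>"


notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "raw", "metadata": {},
         "source": ["See :class:`my_pkg.MyClass`.\n"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": "print(1)"},
    ],
}
built = transform_cells(notebook, placeholder_render)
```

Code cells pass through untouched and the input notebook is not modified, so the original can stay in the repository as the canonical source.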

Given your experience developing nbsphinx, I wanted to ask the question: "how far off is nbsphinx from being able to do that?" and if the answer is "a long way away", whether you can see any pitfalls with the suggested approach (e.g. do you not get enough context for things like ToC tables, etc.)?

chrisjsewell commented 3 years ago

Heya, to note this is possible with https://myst-nb.readthedocs.io/en/latest/, which "drives" https://jupyterbook.org. There is also some conversation about conversion to "plain" notebooks here: https://github.com/executablebooks/MyST-NB/issues/148

pelson commented 3 years ago

to note this is possible with https://myst-nb.readthedocs.io/en/latest/

Very cool! Thanks for letting me know @chrisjsewell! :+1:

I was just looking through the docs, and couldn't quite find what I was looking for:

Step 1: Take a notebook with standard sphinx declarations (e.g. :meth:`my_pkg.MyClass.my_method`) written in rST (happy to use the MyST syntax if it has to be done in markdown)
Step 2: Run it through a sphinx project such that links & docstrings etc. are generated
Step 3: Inject the generated content back into the appropriate cell in the notebook
Step 4: Behold that the notebook now has the relevant documentation for my_method + relative links to the key parts of the documentation also generated with sphinx (therefore the notebook fundamentally belongs to the sphinx built docs)
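To make Steps 2 and 3 concrete for the cross-reference part only, here is a rough sketch that swaps role markup for Markdown links. It assumes a hypothetical `INVENTORY` dict mapping object names to URLs, which stands in for what a Sphinx build would know about the project; the regular expression covers only a handful of Python-domain roles, and docstring inclusion is not attempted.

```python
import re

# Hypothetical stand-in for the object inventory a Sphinx build would provide.
INVENTORY = {
    "my_pkg.MyClass": "api.html#my_pkg.MyClass",
    "my_pkg.MyClass.my_method": "api.html#my_pkg.MyClass.my_method",
}

# Matches e.g. :meth:`my_pkg.MyClass.my_method` or :py:class:`~my_pkg.MyClass`.
ROLE = re.compile(r":(?:py:)?(class|meth|func|mod|attr):`~?([\w.]+)`")


def resolve_roles(text, inventory, base_url=""):
    """Replace known roles with Markdown links; unknown targets become inline code."""
    def replace(match):
        role, target = match.group(1), match.group(2)
        label = target + "()" if role in ("meth", "func") else target
        url = inventory.get(target)
        if url is None:
            return f"`{label}`"
        return f"[`{label}`]({base_url}{url})"

    return ROLE.sub(replace, text)


resolved = resolve_roles("See :meth:`my_pkg.MyClass.my_method`.", INVENTORY)
```

Passing the root of the published docs as `base_url` would make the links usable from a notebook opened outside the docs tree, for example on Binder.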

mgeier commented 3 years ago

I'm not quite sure if I understand what you want to do ... do you want to have a Jupyter notebook as both input and output of the process?

I guess so, because you are saying this:

In order to achieve the notebook -> notebook conversion though, you'd want to:

convert rST cells to rST documents

let sphinx transform the cell (perhaps to html)

inject the transformed content back into a notebook

Transforming the cell into an HTML snippet sounds like something Sphinx might be able to do (although it normally creates a whole bunch of HTML files). Probably the underlying docutils is better suited for that?

What I don't understand, however, is how you want to store the generated HTML back in the Jupyter notebook?

Jupyter notebooks cannot contain true HTML, they can only contain a subset of HTML as part of Markdown.

They actually can contain arbitrary HTML in raw HTML cells, but I don't think that's what you want because this will not be rendered in e.g. JupyterLab.

How and in what context would you like to use those hypothetical notebooks with the replaced cells?

pelson commented 3 years ago

Perhaps an example of what I actually want to be able to do would help:

I want to store a notebook in a repository which contains sphinx style markup, e.g. :class:`my_pkg.MyClass` as part of the prose of a broader tutorial. This could be the canonical place where I want to document "MyClass"; if that is the case, I would use something like nbsphinx or myst-nb to generate html from the notebook (this bit already works thanks to these great packages! 👍 ). However, I might also/alternatively want to be able to have this context in an interactive environment inside a notebook, for example, for use in mybinder, or just in a local Jupyter environment - this would be used in the context of a tutorial where it is desirable to have some of the documentation next to code cells. In this case I want the :class:`my_pkg.MyClass` markup to be replaced with the contents of what sphinx would generate from this. The result would be another notebook (a built notebook) in which the sphinx syntax has been replaced by what sphinx would normally put in its build output (e.g. html).

The resulting notebook must necessarily either contain markdown cells (perhaps via sphinx-markdown-builder) or alternatively html which can hopefully fit into the subset of html that is supported in markdown cells in the notebook. I already appreciate that it is perfectly possible to have a code cell which gives this information via a call to help(), but this hampers the flow/readability - the prose (markdown cells) should be the place to provide this information, and the code cells should be the place to enter code, and are not the best place to read docs.

mgeier commented 3 years ago

Thanks for the more detailed description, that's very helpful!

I think I understand what you want to do and I see some useful aspects in it, but I don't think it's a good workflow because the notebooks lose information. If the "generated" notebooks are distributed rather than the original one (which might happen by accident), it will not be possible to update the hard-coded information in them.

Apart from that, I don't see a good way to implement any of this (but I'm interested to know if somebody tries it!).

I would, as so often, settle for a compromise: I'd read the "documentation" part in the "rendered" version (e.g. via nbsphinx on RTD) and in a separate browser tab I'd actually work on the "real" notebook (either with JupyterLab or with Binder).

This is how I make links from notebooks to the API docs: https://nbsphinx.readthedocs.io/en/0.8.1/markdown-cells.html#Links-to-Domain-Objects

In JupyterLab, this takes you to the .rst file, which sadly is of very limited usefulness if you want to get to the API docs.

I already appreciate that it is perfectly possible to have a code cell which gives this information via a call to help()

This reminds me of a feature that apparently only partly survived the way to JupyterLab:

When pressing Shift+Tab after typing a function name, a little information window appears.

In the Classic Notebook, when pressing Shift+Tab+Tab (and even more Tabs), the display changed (to contain more text, a scrollbar, etc.).

IIRC, there was some discussion many years ago to try to nicely render the docstrings as HTML in this "info window", including the contained reST markup.

I think this never really took off, but maybe there was an extension at some point?

I tried to search for it, but I just found a few other related things:

https://github.com/ipython/ipython/issues/888

https://github.com/ipython/ipython/issues/3714

https://github.com/ipython/ipython/pull/4301

Now that I'm thinking about it ... it works quite well to include HTML <iframe>s in a notebook (online example):

from IPython.display import IFrame
IFrame('my-great-page.html', width='100%', height=350)

I don't know how exactly, but you could somehow render your docs (or whatever) as external HTML files and include them like this?

This would allow you to use arbitrary HTML and your notebook would still not be encumbered with hard-coded auto-generated stuff.
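One possible way to produce such an external HTML file without Sphinx is the standard library's pydoc module. The sketch below is only an illustration under that assumption: `write_doc_page` is a hypothetical helper, `json.JSONDecoder` is an arbitrary example object, and pydoc's plain output is not what a Sphinx build would generate.

```python
import json
import pydoc
import tempfile
from pathlib import Path


def write_doc_page(obj, path):
    """Write pydoc's HTML documentation for `obj` to `path` and return the path."""
    body = pydoc.html.document(obj, getattr(obj, "__name__", None))
    page = pydoc.html.page(pydoc.describe(obj), body)
    path = Path(path)
    path.write_text(page, encoding="utf-8")
    return path


# A temporary directory keeps the example self-contained; in practice the
# file would be written next to the notebook so a relative path works.
out_dir = Path(tempfile.mkdtemp())
page_path = write_doc_page(json.JSONDecoder, out_dir / "JSONDecoder.html")
```

With the file written next to the notebook, `IFrame('JSONDecoder.html', width='100%', height=350)` would display it in the same way as the snippet above.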

pelson commented 3 years ago

I don't think it's a good workflow because the notebooks lose information.

If the "generated" notebooks are distributed rather than the original one (which might happen by accident), it will not be possible to update the hard-coded information in them.

I take that. Though I could imagine having a special syntax which preserved the original input as well as the output.
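As a rough illustration of keeping the input alongside the output, the original source could be stashed in the cell's metadata, which the notebook format leaves open for custom keys. The key name `sphinx_source` and both helper functions are hypothetical; no existing tool is known to use this convention.

```python
import copy

# Hypothetical metadata key; not an existing nbsphinx or Jupyter convention.
SOURCE_KEY = "sphinx_source"


def build_cell(cell, rendered):
    """Return a copy of `cell` showing `rendered`, with the original source kept in metadata."""
    built = copy.deepcopy(cell)
    built.setdefault("metadata", {})[SOURCE_KEY] = cell["source"]
    built["source"] = rendered
    return built


def restore_cell(cell):
    """Undo build_cell: put the stored source back and drop the metadata key."""
    restored = copy.deepcopy(cell)
    original = restored.get("metadata", {}).pop(SOURCE_KEY, None)
    if original is not None:
        restored["source"] = original
    return restored


cell = {"cell_type": "markdown", "metadata": {},
        "source": "See :class:`my_pkg.MyClass`."}
built_cell = build_cell(cell, "See [`my_pkg.MyClass`](api.html#my_pkg.MyClass).")
round_trip = restore_cell(built_cell)
```

A built notebook distributed by accident could then still be converted back and rebuilt, which addresses part of the concern about losing information.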

Ultimately, notebooks already have this characteristic when we think of code cells...

:thinking: - I'd definitely be interested in hearing about it if somebody put such a prototype together, but I think the issue doesn't need to remain open for that (it isn't an issue with nbsphinx in itself).

spatialaudio / nbsphinx

Convert from notebook with sphinx directives to notebook with markdown/html #531