Open akhmerov opened 4 years ago
Thanks for opening your first issue here! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out EBP's Code of Conduct. Also, please try to follow the issue template as it helps other community members to contribute more effectively.
If your issue is a feature request, others may react to it, to raise its prominence (see Feature Voting).
Welcome to the EBP community! :tada:
well I think one of the main challenges we're going to have is in implementation - we use Jupytext to convert MyST markdown notebooks to ipynb
files, and it seems like this would require substantially changing the way that this conversion happens and would also break with the model that Jupyter has for notebook structure, which seems like a lot of work to me that would require a big reason to undertake.
regarding separating source and outputs, why couldn't you do something like:
```{code-cell}
:tags: remove-cell
a = "hi"
glue("hivar", a)
```{glue} hivar
Is the main problem there that you want the code inside the admonition as well? If that's the biggest dealbreaker then I wonder if there's a way we could allow glue-like functionality on a cell instead of just a single variable. That seems like an easier lift than totally re-working how markdown becomes a notebook
The constraint of 1-1 correspondence to a notebook format is limiting or inconveniencing the content creators
This is not an inconvenience, it is a design principle, that was agreed upon very early on.
It is also based around "separation on concerns": code execution vs output rendering.
I'm happy to discuss further though, once someone actually provides a credible prototype, which allows for among over things; Markdown to notebook conversion and execution caching
@chrisjsewell what do you think about making it possible to paste a cell input, output, or both via a glue
directive? E.g., we could support a :name:
variable in code-cell
(or in the notebook cell metadata) that would let you refer to it elsewhere. That seems like it shouldn't be too tough to me and wouldn't break our document model for notebooks
Re @choldgraf
Is the main problem there that you want the code inside the admonition as well?
That is indeed one aspect of the problem. Here's a live example of a code cell inside a directive: https://gitlab.kwant-project.org/kwant/kwant/-/blob/master/doc/source/tutorial/spin_potential_shape.rst#L595-645
I agree that allowing to reinsert full cells using glue would address the ability to render the content as desired. It wouldn't however make the authoring easy and would not eliminate the boilderplate.
Would the reverse work though? E.g. when myst-markdown is read, and the parser encounters a cell at a place where it may not occur in a notebook, the cell is moved in the doctree to where it may belong for the purpose of notebook generation, and only rendered at its location in the source?
Re: @chrisjsewell
This is not an inconvenience, it is a design principle, that was agreed upon very early on.
A design decision can be additionally an inconvenience. Yes, I remember that it is a design decision, and I did mention that it in the issue description. Now I am pointing out how that decision influences the latest developments.
It is also based around "separation on concerns": code execution vs output rendering.
That part I don't buy. I actually started (a while back) with an implementation that had perfectly separated concerns: scripts generating outputs, and a completely independent documentation. A large selling point of executable books (and jupyter notebooks alongside) is that separation of execution and rendering is hugely inconvenient when working on documentation or anything with a narrative.
I'm happy to discuss further though, once someone actually provides a credible prototype
Should this then be a "help wanted" label rather than "wontfix"? I'm getting mixed signals here :thinking:
...which allows for among over things; Markdown to notebook conversion and execution caching
What exactly do you mean with the first part? The current notebooks that jb
generates don't fully represent the source (a lot of syntax is missing, glue may not be rendered, etc). How far from the original may the notebooks go?
How far from the original may the notebooks go?
It simple really, you should be able to generate all the outputs from the notebook, and they should be ~identical to those you generate via the text file.
As an example what happens if you write:
```{python}
a = 1
python a=2
print(a)
Now if you run the notebook, you would get a different output to if you run the file through sphinx.
How do you deal with this?
There are plenty of issues like this that you won't figure out until you actually try to implement the code.
what do you think about making it possible to paste a cell input, output, or both via a glue directive? E.g., we could support a :name: variable in code-cell
yes that is something has come up before I believe, and I would be interested in implementing. I would call it something different to glue
though, so as not to confuse the two processes
well I think one of the main challenges we're going to have is in implementation - we use Jupytext to convert MyST markdown notebooks to ipynb files
Can you please provide some pointers to the code? I can't find where jupytext is actually used, while myst_nb.converter.myst_to_notebook
seems to create a notebook directly.
Jupytext is not directly used, but actually now the code they use to convert is almost exactly the same, as of the recent jupytext 1.6.0 release.
It simple really, you should be able to generate all the outputs from the notebook, and they should be ~identical to those you generate via the text file.
That sounds very reasonable. What do you think about the option I outlined above, where all the nested code cells are reinserted in the doctree at the top level?
I mean myst input that looks like
```{code-cell}
a = 1
Text text {code-cell}a = 2
text text.
generates 2 code cells, with a = 1
and a = 2
as sources respectively, followed by a markdown cell with e.g. content Text text {code-cell id=2}`a = 2` text text.
Would that, or something along similar lines, be a feasible approach?
Not to be rude 😬, but my personal view-point is that it's a bit of a waste of time discussing this in any depth, until there is an actual coded prototype of this working on the table. If you make a PR to jupytext showing this, then I'm absolutely happy to discuss.
But do you not see how this requires that, just to convert the Markdown to a notebook, you now have to set up a whole sphinx environment, search for all these nested code cells, somehow decide where to position these new code cells and reverse engineer the doctree to create the output notebook.
I apologize for wasting your time.
Just to clarify my earlier message, and not to anyhow continue this pointless (without a prototype) discussion, I didn't mean the sphinx doctree, just the output of the parser.
Its not necessarily a waste of time to discuss but, call me atheist, I just don't believe things until I see them lol.
I didn't mean the sphinx doctree, just the output of the parser.
Going back to your first comment:
````{warning} This cell was hidden with the toggle class
this will raise an error
```{code-cell}
raise RuntimeError
well the Markdown parser doesn't parse anything inside code fences, so it wouldn't "find" this code cell, for that you need to run it through sphinx.
In summary, a kernel agnostic alternative to glue 👍 re-writing the whole of the EBP stack to a different design; it won't be me doing it 😉
OK, I now have an overall understanding of the current implementation as well as a partial understanding of the reasons behind it.
Would it be a good idea to use sphinxcontrib-tomyst by @mmcky instead of markdown-it-py be a desirable change? In addition to allowing to resolve this issue, it would improve the quality of the notebook markdown content by making the markup more compatible.
You might want to re-check that understanding lol: sphinxcontrib-tomyst aims to convert restructuredtext to markdown, it has nothing to do with parsing Markdown
Indeed, but myst-parser does generate sphinx doctrees from myst files. After that has happened, sphinxcontrib-tomyst is able to generate markdown, and by extension notebooks from the sphinx doctrees.
PS Please use a less sarcastic tone, I would really appreciate that.
I was honestly just trying to be light-hearted not sarcastic. But the point is here is that this is diverging even further from my comment https://github.com/executablebooks/meta/issues/143#issuecomment-695357773, about keeping it simple and not involving sphinx. This proposal is an even more complex conversion procedure: convert from Markdown to Tokens (markdown-it-py), convert from Tokens to a doctree (myst-parser), convert from a doctree back to Markdown (sphinxcontrib-tomyst), an additional step/package to convert to the JSON notebook.
Hey all - just a friendly note that we're all on the same team here, I don't think anybody is trying to talk down to anybody else, but apologies to @akhmerov if we're coming across that way.
Can you elaborate why it is problematic to use sphinx? We're discussing the behavior of a sphinx extension after all. All the relevant contexts are passed the full sphinx app. There seems to be an implicit assumption that I miss.
Your description of the pipeline I mentioned sounds about right. I'm not sure why it much more complex than what is done currently overall: the myst-parser uses markdown-it-py internally anyway, and sphinx already calls myst-parser to create the doctrees. I realize that generation and writing of the notebooks in the current implementation happens before sphinx had a chance to generate the doctrees, is that the origin of the problem?
It does seem that using sphinxcontrib-tomyst would bring advantages in addition to allowing to generate notebooks with code cells matching a more complex document structure. Specifically it would be able to serve users with:
Does this not make it a desirable change?
Look, all I can suggest is that you try implementing this (maybe in jupyter-sphinx?) and come back with a solid prototype. Personally, I think you'll never get this to work, but I'm happy to be proved wrong.
Let me perhaps try phrasing the question differently in hopes of getting some clarity and an answer.
Is it correct that myst-nb has an explicit constraint that the result of its processing and executing myst files must be equivalent to converting these files specifically with jupytext, executing within a notebook, saving them as notebooks, and then feeding to myst-nb as inputs, but skipping execution?
The documentation is not explicit about that constraint. It mentions a more loose requirement of
MyST Notebooks have a 1-to-1 mapping with the notebook, so can be converted to .ipynb files and opened as notebooks in Jupyter interfaces (with jupytext installed).
One of the project goals is to allow generation of notebooks, based on the input files. IIUC, this is the main reason why cell directives are only allowed at the highest level, and if they aren't, the doctree is manipulated by bringing the cells to the lowest doctree level. However, as the project matures, more and more content requires breaking this pattern. A simple example of that is displaying a cell inside an admonition:
This pattern, however, is much more common. The most recent release recommends including execution outputs inside a
tabbed
directive by usingglue
. This, while partially achieving the goal, makes the source nonlocal (the code producing the output is located at a different place, and only referenced from where it is used), and also does not allow the user to show the input that generates the output, unlike thecode-cell
directive.This brings me to
glue
.glue
aims to do two things:I have no opinion about the former, but the latter is once again forcing the same suboptimal pattern of separating input and output. Compare the myst and Rmarkdown examples of inserting a computed value in text.
<rest of the paragraph>
Using these assumptions we compute the maximal airplane velocity as {glue:}max_velocity
, however this still requires that the approximations stay valid.<rest of the paragraph>
Overall I think that, already as we discussed earlier, the constraint of 1-1 correspondence to a notebook format is limiting or inconveniencing the content creators, and I think the practice shows that this limitation does not go away as the content becomes more complex.
Hence my proposal.