spatialaudio / nbsphinx

:ledger: Sphinx source parser for Jupyter notebooks
https://nbsphinx.readthedocs.io/
MIT License

What's the best way to add support for sphinx directives in markdown cells? #762

Open peytondmurray opened 1 year ago

peytondmurray commented 1 year ago

I have a lot of existing notebooks that have Sphinx directives embedded in their Markdown cells, such as

{meth}`foo_method <foo.bar.Baz.foo_method>`

or

{ref}`this code snippet <this-py>`

I know I can turn these into properly supported links using [this code snippet](path_to_file.rst#this-py) or similar, but the code base I'm working with is huge and previously used myst-nb to handle them, so replacing every reference would be a serious effort. Is there another way to handle this with nbsphinx, and if so, what's the best approach?
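To gauge how large such a replacement effort would be, one option is to count the MyST-style roles in each notebook first. The following is a minimal sketch using only the standard library; the function name `find_myst_roles` and the regular expression are illustrative assumptions, not part of nbsphinx or myst-nb, and the pattern only covers the simple inline form shown above.

```python
import json
import re

# Matches MyST-style roles such as {ref}`text <target>` or {py:meth}`name`.
# Assumption: roles are written inline with single backticks, as in the examples above.
MYST_ROLE = re.compile(r"\{([A-Za-z][\w:.+-]*)\}`([^`]+)`")


def find_myst_roles(notebook_json):
    """Return (cell_index, role, content) for each role found in Markdown cells."""
    nb = json.loads(notebook_json)
    hits = []
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # Notebook files store source either as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        for match in MYST_ROLE.finditer(source):
            hits.append((index, match.group(1), match.group(2)))
    return hits


# A tiny in-memory notebook, used here instead of a real .ipynb file.
example = json.dumps({
    "cells": [
        {"cell_type": "markdown",
         "source": ["See {meth}`foo_method <foo.bar.Baz.foo_method>`\n",
                    "and {ref}`this code snippet <this-py>`."]},
        {"cell_type": "code", "source": "x = '{ref}`ignored <in-code>`'"},
    ]
})
for hit in find_myst_roles(example):
    print(hit)
```

Code cells are skipped on purpose, so strings that merely look like roles inside code are not counted.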

mgeier commented 1 year ago

Currently, there is no automatic way to do that. For now it might be best to keep using MyST-NB.

In the long run, I would like to support custom parsers for Markdown cells, but this is a big undertaking. I think the first requirement is to get rid of Pandoc and directly transform notebook cells into the internal docutils representation (see #36). Once we have this, we can provide customization points for custom Markdown cell parsers.

peytondmurray commented 1 year ago

Yep, getting rid of pandoc would be nice. I'd be down to help out with this. Given the size of the task, I'd first like to spend a little time looking at how we might be able to break it down.

For now, from what I can tell a string like

{ref}`this code snippet <this-py>`

gets ingested by nbconvert, which uses pandoc as a transform to turn it into the intermediate RST. I'm curious why the references aren't then picked up when the intermediate RST is converted to HTML, though :thinking:

EDIT: Okay, after looking at this more closely, it seems like pandoc itself is responsible for converting

{ref}`this code snippet <this-py>`

into

{ref}\\ ``this code snippet <this-py>``

which is not a valid Sphinx reference. I tried implementing a pandoc filter that does what I want, but filters only operate on single objects (e.g. individual words), so this isn't a practical approach.
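For illustration only, the mangled form quoted above could in principle be repaired by post-processing the generated RST text rather than by filtering pandoc's document tree. The sketch below assumes the output always has the shape shown (role name in braces, one or more backslashes, a space, then a double-backtick literal). The name `restore_roles` is hypothetical, and how such a step would be hooked into nbsphinx is not shown and is not something nbsphinx currently provides.

```python
import re

# Assumed shape of the mangled output: {role}, backslash(es), a space,
# then the role content wrapped in double backticks.
MANGLED_ROLE = re.compile(r"\{([A-Za-z][\w:.+-]*)\}\\+ ``([^`]+)``")


def restore_roles(rst):
    """Rewrite mangled MyST roles as reStructuredText roles."""
    return MANGLED_ROLE.sub(r":\1:`\2`", rst)


mangled = r"See {ref}\\ ``this code snippet <this-py>`` for details."
print(restore_roles(mangled))
```

A text-level rewrite like this would also touch any genuine literal that happens to follow a brace-wrapped word, so it is at best a stopgap, in line with the point that a pandoc-based fix would only be temporary.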

mgeier commented 1 year ago

I agree that a pandoc filter isn't the right approach. Pandoc would need to support MyST as a proper input format. I don't know if that's ever going to happen. The only relevant things I found in the issue tracker are these: https://github.com/jgm/pandoc/issues/7622, https://github.com/jgm/commonmark-hs/issues/100.

Anyway, making this work with Pandoc would only be a temporary solution. In the long run, we need a Python library that converts Jupyter Markdown cells directly into the docutils internal representation.