spatialaudio / nbsphinx

Sphinx source parser for Jupyter notebooks
https://nbsphinx.readthedocs.io/
MIT License

Failure on multi-line maths with leading indentation #753

Open AA-Turner opened 1 year ago

Hi,

I've found an error with multi-line Markdown maths being translated to reST.

Take a notebook (reproducer.ipynb) with a single Markdown cell:

# Reproducer: Multi-line maths with indentation

\begin{eqnarray*}
a &=& b, \\
\c &=& \cases{d \texttt{ if } e < 0 \\
              f \texttt{ if } g \geq 0}, \\
h &>& i, \\
j &\sim& k.
\end{eqnarray*}

and a Sphinx project with index.rst and conf.py as follows:

index.rst:

.. toctree::

   reproducer.ipynb

conf.py:

extensions = ["nbsphinx"]
nbsphinx_execute = "never"

Rendering this project with sphinx-build -M html reproducer/ build/ -TEa yields the following error and warnings:

reproducer.ipynb:14: ERROR: Unexpected indentation.
reproducer.ipynb:11: WARNING: Inline interpreted text or phrase reference start-string without end-string.
reproducer.ipynb:15: WARNING: Block quote ends without a blank line; unexpected unindent.

This happens because the Docutils Inliner only sees the following markup, which has no closing grave accent (`):

:nbsphinx-math:`\begin{eqnarray*}
a &=& b, \\
\c &=& \cases{d \texttt{ if } e < 0 \\

This markup is parsed from the following translated reST source (dumped from convert_pandoc()):

Reproducer: Multi-line maths with indentation
=============================================

:nbsphinx-math:`\begin{eqnarray*}
a &=& b, \\
\c &=& \cases{d \texttt{ if } e < 0 \\
              f \texttt{ if } g \geq 0}, \\
h &>& i, \\
j &\sim& k.
\end{eqnarray*}`

Docutils splits text into paragraphs before it detects roles or other inline markup. According to the reST specification, the leading indentation on line 7 starts a block quote rather than continuing the previous paragraph, so the paragraph that opens the role ends before the closing grave accent.

The most appropriate solution would be a reST directive (e.g. .. nbsphinx-equation::) for multi-line mathematics. However, since LaTeX discards source spacing in equations, I believe this could also be addressed by the following patch in object_hook():

-            obj = {'t': 'RawInline',
-                   'c': ['rst', ':nbsphinx-math:`{}`'.format(obj['c'][1])]}
+            stripped = '\n'.join(l.strip() for l in obj['c'][1].split('\n'))
+            obj = {'t': 'RawInline',
+                   'c': ['rst', f':nbsphinx-math:`{stripped}`']}
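
To illustrate the effect of the patch, here is a minimal standalone sketch that runs outside nbsphinx. The helper names strip_math_lines and to_raw_inline are hypothetical and exist only for this example. The dictionary mirrors the RawInline shape used in the patch above, and the input is the maths block from the reproducer.

```python
def strip_math_lines(latex: str) -> str:
    """Strip leading and trailing whitespace from every line of a LaTeX string."""
    return '\n'.join(line.strip() for line in latex.split('\n'))


def to_raw_inline(latex: str) -> dict:
    """Wrap the stripped maths in the nbsphinx-math role (hypothetical helper)."""
    stripped = strip_math_lines(latex)
    return {'t': 'RawInline', 'c': ['rst', f':nbsphinx-math:`{stripped}`']}


# The maths block from the reproducer, including the indented continuation line.
SOURCE = '\n'.join([
    r'\begin{eqnarray*}',
    r'a &=& b, \\',
    r'\c &=& \cases{d \texttt{ if } e < 0 \\',
    r'              f \texttt{ if } g \geq 0}, \\',
    r'h &>& i, \\',
    r'j &\sim& k.',
    r'\end{eqnarray*}',
])

node = to_raw_inline(SOURCE)

# After stripping, no line begins with whitespace, so the whole role
# should stay within a single reST paragraph.
print(node['c'][1])
```

Note that str.strip() also removes trailing whitespace on each line. Spaces inside a line, such as those in \texttt{ if }, are left untouched.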

Thanks, Adam