jupyter / nbconvert

Jupyter Notebook Conversion
https://nbconvert.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Broken latex equations on convert #404

Open cmbant opened 8 years ago

cmbant commented 8 years ago

Some LaTeX code renders fine in the Jupyter notebook, but is not rendered correctly on export, e.g.:

For the bispectrum contributions, we have
$$
\langle \phi(l_1)\phi(l_2)\omega(l_3)\rangle = (2\pi)^2 \delta (l_1+l_2+l_3) b_{l_1,l_2,l_3}^{\phi\phi\omega}
$$
where
$$
 b_{l_1,l_2,l_3}^{\phi\phi\omega} = -8\, l_1\times l_2 \,l_1\cdot l_2 \int_0^{\chi_*} d\chi \frac{W(\chi,\chi_*)^2}{\chi^2} \int_0^{\chi}d\chi' \frac{W(\chi',\chi)W(\chi',\chi_*)}{{\chi'}^2} \left[
 P_\psi\left(\frac{l_1}{\chi},z(\chi)\right) P_\Psi\left(\frac{l_2}{\chi'},z(\chi')\right)
- (l_1\leftrightarrow l_2)
\right]
$$
Now defining
$$
M_*(l,l') \equiv l^4\int_0^{\chi_*}d\chi \frac{W(\chi,\chi_*)^2}{\chi^2} P_\Psi\left(\frac{l}{\chi}, z(\chi)\right) C_{l'}^\kappa(\chi,\chi_*)
$$
we have
$$
b_{l_1l_2l_3}^{\kappa\kappa\omega} = -\sin 2\phi_{21}\left[ M_*(l_1,l_2)-M_*(l_2,l_1)\right]
$$

In the notebook it looks like this:

[image: rendering in the notebook]

After HTML export it looks like this:

[image: rendering after HTML export]

mpacer commented 8 years ago

Ok there's something weird happening here.

The issue has to do with the way that you broke lines in the middle of the second equation.

So, if all you care about is getting this particular equation to render, then don't use

$$
 b_{l_1,l_2,l_3}^{\phi\phi\omega} = -8\, l_1\times l_2 \,l_1\cdot l_2 \int_0^{\chi_*} d\chi \frac{W(\chi,\chi_*)^2}{\chi^2} \int_0^{\chi}d\chi' \frac{W(\chi',\chi)W(\chi',\chi_*)}{{\chi'}^2} \left[
 P_\psi\left(\frac{l_1}{\chi},z(\chi)\right) P_\Psi\left(\frac{l_2}{\chi'},z(\chi')\right)
- (l_1\leftrightarrow l_2)
\right]
$$

Instead, don't break the line before the minus sign:

$$
 b_{l_1,l_2,l_3}^{\phi\phi\omega} = -8\, l_1\times l_2 \,l_1\cdot l_2 \int_0^{\chi_*} d\chi \frac{W(\chi,\chi_*)^2}{\chi^2} \int_0^{\chi}d\chi' \frac{W(\chi',\chi)W(\chi',\chi_*)}{{\chi'}^2} \left[
 P_\psi\left(\frac{l_1}{\chi},z(\chi)\right) P_\Psi\left(\frac{l_2}{\chi'},z(\chi')\right) - (l_1\leftrightarrow l_2) \right]
$$

But this illustrates that there's a bug in our parser.

Somehow this is tricking our parser into thinking that some of the subscript commands indicated by an underscore (_) are no longer LaTeX subscript notation but are standard markdown italics and so it's trying to insert <em> tags there instead of rendering them raw. Also it introduces a list in a weird way.

mpacer commented 8 years ago

Upon further investigation it seems to be entirely because of the list.

So

$$
 b_{l_1,l_2,l_3}^{\phi\phi\omega} = -8\, l_1\times l_2 \,l_1\cdot l_2 \int_0^{\chi_*} d\chi \frac{W(\chi,\chi_*)^2}{\chi^2} \int_0^{\chi}d\chi' \frac{W(\chi',\chi)W(\chi',\chi_*)}{{\chi'}^2} \left[
 P_\psi\left(\frac{l_1}{\chi},z(\chi)\right) P_\Psi\left(\frac{l_2}{\chi'},z(\chi')\right) -
 (l_1\leftrightarrow l_2)
\right]
$$

should work as well.

So it seems that what is wreaking havoc is that the markdown list parsing code is getting a little trigger happy and needs to be silenced inside of a LaTeX block.
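
To make the failure mode easier to spot, here is a rough sketch of a checker that flags lines inside a math block which look like Markdown block syntax. It is only an illustration under assumptions: the function name `risky_lines` and both regular expressions are hypothetical, they approximate common Markdown rules (a bullet marker followed by a space, or a line made only of `=` or `-`), and they are not taken from the nbconvert or mistune code.

```python
import re

# Hypothetical patterns approximating common Markdown block starts:
# a bullet marker (-, + or *) followed by whitespace and some content ...
BULLET = re.compile(r"^\s{0,3}[-+*]\s+\S")
# ... or a line made only of '=' or '-' characters (a setext heading underline).
SETEXT = re.compile(r"^\s{0,3}(=+|-+)\s*$")


def risky_lines(math_source):
    """Return (line_number, line) pairs that resemble Markdown block syntax."""
    hits = []
    for number, line in enumerate(math_source.splitlines(), start=1):
        if BULLET.match(line) or SETEXT.match(line):
            hits.append((number, line))
    return hits
```

With this sketch, a line such as `- (l_1\leftrightarrow l_2)` is flagged, while the same term written as ` (l_1\leftrightarrow l_2)` after a trailing `-` on the previous line is not.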

I would have thought that this was showing up in conversion because it's a pandoc thing. @carreau is that something we handle internally or is this a pandoc bug?

herm commented 7 years ago

Another problem with the parser: using \begin{equation*} works in the notebook, but it is rendered as plain text in the HTML output.

danilobellini commented 6 years ago

This has nothing to do with pandoc.

I've found that a simple $$x = 2$$ can break the Markdown parser if it's split across 3 lines:

$$x
=
2$$

The block parser detects that equals sign (or a plus, minus, asterisk, etc.) as the start of another Markdown block. That happens inside the mistune library, but it's not a mistune bug, since mistune doesn't render math blocks. As a result, the nbconvert inline parser never even "sees" the whole block; it doesn't know that the block has an end. The opening $$ is then rendered as plain unmatched text, and the rest of the equation gets rendered as ordinary Markdown. When the trailing $$ is parsed, it might even get paired with the start of a later equation.
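
One general way to keep a Markdown block parser away from math is to swap each $$...$$ block for an opaque placeholder before parsing and put it back afterwards. The sketch below illustrates that idea only; it is not nbconvert's actual code. The names `protect_math`, `restore_math` and `render_with_math`, and the `@@MATH<n>@@` placeholder format, are hypothetical, and `render_markdown` stands for any callable that turns Markdown text into HTML.

```python
import re

# A display-math block: $$ ... $$, possibly spanning several lines.
MATH_BLOCK = re.compile(r"\$\$.*?\$\$", re.DOTALL)
# Hypothetical placeholder format, assumed not to occur in normal text.
PLACEHOLDER = re.compile(r"@@MATH(\d+)@@")


def protect_math(source):
    """Replace each $$...$$ block with a placeholder; return (text, blocks)."""
    blocks = []

    def stash(match):
        blocks.append(match.group(0))
        return "@@MATH{}@@".format(len(blocks) - 1)

    return MATH_BLOCK.sub(stash, source), blocks


def restore_math(rendered, blocks):
    """Put the stashed math blocks back in place of their placeholders."""
    # Simplification: the math is restored verbatim, without HTML escaping.
    return PLACEHOLDER.sub(lambda m: blocks[int(m.group(1))], rendered)


def render_with_math(source, render_markdown):
    """Render Markdown while hiding math blocks from the block parser."""
    protected, blocks = protect_math(source)
    return restore_math(render_markdown(protected), blocks)
```

Because the block parser only ever sees the placeholder, a lone `=` or a leading `-` inside the equation can no longer be mistaken for a heading underline or a list item.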

I'm going to make a pull request that keeps the mistune library and at least fixes this issue. However, I'm quite sure its rendering still won't match the Jupyter Notebook for every valid input. A better but harder solution would be either to port the Markdown rendering engine found in the Jupyter Notebook code (written in JavaScript, AFAIK) to Python, or to use it directly as is (which requires a JavaScript interpreter).

takluyver commented 6 years ago

Mistune is already meant to be a close equivalent to marked, the JS markdown parser we use.