executablebooks / MyST-Parser

An extended commonmark compliant parser, with bridges to docutils/sphinx
https://myst-parser.readthedocs.io
MIT License

Translations of formulas not working #980

Open MacqGit opened 1 month ago

MacqGit commented 1 month ago

What version of myst-parser are you using?

4.0.0

What version dependencies are you using?

Sphinx == 8.0.2; docutils == 0.21.2

What operating system are you using?

Linux

Describe the Bug

Content of formulas is not translated, leading to empty formulas

Expected Behavior

Formula content is translated according to the selected i18n language.

Note: FINDINGS added in TO REPRODUCE section

To Reproduce

index.md:

Test

:::{eval-rst}
.. math::

      250\:km/h = 69.44\:m/s

      \frac{69.44\:m/s}{30\:frames/s} = 2.31\:m/frame
:::

index.rst:

Test

.. math::

      250\:km/h = 69.44\:m/s

      \frac{69.44\:m/s}{30\:frames/s} = 2.31\:m/frame

In local_root/locales/pt/LC_MESSAGES > index.po

#: ../../source/index.md:4
msgid "250\\:km/h = 69.44\\:m/s\n"
"\n"
"\\frac{69.44\\:m/s}{30\\:frames/s} = 2.31\\:m/frame"
msgstr "250\\:Test/h = 69.44\\:m/s\n"
"\n"
"\\frac{69.44\\:m/s}{30\\:Test/s} = 2.31\\:m/frame"

#: ../../source/index.rst:4
msgid "250\\:km/h = 69.44\\:m/s\n"
"\n"
"\\frac{69.44\\:m/s}{30\\:frames/s} = 2.31\\:m/frame"
msgstr "250\\:Test/h = 69.44\\:m/s\n"
"\n"
"\\frac{69.44\\:m/s}{30\\:Test/s} = 2.31\\:m/frame"

HTML result in "EN":

image

HTML result in "PT" (or any other language):

image

FINDINGS: In /sphinx/transforms/i18n.py > line 391

            # literalblock need literal block notation to avoid it become
            # paragraph.
            if isinstance(node, LITERAL_TYPE_NODES):
                msgstr = '::\n\n' + indent(msgstr, ' ' * 3)

            patch = publish_msgstr(self.app, msgstr, source,
                                   node.line, self.config, settings)

Creates a msgstr:

str: ::

   250\:Test/h = 69.44\:m/s

   \frac{69.44\:m/s}{30\:Test/s} = 2.31\:m/frame

which is then parsed by the Myst-Parser as a "block", returning the following list of tokens:

0   Token: Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[0, 1], level=0, children=None, content='', markup='', info='', meta={}, block=True, hidden=False)    
1   Token: Token(type='inline', tag='', nesting=0, attrs={}, map=[0, 1], level=1, children=[], content='::', markup='', info='', meta={}, block=True, hidden=False) 
2   Token: Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None, content='', markup='', info='', meta={}, block=True, hidden=False)    
3   Token: Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[2, 3], level=0, children=None, content='', markup='', info='', meta={}, block=True, hidden=False)    
4   Token: Token(type='inline', tag='', nesting=0, attrs={}, map=[2, 3], level=1, children=[], content='250\\\\:Test/h = 69.44\\\\:m/s', markup='', info='', meta={}, block=True, hidden=False) 
5   Token: Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None, content='', markup='', info='', meta={}, block=True, hidden=False)    
6   Token: Token(type='paragraph_open', tag='p', nesting=1, attrs={}, map=[4, 5], level=0, children=None, content='', markup='', info='', meta={}, block=True, hidden=False)    
7   Token: Token(type='inline', tag='', nesting=0, attrs={}, map=[4, 5], level=1, children=[], content='\\\\frac{69.44\\\\:m/s}{30\\\\:Test/s} = 2.31\\\\:m/frame', markup='', info='', meta={}, block=True, hidden=False)  
8   Token: Token(type='paragraph_close', tag='p', nesting=-1, attrs={}, map=None, level=0, children=None, content='', markup='', info='', meta={}, block=True, hidden=False)    

which gives a 'doc' variable (/sphinx/transforms/i18n.py > line 74) with the following children:

0   paragraph: <paragraph>::</paragraph>    
1   paragraph: <paragraph>250:Test/h = 69.44:m/s</paragraph>    
2   paragraph: <paragraph>\\frac{69.44:m/s}{30:Test/s} = 2.31:m/frame</paragraph>   

and eventually leads doc[0] (in "publish_msgstr": /sphinx/transforms/i18n.py > line 81) to return ::, which is obviously not in line with the purpose stated in the Sphinx comment: "# literalblock need literal block notation to avoid it become paragraph."

Assuming that the section of the Sphinx code that prepends :: was added in commit 67dd1c0d (Takayuki SHIMIZUKAWA shimizukawa@gmail.com, Date: Sun Feb 22 18:30:45 2015 +0900), this might be something that has gone unnoticed for some time.
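The wrapping step described in the findings can be reproduced with the standard library alone. The sketch below is only an illustration under stated assumptions: it assumes that `indent` in the quoted Sphinx snippet is `textwrap.indent`, and `naive_blocks` is a hypothetical helper (not part of Sphinx or Myst-Parser) that approximates how a CommonMark parser splits blocks. It relies on the CommonMark rule that an indented code block needs at least four leading spaces, so the three-space indentation is not enough to turn the formula into a literal block.

```python
from textwrap import indent

# Translated formula body, as it would come out of the .po entry above.
msgstr = (
    "250\\:Test/h = 69.44\\:m/s\n"
    "\n"
    "\\frac{69.44\\:m/s}{30\\:Test/s} = 2.31\\:m/frame"
)

# Same expression as in the quoted Sphinx snippet: prepend the rST
# literal-block marker and indent the body by three spaces.
wrapped = "::\n\n" + indent(msgstr, " " * 3)


def naive_blocks(text):
    """Hypothetical helper: split on blank lines and classify each block.

    This is a rough approximation of CommonMark block parsing, not the
    real markdown-it-py algorithm.
    """
    blocks = []
    for chunk in text.split("\n\n"):
        lines = chunk.splitlines()
        # CommonMark requires at least four leading spaces for indented code.
        is_code = all(len(line) - len(line.lstrip(" ")) >= 4 for line in lines)
        blocks.append(("code" if is_code else "paragraph", chunk.strip()))
    return blocks


blocks = naive_blocks(wrapped)
for kind, content in blocks:
    print(kind, repr(content))
```

Under these assumptions every block is classified as a paragraph and the first one is the bare `::`, which matches the token list and the doc[0] value reported above.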

chrisjsewell commented 1 month ago

Heya, you don't need to use eval-rst for this, that is a "last resort":

```{math}
250\:km/h = 69.44\:m/s

\frac{69.44\:m/s}{30\:frames/s} = 2.31\:m/frame
```

I believe this should work fine

MacqGit commented 1 month ago

Hello Chris,

I'm afraid that this doesn't provide the correct output either.

When the "sphinx.transforms.i18n.Locale" transform is applied, the "apply" method of the Sphinx "Locale" class prepends the characters :: to the msgstr of the "math_block" node (which is indeed one of the LITERAL_TYPE_NODES) (see: /sphinx/transforms/i18n.py line 394) before passing the modified msgstr to the "publish_msgstr" method. The Myst-Parser is then called to parse a string containing ::, a marker that belongs to the rST syntax. The Myst-Parser does not recognise this marker, which leads to an incorrect interpretation of the actual math content.

I hope I'm on the right track, but this is what the debug trace shows me.

If I understand it well, this issue may arise from the fact that Sphinx assumes that only RST parser is going to be used (which, in this case delivers a correct result) but this is not the case here.

Thanks anyway for the attention you are giving to this case.

BRGDS,

Bernard

chrisjsewell commented 1 month ago

> If I understand it well, this issue may arise from the fact that Sphinx assumes that only RST parser is going to be used

OK, I see, cheers. This PR comes to mind then: https://github.com/sphinx-doc/sphinx/pull/12238, which I think would fix it. Should push that forward at some point.

MacqGit commented 1 month ago

EDIT:

As announced in the initial post below, I ran a quick test and everything seems fine: translations are applied correctly. So my conclusion is that either there is no need to "update" the Myst-Parser side, or it already contains the necessary adaptations...

Many thanks, Chris!

INITIAL POST:

Thanks Chris, I went through the PR comments and indeed believe that you pointed me in the right direction. Now, it looks like solving the "rST hard-coded" elements in Sphinx is not as easy as it may appear.

Anyway, I will try testing this PR, as I have a test environment at hand, and give you feedback.

However, from what I read, it looks like implementation changes are required on both the Sphinx and Myst-Parser sides. Right? In that case, is there a specific PR/commit to apply for the Myst-Parser?