CatalaLang / catala

Programming language for literate programming law specification
https://catala-lang.org
Apache License 2.0

Pygments lexer breaks LaTeX escapeinside #589

Open pierregoutagny opened 9 months ago

pierregoutagny commented 9 months ago

I'm trying to use the Pygments lexer in `syntax_highlighting/en/pygments`. With Pygments version 2.16.1 (I haven't tried other versions, but this is close to the latest), running `pygmentize -l 'catala_en' -f latex -P escapeinside='!!' example.catala_en` on the following file:

````
```catala
declaration scope A:
  input x content integer!\label{line:x}!
```
````

renders the following:

````tex
\begin{Verbatim}[commandchars=\\\{\},codes={\catcode`\$=3\catcode`\^=7\catcode`\_=8\relax}]
\PY{l+s}{```catala}
\PY{k+kr}{declaration}\PY{l+s}{ }\PY{k+kr}{scope}\PY{l+s}{ }\PY{n+nc}{A}\PY{o}{:}
\PY{l+s}{ }\PY{l+s}{ }\PY{k+kd}{input}\PY{l+s}{ }\PY{n+nv}{x}\PY{l+s}{ }\PY{k+kr}{content}\PY{l+s}{ }\PY{k+kt}{integer}\PY{err}{!}\PY{l+s}{\PYZbs{}}\PY{k+kr}{label}\PY{o}{\PYZob{}}\PY{n+nv}{line}\PY{o}{:}\PY{n+nv}{x}\PY{o}{\PYZcb{}}\PY{err}{!}
\PY{l+s}{```}
\end{Verbatim}
````

The important part here is that the escaped LaTeX code is rendered (line 4) as `\PY{err}{!}\PY{l+s}{\PYZbs{}}\PY{k+kr}{label}\PY{o}{\PYZob{}}\PY{n+nv}{line}\PY{o}{:}\PY{n+nv}{x}\PY{o}{\PYZcb{}}\PY{err}{!}`. Instead, this should be `\PY{esc}{\label{line:x}}` so that it is actually escaped when used with, e.g., the minted LaTeX package.

Given that the Pygments documentation states that the `escapeinside` option has "no effect in string literals", I suspect that some uses of the `String` token in `lexer.py` may be responsible for this behavior.

rmonat commented 9 months ago

I can reproduce in Pygments version 2.14.0. Looking back at previous papers, it seems escapeinside has been used in the ICFP paper, but I think the pygmentize scripts have been changed since then.

denismerigoux commented 9 months ago

Even for the ICFP paper, getting `!\label{line:x}!` to work with the Catala Pygments lexer was a huge pain. I wouldn't know how to fix that now that our Pygments workflow has changed (@AltGr). As a workaround, I suggest simply hardcoding the line number you want to refer to in the paper...

rmonat commented 9 months ago

Yes, that's the current workaround. For text documents that's not too bad, but for beamer it's nice to be able to insert tikzmarks.

pierregoutagny commented 9 months ago

Following my last remark on the `String` token being a possible culprit, I tried simply replacing it with `Text` everywhere in the lexer, and it seems to work. I don't know if it breaks other things or if the visual result is exactly what was expected (for example, I think `` ```catala `` does not have the same color), but it is enough in my environment for now, and less painful than writing numbers by hand, for the small price of a text substitution. I haven't tried using this in beamer presentations, but I would expect it to work.
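For anyone who wants to check the `String` versus `Text` effect without the Catala lexer, here is a minimal sketch. `ToyLexer`, `make_lexer`, and `render` are hypothetical names made up for this illustration and are not part of the Catala repository. The sketch wraps the lexer in `LatexEmbeddedLexer` by hand, on the assumption that this mirrors what `pygmentize` does when `escapeinside` is given with the LaTeX formatter.

```python
from pygments import highlight
from pygments.formatters.latex import LatexEmbeddedLexer, LatexFormatter
from pygments.lexer import RegexLexer
from pygments.token import Keyword, Name, String, Text


def make_lexer(filler):
    """Build a toy lexer (hypothetical, not the Catala one) that tags
    whitespace and any unrecognized character with the `filler` token."""

    class ToyLexer(RegexLexer):
        name = "toy"
        tokens = {
            "root": [
                (r"\b(input|content|integer)\b", Keyword),
                (r"[A-Za-z_]\w*", Name),
                (r"\s+", filler),
                (r".", filler),
            ]
        }

    return ToyLexer()


def render(filler):
    """Highlight one line to LaTeX with '!' as both escape delimiters."""
    code = "input x content integer!\\label{line:x}!\n"
    # Wrap the toy lexer so that text between '!' and '!' can become an
    # escape token, unless it overlaps String or Comment tokens.
    lexer = LatexEmbeddedLexer("!", "!", make_lexer(filler))
    return highlight(code, lexer, LatexFormatter(escapeinside="!!"))


with_string = render(String)  # filler tagged as a string literal
with_text = render(Text)      # filler tagged as plain text

# The raw LaTeX command only survives when the filler is not a String token.
print(r"\label{line:x}" in with_string)
print(r"\label{line:x}" in with_text)
```

If the suspicion is right, the first check reports that the raw `\label{line:x}` is absent and the second that it is present, which matches what I see after the substitution in the real lexer. This only exercises a toy grammar, so it says nothing about how the substitution changes the colors of the rendered Catala code.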

pierregoutagny commented 1 week ago

Coming back to this after having to battle with minted for a beamer presentation. Maybe I can open a PR with my small fix (which works for my paper and beamer) and you can check that it doesn't break any other workflow (syntax.pdf or other things I don't know about) if and when you have time.