gpoore / minted

minted is a LaTeX package that provides syntax highlighting using the Pygments library. Highlighted source code can be customized using fancyvrb.

[bug] `gobble` results in malformed formatting, `autogobble` works #379

Open goyalyashpal opened 7 months ago

goyalyashpal commented 7 months ago


MWE:

Note: here

\begin{filecontents}[noheader, overwrite]{./test.mysql}
    mysql-> select * from firstLine;
    mysql-> select * from secondLine;
    mysql-> \help contents
    mysql-> \h contents
\end{filecontents}

\documentclass{article}
\usepackage{minted}

\begin{document}

\inputminted[autogobble]{psql}{test.mysql}    % Works
\inputminted[gobble=4]{psql}{test.mysql}    % Buggy

\end{document}

just for my reference, first noticed filecontents at https://github.com/gpoore/minted/issues/378#issuecomment-1794025908

muzimuzhi commented 7 months ago

Seems a pygments bug.

$ cat test.mysql
    mysql-> select * from firstLine;
    mysql-> select * from secondLine;
    mysql-> \help contents
    mysql-> \h contents

$ cat test_noindent.mysql
mysql-> select * from firstLine;
mysql-> select * from secondLine;
mysql-> \help contents
mysql-> \h contents

# see outputs in screenshot below
$ pygmentize -l psql -F gobble:n=4 test.mysql

$ pygmentize -l psql test_noindent.mysql

[screenshot: outputs of the two pygmentize commands above]

muzimuzhi commented 7 months ago

The minted option gobble=4 is passed to the Pygments executable pygmentize as the CLI flag -F gobble:n='4'. But unfortunately, Pygments "filters are applied after lexing" (from https://github.com/pygments/pygments/issues/585#issuecomment-526848356).

Also see the Pygments docs on filters:

An arbitrary number of filters can be applied to token streams coming from lexers to improve or annotate the output.

Hence if an indented PostgreSQL console (psql) should be correctly tokenized, it's PostgresConsoleLexer to blame.
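To make the ordering problem concrete, here is a minimal sketch in plain Python. It does not use Pygments at all: toy_lex is a hypothetical stand-in for a console lexer that only recognizes a prompt starting in column 0 (an assumption for illustration, not the actual PostgresConsoleLexer code), and gobble_tokens mimics a filter that trims characters from an already-lexed token stream.

```python
import re

# Hypothetical toy lexer rule: a prompt such as "mysql-> " is only
# recognized when it starts in column 0 (assumption for illustration).
PROMPT = re.compile(r"^\S*[=\-(][#>] ?")


def toy_lex(text):
    """Yield (token_type, value) pairs for each line of text."""
    for line in text.splitlines(keepends=True):
        m = PROMPT.match(line)
        if m:
            yield ("Prompt", m.group())
            yield ("SQL", line[m.end():])
        else:
            yield ("Output", line)


def gobble_tokens(tokens, n):
    """Filter-style gobble: drop the first n characters of every line,
    working on tokens that have already been assigned a type."""
    left = n  # characters still to drop on the current line
    for ttype, value in tokens:
        parts = value.split("\n")
        for i, part in enumerate(parts):
            if i > 0:
                left = n  # a newline starts a fresh line
            cut = min(left, len(part))
            parts[i] = part[cut:]
            left -= cut
        value = "\n".join(parts)
        if value:
            yield ttype, value


def gobble_text(text, n):
    """Pre-lexing gobble: drop the first n characters of every line."""
    return "\n".join(line[n:] for line in text.split("\n"))


indented = "    mysql-> select * from firstLine;\n    mysql-> \\h contents\n"

# Order 1: lex first, gobble afterwards (what a filter can do).
after = list(gobble_tokens(toy_lex(indented), 4))
# Order 2: gobble first, lex afterwards.
before = list(toy_lex(gobble_text(indented, 4)))

print([t for t, _ in after])   # ['Output', 'Output']
print([t for t, _ in before])  # ['Prompt', 'SQL', 'Prompt', 'SQL']
```

Both orders produce the same text, but only the second one gives the lexer a chance to see the prompt in column 0. A filter that runs after lexing can trim characters, but it cannot undo the token types the lexer already chose.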

goyalyashpal commented 7 months ago

hi! thanks a lot for such thorough investigation..

if an indented PostgreSQL console (psql) should be correctly tokenized, it's PostgresConsoleLexer to blame.

ehm, no, because console input/output is never indented, i.e. the premise of that "if" is already false. so the pgconsole lexer is perfectly fine the way it works.

minted option gobble=4 is passed to pygments executable pygmentize as [a filter option]

if minted can autogobble, then can it not also gobble?

But unfortunately, Pygments "filters are applied after lexing" (from pygments/pygments#585 (comment)).

the issue lies here only.

goyalyashpal commented 7 months ago

regardless of where the solution is supposed to come from, i think this behaviour is worth a place in the documentation:

gpoore commented 7 months ago

gobble and autogobble behave differently because only gobble is a Pygments filter. All of this will change in minted v3.0, since there will be a minted executable where filter-type behavior can be implemented separately from Pygments, both before and after lexing. At that point, all gobbling can be implemented in a uniform fashion. I'm about to release a new LaTeX data serialization package that will be the foundation of much of minted v3.0, and then will be able to start working on the new executable.
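As an illustration of what gobbling before lexing could look like, here is a small Python sketch that writes a gobbled copy of a file, which could then be passed to \inputminted without any gobble option. This is not part of minted, and nobody in this thread proposed it; the function name gobble_before_lexing and its parameters are hypothetical. With n=None it uses textwrap.dedent, which plays the role of autogobble here; whether minted does the same thing internally is an assumption.

```python
import textwrap
from pathlib import Path


def gobble_before_lexing(src, dst, n=None):
    """Write a copy of src to dst with leading characters removed.

    n=None   -> remove the common leading whitespace (autogobble-like).
    n=<int>  -> remove the first n characters of every line (gobble=n-like).
    Returns the text that was written.
    """
    text = Path(src).read_text(encoding="utf-8")
    if n is None:
        result = textwrap.dedent(text)
    else:
        # Splitting on "\n" keeps empty lines and the trailing newline intact.
        result = "\n".join(line[n:] for line in text.split("\n"))
    Path(dst).write_text(result, encoding="utf-8")
    return result
```

For the MWE above, something like gobble_before_lexing("test.mysql", "test_gobbled.mysql", n=4) followed by \inputminted{psql}{test_gobbled.mysql} should hand the lexer unindented prompts (test_gobbled.mysql is a made-up file name).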