brucemiller / LaTeXML

LaTeXML: a TeX and LaTeX to XML/HTML/ePub/MathML translator.
http://dlmf.nist.gov/LaTeXML/

Missing line break in text converted from fancyvrb environments #2344

Open piccardi opened 2 months ago

piccardi commented 2 months ago

I'm trying to produce HTML and ePub versions of a book on Linux system administration. The conversion is almost complete, except for a problem with fancyvrb environments: their converted output is missing all line breaks.

I used this short extract to demonstrate the problem. This is the tex source, renamed to txt because GitHub does not accept .tex files:

latexmltest-tex.txt

I got the following HTML conversion (still renamed to txt for the same reason):

latexmltest-html.txt

after executing the following commands:

piccardi@monk:~$ latexml --dest=latexmltest.xml latexmltest.tex --includestyles 
latexml (LaTeXML version 0.8.7) processing latexmltest.tex
Info:unexpected:titlepage When using titlepage, Frontmatter will not be well-structured at latexmltest.tex; line 23 col 0
Conversion complete No obvious problems (reqd. 1.30s)
piccardi@monk:~$ latexmlpost --dest=latexmltest.html latexmltest.xml
latexmlpost (LaTeXML version 0.8.7) paginating /home/piccardi/latexmltest.xml
Postprocessing complete No obvious problems (reqd. 0.18s)

I'm using latexml 0.8.7 from the Debian 12 (stable) package. As you can see in the HTML file, the text block:

objectclass ( 1.3.6.1.4.1.1466.344 NAME 'dcObject'
        DESC 'RFC2247: domain component object'
        SUP top AUXILIARY MUST dc )

inside the Example block (a simple environment created with fancyvrb and defined in the attached tex file) is converted to a single line in the HTML file. Converting from a plain verbatim environment works fine, but then I lose all the margins and formatting I use in other fancyvrb-derived environments.

The problem seems to be limited to the line breaks. A more complex environment such as:

\DefineVerbatimEnvironment{Console}{Verbatim}
{commandchars=\\\{\},xleftmargin=\parindent,xrightmargin=\parindent,
fontsize=\footnotesize}

still honors \textbf{XXX} inside a Console block and gives bold XXX text in the result.

dginev commented 2 months ago

Here is the core XML for the mentioned content:

<p>
<text font="typewriter">
  <text width="0.0pt"/>
  <text width="469.8pt">objectclass ( 1.3.6.1.4.1.1466.344 NAME ’dcObject’</text>
</text>
  <text font="typewriter" width="469.8pt">        DESC ’RFC2247: domain component object’</text>
  <text font="typewriter" width="469.8pt">        SUP top AUXILIARY MUST dc )</text>
</p>

These currently post-process to inline-block spans, as in

<span class="ltx_text ltx_font_typewriter ltx_inline-block" style="width:469.8pt;">

But it sounds like they should be ltx_block instead? That would enforce the line-breaks.
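
Until the post-processing changes, a possible stopgap is to rewrite the class on the generated HTML. The following is only a minimal sketch, not part of LaTeXML: the function name force_block is hypothetical, and it assumes the spans look like the one quoted above (ltx_font_typewriter listed before ltx_inline-block in the same class attribute) and that the stylesheet in use renders ltx_block as a block-level element.

```python
import re

# Matches a span whose class attribute lists ltx_font_typewriter and,
# later in the same attribute, ltx_inline-block (the shape quoted above).
_TYPEWRITER_INLINE = re.compile(
    r'(<span class="[^"]*\bltx_font_typewriter\b[^"]*)\bltx_inline-block\b'
)


def force_block(html: str) -> str:
    """Swap ltx_inline-block for ltx_block on typewriter spans only."""
    return _TYPEWRITER_INLINE.sub(r"\1ltx_block", html)


if __name__ == "__main__":
    sample = (
        '<span class="ltx_text ltx_font_typewriter ltx_inline-block" '
        'style="width:469.8pt;">'
    )
    print(force_block(sample))
```

Spans that are inline-block for other reasons (no typewriter font) are left untouched, so the change stays limited to the verbatim lines.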