sphinx-doc / sphinx

The Sphinx documentation generator
https://www.sphinx-doc.org/

code-block overflow in latex #2873

Closed andreacassioli closed 8 years ago

andreacassioli commented 8 years ago

I have found a case in which a long line of code goes out of the code block box. The line is actually wrongly broken in two. See the following snapshot.

screenshot from 2016-08-19 14 57 04

shibukawa commented 8 years ago

Word wrapping in a code-block is difficult for Sphinx. In some languages (e.g. Python, YAML), the meaning of the code changes if Sphinx breaks lines automatically. Sphinx doesn't know the semantics of the code in a code-block, or where line breaks are allowed.

andreacassioli commented 8 years ago

Well, usually it is ok, and I have seen a dramatic improvement over the last year.

In this case I do not understand why the line is split in three. The first split is useless. However, it seems that adding an extra blank space after each comma does the trick.

I am always a bit surprised that Sphinx does not use a specialized LaTeX package like minted.

tk0miya commented 8 years ago

I think this is a bug in #2343.

About minted, please refer to #2304. We have already discussed it and, in the end, decided to continue using the verbatim package.

tk0miya commented 8 years ago

@jfbu Could you check this after you come back?

andreacassioli commented 8 years ago

OK, I see.

I missed #2304 and to be honest it is very long to follow!

Thanks for your work guys.

jfbu commented 8 years ago

The problem is that the intended behaviour of the comma is overridden at some point by the code of fancyvrb.sty. Each time it processes a code line, it executes a LaTeX kernel macro called \do@noligs on each of the characters ` < > , ' - which makes these characters self-inserting, with no possibility of a line break. The code from #2343, which specially handles . , ; : ? ! / (only the comma is common to both lists), is executed earlier, so in the case of the comma its effect gets lost.

A quick fix is to insert \let\do@noligs\@gobble at the very end of the definition of \sphinxbreaksatpunct in sphinx.sty. It seems brutal but may be logical, because after all \do@noligs is designed by the LaTeX authors in particular to make breaking the line impossible (that's the goal of the original verbatim code). But some thinking is needed (I need to warm up again on LaTeX, which I have a bit forgotten these last few weeks...).

Perhaps it is better to modify \verbatim@nolig@list by removing the comma from it: insert \def\verbatim@nolig@list {\do \`\do \<\do \>\do \'\do \-}% as the last line before the closing brace of the \sphinxbreaksatpunct macro.

% This macro makes them "active" and they will insert potential linebreaks
\newcommand*\sphinxbreaksatpunct {%
   \lccode`\~`\.\lowercase{\def~}{\discretionary{\char`\.}{\sphinxafterbreak}{\char`\.}}%
   \lccode`\~`\,\lowercase{\def~}{\discretionary{\char`\,}{\sphinxafterbreak}{\char`\,}}%
   \lccode`\~`\;\lowercase{\def~}{\discretionary{\char`\;}{\sphinxafterbreak}{\char`\;}}%
   \lccode`\~`\:\lowercase{\def~}{\discretionary{\char`\:}{\sphinxafterbreak}{\char`\:}}%
   \lccode`\~`\?\lowercase{\def~}{\discretionary{\char`\?}{\sphinxafterbreak}{\char`\?}}%
   \lccode`\~`\!\lowercase{\def~}{\discretionary{\char`\!}{\sphinxafterbreak}{\char`\!}}%
   \lccode`\~`\/\lowercase{\def~}{\discretionary{\char`\/}{\sphinxafterbreak}{\char`\/}}%
   \catcode`\.\active
   \catcode`\,\active
   \catcode`\;\active
   \catcode`\:\active
   \catcode`\?\active
   \catcode`\!\active
   \catcode`\/\active
   \lccode`\~`\~
   \def\verbatim@nolig@list {\do \`\do \<\do \>\do \'\do \-}% let LaTeX not fiddle with the comma
}
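
For comparison, the quick fix mentioned above (\let\do@noligs\@gobble) could probably be tried from the document preamble without editing sphinx.sty. This is only a sketch: it assumes the preamble is read after sphinx.sty has been loaded, that \sphinxbreaksatpunct is a parameterless macro as shown above, and that it is executed before the \do@noligs loop, as described in this comment.

```latex
% Hypothetical preamble patch (not part of sphinx.sty).
% Assumes \sphinxbreaksatpunct is already defined at this point.
\makeatletter
% Append \let\do@noligs\@gobble to the end of \sphinxbreaksatpunct.
% Afterwards each "\do<char>" entry of \verbatim@nolig@list is simply
% swallowed, so ` < > , ' - are no longer redefined as unbreakable.
\g@addto@macro\sphinxbreaksatpunct{\let\do@noligs\@gobble}
\makeatother
```

The trade-off is that this also drops the ligature protection for the other five characters, which is why removing only the comma from \verbatim@nolig@list looks like the more conservative choice.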

For some reason #2343 seems to assume that the comma is not handled by Pygments, but when I look at the LaTeX code produced by Sphinx, I see that commas appear as \PYG{p}{,}. Another approach could therefore be to modify the \PYG macro so that it adds a potential breakpoint. It might still be necessary to worry about \do@noligs, though.
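
A rough sketch of what such a \PYG wrapper might look like follows. It is an untested illustration, not what sphinx.sty does: the names \myOrigPYG, \my@punct and \my@tmp are hypothetical, \PYG is assumed to be already defined when this code is read, and a break is allowed after every token of type p (so also after brackets), with no continuation mark.

```latex
% Hypothetical wrapper around the Pygments \PYG macro.
\makeatletter
\let\myOrigPYG\PYG          % keep the original definition
\def\my@punct{p}            % Pygments short name for punctuation tokens
\def\PYG#1#2{%
  \myOrigPYG{#1}{#2}%       % typeset the token as before
  \def\my@tmp{#1}%
  \ifx\my@tmp\my@punct      % token type is exactly "p"?
    \allowbreak             % then permit a line break after it
  \fi}
\makeatother
```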

Finally, #2343 should be improved to let the user customize whether the line break is preferably before or after a given character (such as the comma). Currently, this is rigidly implemented inside sphinx.sty. I can't handle this until about one week from now.
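
To illustrate the before/after distinction: \discretionary takes three arguments (pre-break, post-break, no-break), and the current macro puts the comma in the pre-break part, so the comma stays at the end of the broken line. A variant that breaks before the comma would move it to the post-break part. The macro name \mybreakbeforecomma below is hypothetical and only mirrors the structure of \sphinxbreaksatpunct; it assumes \sphinxafterbreak is defined as in sphinx.sty.

```latex
% Hypothetical variant: allow a break BEFORE the comma instead of after it.
\newcommand*\mybreakbeforecomma {%
   % pre-break: nothing; post-break: continuation mark, then the comma;
   % no-break: just the comma
   \lccode`\~`\,\lowercase{\def~}{\discretionary{}{\sphinxafterbreak\char`\,}{\char`\,}}%
   \catcode`\,\active
   \lccode`\~`\~
}
```

A user-facing option would then only need to choose between the two forms of the \discretionary for each character.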