cmhughes / latexindent.pl

Perl script to add indentation (leading horizontal space) to LaTeX files. It can modify line breaks before, during and after code blocks; it can perform text wrapping and paragraph line break removal. It can also perform string-based and regex-based substitutions/replacements. The script is customisable through its YAML interface.
GNU General Public License v3.0

Improving text wrap #103

Closed zoehneto closed 6 years ago

zoehneto commented 6 years ago

If I format a file where most lines are longer than the limit I specify with modifyLineBreaks:textWrapOptions:columns:80, I end up with many lines cut to 80 characters, each followed by a line of about 20 to 30 characters. This hinders readability and makes the formatting quite ugly. Would it be possible to combine textWrap with modifyLineBreaks:removeParagraphLineBreaks, by first executing removeParagraphLineBreaks and then cutting the result into lines of, for example, 80 characters? That would lead to a more even text block.

cmhughes commented 6 years ago

Many thanks for this.

To help me understand this completely, could you provide the following:

Once you've provided these, I'll be better able to assess :)

zoehneto commented 6 years ago

Let's say I have the following text (length of shortest line is 96 characters):

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna. Vivamus blandit turpis
in erat malesuada lacinia. Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis. 
Vestibulum vehicula massa libero. Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue
tortor, nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui,
a malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut sollicitudin nulla ultricies et.

I then format with latexindent test.tex -m -y="modifyLineBreaks:textWrapOptions:columns:80"

This results in the following, unbalanced, text:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna.
Vivamus blandit turpis
in erat malesuada lacinia. Maecenas nisi sem, vestibulum ac arcu quis, gravida
finibus turpis.
Vestibulum vehicula massa libero. Aliquam vulputate, metus sed ultrices
consequat, nulla ligula congue
tortor, nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt
vulputate, tellus velit pharetra dui,
a malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut
sollicitudin nulla ultricies et.

My goal would be something like this:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna.
Vivamus blandit turpis in erat malesuada lacinia. Maecenas nisi sem, vestibulum
ac arcu quis, gravida finibus turpis. Vestibulum vehicula massa libero. Aliquam
vulputate, metus sed ultrices consequat, nulla ligula congue tortor, nec 
dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate, 
tellus velit pharetra dui, a malesuada turpis lectus vel nunc. Donec suscipit
velit mauris, ut sollicitudin nulla ultricies et.

I tried to use latexindent test.tex -m -y="modifyLineBreaks:removeParagraphLineBreaks:all:1, modifyLineBreaks:textWrapOptions:columns:80" (the idea being it would first remove the paragraph line breaks and then cut the lines to 80 characters) but it just ignored the text wrap setting.

zoehneto commented 6 years ago

I can basically get the desired result today by first formatting with latexindent test.tex -m -y="modifyLineBreaks:removeParagraphLineBreaks:all:1" and then formatting in a second pass with latexindent test.tex -m -y="modifyLineBreaks:textWrapOptions:columns:80" but it would be preferable to be able to set both settings and only format once.
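
The difference between the two behaviours can be reproduced outside latexindent.pl. The following is a minimal Python sketch using the standard `textwrap` module, not latexindent.pl's own wrapping routine; the function names `wrap_each_line` and `join_then_wrap` are made up for illustration. It wraps the first two lines of the example above once line by line, and once after joining them into a single paragraph.

```python
import textwrap

# The first two over-long lines of the example in this thread.
lines = [
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna. Vivamus blandit turpis",
    "in erat malesuada lacinia. Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis.",
]


def wrap_each_line(lines, columns):
    """Wrap every input line on its own; long lines leave short leftovers."""
    out = []
    for line in lines:
        out.extend(textwrap.wrap(line, width=columns))
    return out


def join_then_wrap(lines, columns):
    """Remove the line breaks inside the paragraph first, then wrap once."""
    paragraph = " ".join(line.strip() for line in lines)
    return textwrap.wrap(paragraph, width=columns)


per_line = wrap_each_line(lines, 80)   # ragged: long line, short leftover, ...
joined = join_then_wrap(lines, 80)     # evenly filled lines

for line in per_line:
    print(line)
print("---")
for line in joined:
    print(line)
```

Wrapping each line separately gives four lines with short leftovers such as "Vivamus blandit turpis", whereas joining first gives three evenly filled lines, which is the effect the two-pass workaround achieves.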

cmhughes commented 6 years ago

I'm glad you've got a solution. I'm happy to explore this, but I can't guarantee that it'll be possible. I'm envisaging a per-code-block set of switches for textWrap, analogous to the existing removeParagraphLineBreaks, with some kind of priority setting to instruct latexindent.pl in which order to approach them. Is that along the lines you're thinking?

zoehneto commented 6 years ago

For my use case at least, I wouldn't need per-code-block settings; it should be enough to change the processing order in Document.pm to first execute find_objects in process_body_of_text and then max_char_per_line. I tried making the changes myself, but it led to a deadlock on the maxLineChars tests and I don't have the Perl knowledge to debug it.

cmhughes commented 6 years ago

removeParagraphLineBreaks operates on a per-code-block basis, and that can't be changed. textWrap operates before most code blocks have been found. If you want to change the ordering of these things, it'll be necessary to change textWrap so that it is done on a per-code-block basis. As I say, I'm happy to explore this.

zoehneto commented 6 years ago

Ok, in that case, thanks for looking into it :)

I'm not sure whether such a feature would actually make sense, but would converting textWrap to a per-block switch mean that one could set different line lengths for different block types (just like removeParagraphLineBreaks can have different settings for different block types)?

cmhughes commented 6 years ago

That could be a possibility :) I'll update this thread with progress, although I've got no idea when that might be.

dzklaim commented 6 years ago

Thank you for your work. I am very interested in this. Right now I'm using the same method that zoehneto described. By the way, is it possible to disable the script inside math delimiters like \[, \]? The formulas and expressions that I type regularly are impossible to align properly. I would like to disable the script completely on these blocks (no wrapping, aligning, or paragraph line break removal).

cmhughes commented 6 years ago

Could you tell latexindent to treat them as verbatim blocks? Or wrap them in a noindent block?

dzklaim commented 6 years ago

How do I add \[ and \] to the verbatim commands? I tried to add specialBeginEnd: 1 to the verbatim commands and verbatim environments, but it did not work. Right now I'm using sed to comment out what I want to be ignored.

cmhughes commented 6 years ago

I don't think that's possible at the moment. See noIndentBlock in http://latexindentpl.readthedocs.io/en/latest/sec-default-user-local.html

cmhughes commented 6 years ago

@dzklaim if you'd like to open a new issue for your question, feel free!

dbmrq commented 6 years ago

I just want to point out an edge case that @zoehneto's double pass approach doesn't fix.

If you start with this file:

mwe.tex

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy
eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam
voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita
kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
\begin{itemize}
  \item Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
    eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
    minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex
    ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate
    velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
    cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id
    est laborum.
  \item Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
    eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
    minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex
    ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate
    velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
    cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id
    est laborum.
\end{itemize}
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy
eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam
voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita
kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.

And run the two passes he mentioned here:

latexindent mwe -o=+-mod1.tex -m -y="modifyLineBreaks:removeParagraphLineBreaks:all:1"
latexindent mwe-mod1.tex -o=mwe-mod2.tex -m -y="modifyLineBreaks:textWrapOptions:columns:80"

You'll still end up with text that is longer than the columns value:

mwe-mod2.tex

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
\begin{itemize}
    \item Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
          eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim
          veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
          consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
          cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
          proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
    \item Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
          eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim
          veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo
          consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse
          cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non
          proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
\end{itemize}
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.

That's expected, since, as it says in the docs:

indentation is performed after the text wrapping routine; as such, indented code will likely exceed any maximum value set in the columns field

But then my thinking is to run the second pass again, so the lines will be wrapped one more time after the indentation is applied:

latexindent mwe-mod2.tex -o=mwe-mod3.tex -m -y="modifyLineBreaks:textWrapOptions:columns:80"

But that brings us back to the problem that started this issue:

mwe-mod3.tex

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
\begin{itemize}
    \item Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
          eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
          enim ad minim
          veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip
          ex ea commodo
          consequat. Duis aute irure dolor in reprehenderit in voluptate
          velit esse
          cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
          cupidatat non
          proident, sunt in culpa qui officia deserunt mollit anim id est
          laborum.
    \item Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
          eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut
          enim ad minim
          veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip
          ex ea commodo
          consequat. Duis aute irure dolor in reprehenderit in voluptate
          velit esse
          cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
          cupidatat non
          proident, sunt in culpa qui officia deserunt mollit anim id est
          laborum.
\end{itemize}
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod
tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At
vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd
gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.

I don't know if the per-code-block approach would also allow changing the order of wrapping and indentation, or if there's any other way to deal with this, but I thought I'd point it out as something worth considering.
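
The interaction between wrapping and indentation described above can be shown with a small Python sketch. It uses the standard `textwrap` module and is not latexindent.pl's code; the ten-space indent is an assumption that stands in for the indentation added to the body of an \item. Wrapping first and indenting afterwards overshoots the column limit, whereas reserving room for the indentation while wrapping does not.

```python
import textwrap

text = (
    "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do "
    "eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad "
    "minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex "
    "ea commodo consequat."
)
indent = " " * 10   # assumed indentation of an \item body
columns = 80

# Wrap first, indent afterwards: the indentation pushes lines past `columns`.
wrap_then_indent = [indent + line for line in textwrap.wrap(text, width=columns)]

# Account for the indentation while wrapping: textwrap counts the indent
# strings towards `width`, so every line stays within `columns`.
indent_aware = textwrap.wrap(
    text, width=columns, initial_indent=indent, subsequent_indent=indent
)

print(max(len(line) for line in wrap_then_indent))
print(max(len(line) for line in indent_aware))
```

Whether latexindent.pl can take the indentation into account in this way depends on the order in which it performs wrapping and indentation, which is the open question raised in this comment.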

dbmrq commented 6 years ago

By the way, I don't know any Perl, but this might be of interest, instead of all the multiple passes:

http://search.cpan.org/~mward/Text-Reflow-1.09/Reflow.pm

cmhughes commented 6 years ago

Thanks for this, it's helpful to have these kinds of test cases.

Krzmbrzl commented 6 years ago

Really hyped for the text wrapping on a per-code-block basis here :+1:

cmhughes commented 6 years ago

Hi everyone, development of this feature is taking place on https://github.com/cmhughes/latexindent.pl/tree/feature/textwrap-improvement. If you've commented on this issue, please grab a copy of this branch.

In particular, you'll note that within test-cases/maxLineChars you'll find

zoehneto.tex

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna. Vivamus blandit turpis
in erat malesuada lacinia. Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis. 
Vestibulum vehicula massa libero. Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue
tortor, nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui,
a malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut sollicitudin nulla ultricies et.

and

zoehneto1.yaml

modifyLineBreaks:
    textWrapOptions:
        columns: 80
        perCodeBlockBasis: 1
        all: 1
    removeParagraphLineBreaks:
        all: 1
        beforeTextWrap: 1

Upon running the following command,

latexindent.pl -m zoehneto -l=zoehneto1.yaml -o=+-mod1

then you obtain the following output

zoehneto-mod1.tex

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna.
Vivamus blandit turpis in erat malesuada lacinia. Maecenas nisi sem, vestibulum
ac arcu quis, gravida finibus turpis.  Vestibulum vehicula massa libero.
Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue tortor,
nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate,
tellus velit pharetra dui, a malesuada turpis lectus vel nunc. Donec suscipit
velit mauris, ut sollicitudin nulla ultricies et.

A lot more options are possible; for example, you can specify columns on a per-code-block basis with the following settings:

zoehneto4.yaml

modifyLineBreaks:
    textWrapOptions:
        columns: 
            default: 80
            environments: 40
        perCodeBlockBasis: 1
        environments: 
            another: 1
    removeParagraphLineBreaks:
        all: 1
        beforeTextWrap: 1

then text will be wrapped at 40 columns for environment objects named another. So, with the above settings, and the following file

zoehneto2.tex

\begin{something}
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna. Vivamus blandit turpis
in erat malesuada lacinia. Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis. 
Vestibulum vehicula massa libero. Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue
tortor, nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui,
a malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut sollicitudin nulla ultricies et.
\end{something}
\begin{another}
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna. Vivamus blandit turpis
in erat malesuada lacinia. Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis. 
Vestibulum vehicula massa libero. Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue
tortor, nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui,
a malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut sollicitudin nulla ultricies et.
\end{another}

and running

latexindent.pl -m  zoehneto2.tex -l=zoehneto4.yaml -o=+-mod4

then the following output is obtained:

\begin{something}
    Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna. Vivamus blandit turpis in erat malesuada lacinia. Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis.  Vestibulum vehicula massa libero. Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue tortor, nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui, a malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut sollicitudin nulla ultricies et.
\end{something}
\begin{another}
    Lorem ipsum dolor sit amet, consectetur
    adipiscing elit. Donec ac elit magna.
    Vivamus blandit turpis in erat
    malesuada lacinia. Maecenas nisi sem,
    vestibulum ac arcu quis, gravida
    finibus turpis.  Vestibulum vehicula
    massa libero. Aliquam vulputate, metus
    sed ultrices consequat, nulla ligula
    congue tortor, nec dapibus erat tortor
    in erat. Nam accumsan, mauris vel
    tincidunt vulputate, tellus velit
    pharetra dui, a malesuada turpis lectus
    vel nunc. Donec suscipit velit mauris,
    ut sollicitudin nulla ultricies et.
\end{another}
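
One way to read the settings in zoehneto4.yaml together with the output above is as a lookup: a code block is wrapped only if its name is switched on under its block type, and its width comes from the matching entry in columns, falling back to default. The Python sketch below models that reading. It is a hypothetical illustration, not latexindent.pl's implementation; the function name `columns_for` and the dict layout are assumptions, and it covers only the nested form of `columns` used in this example.

```python
# Settings mirroring zoehneto4.yaml, written as a plain Python dict.
text_wrap_options = {
    "columns": {"default": 80, "environments": 40},
    "perCodeBlockBasis": 1,
    "environments": {"another": 1},
}


def columns_for(block_type, name, options):
    """Return the wrap width for a code block, or None if it is not wrapped.

    Hypothetical lookup that models the example in this thread; it is not
    taken from latexindent.pl.
    """
    switches = options.get(block_type, {})
    if not switches.get(name, 0):
        return None                      # block not switched on: no wrapping
    columns = options["columns"]
    return columns.get(block_type, columns["default"])


print(columns_for("environments", "another", text_wrap_options))
print(columns_for("environments", "something", text_wrap_options))
```

Under this reading the another environment is wrapped at 40 columns and the something environment is left unwrapped, which matches the output shown above.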

Please clone the above branch, and test it. Please let me know if this is consistent with what is hoped from this feature request.

Many thanks, Chris

zoehneto commented 6 years ago

Thanks for working on this :)

I have a couple issues with the current implementation:

cmhughes commented 6 years ago

@zoehneto thanks for the follow-up, and for your time in testing this.

Can you provide a small .tex file that demonstrates the problems you describe? It would be great if you could provide:

Let me know, and I'll continue to work on this :)

cmhughes commented 6 years ago

@dbmrq regarding your test case, as of https://github.com/cmhughes/latexindent.pl/commit/d19a3858fb9feb7abd5eb7ed5b25bb222ec6ca8e you should see that if you start with your original file:

dbmrq.tex

Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy
eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam
voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita
kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
\begin{itemize}
  \item Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
    eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
    minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex
    ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate
    velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
    cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id
    est laborum.
  \item Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
    eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
    minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex
    ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate
    velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
    cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id
    est laborum.
  \item Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do
    eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad
    minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex
    ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate
    velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
    cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id
    est laborum.
\end{itemize}
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy
eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam
voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita
kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.

and the YAML file

dbmrq1.yaml

modifyLineBreaks:
    textWrapOptions:
        columns: 
            items: 20
            masterDocument: 40
        perCodeBlockBasis: 1
        items: 1
        masterDocument: 1

and then run

latexindent.pl -m -l=dbmrq1.yaml dbmrq.tex

then you should receive the output below

dbmrq-mod1.tex

Lorem ipsum dolor sit amet, consetetur
sadipscing elitr, sed diam nonumy
eirmod tempor invidunt ut labore et
dolore magna aliquyam erat, sed diam
voluptua. At vero eos et accusam et
justo duo dolores et ea rebum. Stet
clita
kasd gubergren, no sea takimata sanctus
est Lorem ipsum dolor sit amet.
\begin{itemize}
    \item Lorem ipsum dolor
          sit amet,
          consectetur
          adipisicing elit,
          sed do
          eiusmod tempor
          incididunt ut
          labore et dolore
          magna aliqua. Ut
          enim ad
          minim veniam, quis
          nostrud
          exercitation
          ullamco laboris
          nisi ut aliquip ex
          ea commodo
          consequat. Duis
          aute irure dolor in
          reprehenderit in
          voluptate
          velit esse cillum
          dolore eu fugiat
          nulla pariatur.
          Excepteur sint
          occaecat
          cupidatat non
          proident, sunt in
          culpa qui officia
          deserunt mollit
          anim id
          est laborum.
    \item Lorem ipsum dolor
          sit amet,
          consectetur
          adipisicing elit,
          sed do
          eiusmod tempor
          incididunt ut
          labore et dolore
          magna aliqua. Ut
          enim ad
          minim veniam, quis
          nostrud
          exercitation
          ullamco laboris
          nisi ut aliquip ex
          ea commodo
          consequat. Duis
          aute irure dolor in
          reprehenderit in
          voluptate
          velit esse cillum
          dolore eu fugiat
          nulla pariatur.
          Excepteur sint
          occaecat
          cupidatat non
          proident, sunt in
          culpa qui officia
          deserunt mollit
          anim id
          est laborum.
    \item Lorem ipsum dolor
          sit amet,
          consectetur
          adipisicing elit,
          sed do
          eiusmod tempor
          incididunt ut
          labore et dolore
          magna aliqua. Ut
          enim ad
          minim veniam, quis
          nostrud
          exercitation
          ullamco laboris
          nisi ut aliquip ex
          ea commodo
          consequat. Duis
          aute irure dolor in
          reprehenderit in
          voluptate
          velit esse cillum
          dolore eu fugiat
          nulla pariatur.
          Excepteur sint
          occaecat
          cupidatat non
          proident, sunt in
          culpa qui officia
          deserunt mollit
          anim id
          est laborum.
\end{itemize}
Lorem ipsum dolor sit amet, consetetur
sadipscing elitr, sed diam nonumy
eirmod tempor invidunt ut labore et
dolore magna aliquyam erat, sed diam
voluptua. At vero eos et accusam et
justo duo dolores et ea rebum. Stet
clita
kasd gubergren, no sea takimata sanctus
est Lorem ipsum dolor sit amet.

Note, in particular, that I've specified columns on a per-code-block basis.

Is this consistent with what you would like?

zoehneto commented 6 years ago

Test case

A simple example:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna. Vivamus blandit turpis
in erat malesuada lacinia. Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis. 
Vestibulum vehicula massa libero. Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue
tortor, nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui,
a malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut sollicitudin nulla ultricies et.

\begin{itemize}
\item Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna. Vivamus blandit turpis
in erat malesuada lacinia. 
\item Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis. 
Vestibulum vehicula massa libero. Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue
tortor, nec dapibus erat tortor in erat. 
\item Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui,
a malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut sollicitudin nulla ultricies et.
\end{itemize} 

My goal is to get the following output:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna.
Vivamus blandit turpis in erat malesuada lacinia. Maecenas nisi sem, vestibulum
ac arcu quis, gravida finibus turpis.  Vestibulum vehicula massa libero.
Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue tortor,
nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate,
tellus velit pharetra dui, a malesuada turpis lectus vel nunc. Donec suscipit
velit mauris, ut sollicitudin nulla ultricies et.

\begin{itemize}
    \item Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna.
            Vivamus blandit turpis in erat malesuada lacinia.  
    \item Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis. 
        Vestibulum vehicula massa libero. Aliquam vulputate, metus sed ultrices
        consequat, nulla ligula congue tortor, nec dapibus   erat tortor in erat.  
    \item Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui
        , a malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut
        sollicitudin nulla ultricies et.
\end{itemize}

Simple configuration

Yaml:

modifyLineBreaks:
    textWrapOptions:
        columns: 80
        perCodeBlockBasis: 1
        all: 1
    removeParagraphLineBreaks:
        all: 1
        beforeTextWrap: 1  

Result (note that not only is the formatting for itemize wrong and doesn't adhere to the configured length, but it also mixes tabs and spaces in the indentation):

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna.
Vivamus blandit turpis in erat malesuada lacinia. Maecenas nisi sem, vestibulum
ac arcu quis, gravida finibus turpis.  Vestibulum vehicula massa libero.
Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue tortor,
nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate,
tellus velit pharetra dui, a malesuada turpis lectus vel nunc. Donec suscipit
velit mauris, ut sollicitudin nulla ultricies et.

\begin{itemize}
    \item Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna.
          Vivamus blandit turpis in erat malesuada lacinia.  \item Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis.  Vestibulum
          vehicula massa libero. Aliquam vulputate, metus sed ultrices consequat, nulla
          ligula congue tortor, nec dapibus erat tortor in erat.    \item Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui, a
          malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut sollicitudin
          nulla ultricies et.
\end{itemize}

Complex configuration

Yaml:

 modifyLineBreaks:
    textWrapOptions:
        columns: 
            default: 80
            environments: 80
        perCodeBlockBasis: 1
        environments: 
            itemize: 1
    removeParagraphLineBreaks:
        all: 1
        beforeTextWrap: 1

Result (it seems that now only removeParagraphLineBreaks is applied):

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna. Vivamus blandit turpis in erat malesuada lacinia. Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis.  Vestibulum vehicula massa libero. Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue tortor, nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui, a malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut sollicitudin nulla ultricies et.

\begin{itemize}
    \item Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna. Vivamus blandit turpis in erat malesuada lacinia.  \item Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis.  Vestibulum vehicula massa libero. Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue tortor, nec dapibus erat tortor in erat.  \item Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui, a malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut sollicitudin nulla ultricies et.
\end{itemize}

As a side note: if I understand the new configuration correctly, it only allows selecting which environments should be wrapped and specifying one line length for all of them. Would it be possible to make this configurable per environment (e.g. I might want itemize to be the same width as normal text but at the same time tables should be bigger so I can avoid line wrapping)?

cmhughes commented 6 years ago

@zoehneto thanks very much for the follow-up, and for providing such a clear test case, it's very helpful.

Starting with your initial file (titled Test Case) and using the following yaml

zoehneto-config1.yaml

modifyLineBreaks:
    textWrapOptions:
        columns: 80
        perCodeBlockBasis: 1
        items: 1
        masterDocument: 1
    removeParagraphLineBreaks:
        items: 1
        masterDocument: 1
        beforeTextWrap: 1  

then I receive the following output

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna.
Vivamus blandit turpis in erat malesuada lacinia. Maecenas nisi sem, vestibulum
ac arcu quis, gravida finibus turpis.  Vestibulum vehicula massa libero.
Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue tortor,
nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate,
tellus velit pharetra dui, a malesuada turpis lectus vel nunc. Donec suscipit
velit mauris, ut sollicitudin nulla ultricies et.

\begin{itemize}
    \item Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna.
          Vivamus blandit turpis in erat malesuada lacinia.
    \item Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis.  Vestibulum
          vehicula massa libero. Aliquam vulputate, metus sed ultrices consequat, nulla
          ligula congue tortor, nec dapibus erat tortor in erat.
    \item Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui, a
          malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut sollicitudin
          nulla ultricies et.
\end{itemize}

Is this as you would like?
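
The behaviour requested here (remove the paragraph's line breaks first, then wrap the result) can be sketched in a few lines of Python. This is only an illustration of the two-step idea, not latexindent's Perl implementation; the function name `reflow_paragraph` is hypothetical, and Python's `textwrap` may choose slightly different break points than the script does.

```python
import textwrap


def reflow_paragraph(paragraph: str, columns: int = 80) -> str:
    """Remove a paragraph's line breaks, then wrap it at `columns`.

    Illustrative only: this is not how latexindent.pl is implemented.
    """
    # Step 1 (the role of removeParagraphLineBreaks): join all non-empty lines.
    joined = " ".join(
        line.strip() for line in paragraph.splitlines() if line.strip()
    )
    # Step 2 (the role of textWrapOptions): wrap the single long line.
    return textwrap.fill(joined, width=columns)


sample = (
    "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna. Vivamus blandit turpis\n"
    "in erat malesuada lacinia. Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis.\n"
)

if __name__ == "__main__":
    print(reflow_paragraph(sample, columns=80))
```

Joining before wrapping is what avoids the "80 characters, then 20 to 30 characters" pattern: the wrapper sees one long line and fills every output line as far as the limit allows.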

To your follow-up question:

if I understand the new configuration correctly, it only allows selecting which environments should be wrapped and specifying one line length for all of them.

Yes, that is currently correct.

Would it be possible to make this configurable per environment (e.g. I might want itemize to be the same width as normal text but at the same time tables should be bigger so I can avoid line wrapping)?

I think this should be possible. How would the interface look? Would it be, for example, as follows:

modifyLineBreaks:
    textWrapOptions:
        columns: 
            default: 80
            environments:
                  default: 80
                  itemize: 40
                  tabular: 100
        perCodeBlockBasis: 1
        environments: 
            itemize: 1

zoehneto commented 6 years ago

This is incorrect, see other comment


After some experimentation it turns out I just have to remove the all: 1 from removeParagraphLineBreaks (the all: 1 in textWrapOptions is necessary or it won't work in real world documents):

modifyLineBreaks:
    textWrapOptions:
        columns: 80
        perCodeBlockBasis: 1
        all: 1
    removeParagraphLineBreaks:
        beforeTextWrap: 1

With this configuration I can also format real documents and the wrapping is correctly applied (except for tabu tables).

Regarding the environment configuration: your example looks great, that's what I had in mind :)

cmhughes commented 6 years ago

@zoehneto if you can pull from the above branch, you should see that as of https://github.com/cmhughes/latexindent.pl/commit/89d9e0b04cb9ca7aed9d37d042cf89db2335c33c you can now specify the number of columns on a per-name basis.

For example,

modifyLineBreaks:
    textWrapOptions:
        columns: 
            default: 10
            environments:
                default: 10000
                another: 30
        perCodeBlockBasis: 1
        all: 1
    removeParagraphLineBreaks:
        all: 1
        beforeTextWrap: 1

The priority ordering for the checking of the value for columns is as follows:

How does this look?

zoehneto commented 6 years ago

It currently doesn't work for me using the latest commit with the following yaml:

modifyLineBreaks:
    textWrapOptions:
        columns: 
            default: 80
            environments:
                default: 40
                itemize: 40
        perCodeBlockBasis: 1
        all: 1
    removeParagraphLineBreaks:
        all: 1
        beforeTextWrap: 1

It always uses the default, non environment value:

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna.
Vivamus blandit turpis in erat malesuada lacinia. Maecenas nisi sem, vestibulum
ac arcu quis, gravida finibus turpis.  Vestibulum vehicula massa libero.
Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue tortor,
nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate,
tellus velit pharetra dui, a malesuada turpis lectus vel nunc. Donec suscipit
velit mauris, ut sollicitudin nulla ultricies et.

\begin{itemize}
    \item Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna.
          Vivamus blandit turpis in erat malesuada lacinia.  \item
          Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis.  Vestibulum
          vehicula massa libero. Aliquam vulputate, metus sed ultrices consequat, nulla
          ligula congue tortor, nec dapibus erat tortor in erat.    \item
          Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui, a
          malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut sollicitudin
          nulla ultricies et.
\end{itemize}

zoehneto commented 6 years ago

I also had another look at the simple scenario and it turns out my solution was obviously wrong, because it doesn't remove any line breaks. The best I can come up with (and that is still a hack), is to enable removeParagraphLineBreaks for everything but the itemize environment. Since currently there is no 'all but one' configuration option, you end up with something like this:

modifyLineBreaks:
    textWrapOptions:
        columns: 80
        perCodeBlockBasis: 1
        all: 1
    removeParagraphLineBreaks:
        alignAtAmpersandTakesPriority: 1
        environments:
            itemize: 0
            enumerate: 0
            abstract: 1
            table: 1
            figure: 1
            ...
            every other latex environment known to man
            ...
        ifElseFi: 1
        optionalArguments: 1
        mandatoryArguments: 1
        items: 1
        specialBeginEnd: 1
        afterHeading: 1
        filecontents: 1
        masterDocument: 1
        beforeTextWrap: 1

dbmrq commented 6 years ago

@cmhughes As for my example, this is definitely a huge improvement. Ideally I wouldn't have to specify the width for the items, the whole text would be limited to the masterDocument value regardless. So if masterDocument is 40 and an item begins at column 10, its width should be 30, but if it begins at column 25, the width should be only 15. Usually the items all begin on the same column though, so it's no big deal to have the width hard coded like this, you don't have to change it on my account. I'm only mentioning it as a suggestion in case you're interested. Thanks!
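
The arithmetic in this suggestion can be sketched as follows. It is a minimal illustration in Python, not a feature of latexindent: the helper names `available_width` and `wrap_indented` are hypothetical, and `start_column` is assumed to mean the number of characters that precede the text on each line.

```python
import textwrap


def available_width(total_columns: int, start_column: int) -> int:
    """Width left for text that starts `start_column` characters into the line."""
    width = total_columns - start_column
    if width <= 0:
        raise ValueError("text starts at or beyond the column limit")
    return width


def wrap_indented(text: str, total_columns: int, start_column: int) -> str:
    """Wrap `text` so that no line, indentation included, exceeds `total_columns`."""
    available_width(total_columns, start_column)  # validates the arguments
    indent = " " * start_column
    # textwrap counts the indent as part of `width`, so passing the total
    # limit leaves exactly total_columns - start_column characters for text.
    return textwrap.fill(
        text,
        width=total_columns,
        initial_indent=indent,
        subsequent_indent=indent,
    )


if __name__ == "__main__":
    print(available_width(40, 10))
    print(available_width(40, 25))
    print(wrap_indented("Lorem ipsum dolor sit amet, consectetur adipiscing elit.", 40, 25))
```

With a limit of 40, text starting at column 10 gets 30 characters and text starting at column 25 gets 15, matching the numbers above.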

cmhughes commented 6 years ago

@zoehneto thanks for this.

Since currently there is no 'all but one' configuration option

This makes me wonder... let's say that you start with something like the following:

removeParagraphLineBreaks:
        all: 1
        environments: 
            quotation: 0

The current behaviour would be to ignore any of the specific settings, because all: 1.

Would you like the behaviour of the script to be changed so that, even if all: 1, when it finds an 'off' switch such as quotation: 0 it turns the routine off for that particular <thing> or <per-name thing>?

zoehneto commented 6 years ago

@cmhughes exactly, because then I could get my desired behavior like this:

modifyLineBreaks:
    textWrapOptions:
        columns: 80
        perCodeBlockBasis: 1
        all: 1
    removeParagraphLineBreaks:
        all: 1
        environments:
            itemize: 0
        beforeTextWrap: 1

Now obviously this is just a workaround: it means I don't get reflow for itemize, whereas my current workaround with two formatting passes does reflow itemize while also keeping nice \item formatting (though it sometimes needs multiple runs to achieve this). But if everything else works, I could live with that.

cmhughes commented 6 years ago

Ok, many thanks, that's clear and gives me a clear way forward. I'll post back when I have progress. Delays at my end are because things are manic at work, my apologies.

cmhughes commented 6 years ago

@zoehneto as of https://github.com/cmhughes/latexindent.pl/commit/ab136adc1529806abf8ab0c910879b4142ed0ef2 both textWrapOptions and removeParagraphLineBreaks allow an exceptionsToAll field.

This means that, for example, starting with

zoehneto17.yaml

modifyLineBreaks:
    textWrapOptions:
        columns: 80
        perCodeBlockBasis: 1
        all: 1
    removeParagraphLineBreaks:
        all: 1
        exceptionsToAll:
            environments:
                itemize: 0
        beforeTextWrap: 1

then the output from the command

latexindent.pl -m -l=zoehneto17.yaml zoehneto3.tex

gives

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna.
Vivamus blandit turpis in erat malesuada lacinia. Maecenas nisi sem, vestibulum
ac arcu quis, gravida finibus turpis.  Vestibulum vehicula massa libero.
Aliquam vulputate, metus sed ultrices consequat, nulla ligula congue tortor,
nec dapibus erat tortor in erat. Nam accumsan, mauris vel tincidunt vulputate,
tellus velit pharetra dui, a malesuada turpis lectus vel nunc. Donec suscipit
velit mauris, ut sollicitudin nulla ultricies et.

\begin{itemize}
    \item Lorem ipsum dolor sit amet, consectetur adipiscing elit. Donec ac elit magna.
          Vivamus blandit turpis in erat malesuada lacinia.
    \item Maecenas nisi sem, vestibulum ac arcu quis, gravida finibus turpis.  Vestibulum
          vehicula massa libero. Aliquam vulputate, metus sed ultrices consequat, nulla
          ligula congue tortor, nec dapibus erat tortor in erat.
    \item Nam accumsan, mauris vel tincidunt vulputate, tellus velit pharetra dui, a
          malesuada turpis lectus vel nunc. Donec suscipit velit mauris, ut sollicitudin
          nulla ultricies et.
\end{itemize}

For reference, you can also specify exceptionsToAll on a per-code-block type, e.g.

modifyLineBreaks:
    removeParagraphLineBreaks:
        all: 1
        exceptionsToAll:
            environments: 0
        beforeTextWrap: 1

in which case, all code blocks except for environments will receive the removeParagraphLineBreaks routine.

How does this look?

cmhughes commented 6 years ago

I've been thinking about the interface to this a lot. Perhaps the following would be better

modifyLineBreaks:
    removeParagraphLineBreaks:
        all: 
            except:
                 environments: 1
        beforeTextWrap: 

zoehneto commented 6 years ago

The commit works well, I now have the following yaml for the simple case:

modifyLineBreaks:
    textWrapOptions:
        columns: 80
        perCodeBlockBasis: 1
        all: 1
    removeParagraphLineBreaks:
        all: 1
        exceptionsToAll:
            environments:
                itemize: 0
        beforeTextWrap: 1

I'm not sure about the interface. On one hand your second example seems nicer from a semantic perspective, on the other hand it adds one more level of nesting for something which is already quite deeply nested.

cmhughes commented 6 years ago

That's great, many thanks.

It sounds like this improvement now works as desired. I've got some more testing to do, and the interface question needs investigating. Your point about nesting is well taken.

I'll continue to work on these points. In the meantime, do let me know if you have any other test cases.

cmhughes commented 6 years ago

important

I need a way to make the preamble exempt from text wrapping. Can't remember if this is already a thing. Noting it here for my reference.

cmhughes commented 6 years ago

@zoehneto as of https://github.com/cmhughes/latexindent.pl/commit/799589af51bdcb1ed6d234b5b671cffe5a6511c6 I've tweaked the interface so that, with reference to your most recent YAML, you would use

modifyLineBreaks:
    textWrapOptions:
        columns: 80
        perCodeBlockBasis: 1
        all: 1
    removeParagraphLineBreaks:
        all:
          except:
            - 'itemize'
        beforeTextWrap: 1

The except field can take either the per-name basis of the <thing> or otherwise the type of <thing>, for example

modifyLineBreaks:
    textWrapOptions:
        columns: 80
        perCodeBlockBasis: 1
        all: 1
    removeParagraphLineBreaks:
        all:
          except:
            - 'environments'
        beforeTextWrap: 1

This seems a lot easier to follow than my previous notation (I was consistently confused about what 0 or 1 meant previously).
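
To make the semantics of the except field concrete, here is a minimal sketch in Python. It only models the rule described above (an entry in except may be either a block name or a block type) and is not the script's Perl code; the function name `routine_applies` and its arguments are hypothetical, and the case where all is unset and the per-type switches take over is deliberately not modelled.

```python
def routine_applies(all_setting, block_type: str, block_name: str) -> bool:
    """Decide whether a routine governed by `all` runs on one code block.

    `all_setting` mirrors the YAML value of `all`: either 0/1, or a
    mapping with an `except` list of block names and/or block types.
    """
    if isinstance(all_setting, dict):
        exceptions = all_setting.get("except") or []
        # A block is skipped if either its name or its type is listed.
        return block_name not in exceptions and block_type not in exceptions
    return bool(all_setting)


if __name__ == "__main__":
    by_name = {"except": ["itemize"]}
    by_type = {"except": ["environments"]}
    print(routine_applies(by_name, "environments", "itemize"))
    print(routine_applies(by_name, "environments", "enumerate"))
    print(routine_applies(by_type, "environments", "enumerate"))
```

Listing names rather than 0/1 switches means each entry can only mean "skip this", which is the ambiguity the new notation removes.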

Can you take this for another test run? I'll continue to test, and get it documented, but I think this is pretty much ready to get merged, and in turn, will be part of V3.5 to be released soon (hopefully).

zoehneto commented 6 years ago

I just tested the latest version and it seems that everything works as intended.

cmhughes commented 6 years ago

Great, many thanks. I'll keep this thread updated.

cmhughes commented 6 years ago

For my notes, documentation must include:

cmhughes commented 6 years ago

As of https://github.com/cmhughes/latexindent.pl/pull/126, this is part of the develop branch and will be part of the next release, V3.5.

I want to see if I can resolve any more issues (particularly https://github.com/cmhughes/latexindent.pl/issues/125) before releasing.

@dbmrq your point makes sense, but it's not possible for the general case at this point because of the way that the script finds code blocks. The reason is actually exactly the same that causes https://github.com/cmhughes/latexindent.pl/issues/85. I spend a lot of time thinking about this, but haven't so far been able to find a way round it. Please understand that I'm not dismissing your proposal, it's very sensible, but at this point I can't see a way to implement it. I will continue to study it -- perhaps work upon https://github.com/cmhughes/latexindent.pl/issues/85 will help.

@zoehneto Thanks again for your help in testing and shaping this improvement, I have credited you in the documentation.

dbmrq commented 6 years ago

@cmhughes No problem, latexindent is already great. Thank you for looking into this. :)

cmhughes commented 6 years ago

Resolved as of https://github.com/cmhughes/latexindent.pl/pull/127; upload to CTAN is imminent.