cmhughes / latexindent.pl

Perl script to add indentation (leading horizontal space) to LaTeX files. It can modify line breaks before, during and after code blocks; it can perform text wrapping and paragraph line break removal. It can also perform string-based and regex-based substitutions/replacements. The script is customisable through its YAML interface.
GNU General Public License v3.0

textWrap: overhaul #346

Closed cmhughes closed 2 years ago

cmhughes commented 2 years ago

background

The textWrap routine is a jumbled mess: it attempts to do its work both before the code blocks have been found and again while they are being found. There are several issues associated with it (https://github.com/cmhughes/latexindent.pl/issues/337, https://github.com/cmhughes/latexindent.pl/issues/341, https://github.com/cmhughes/latexindent.pl/issues/344).

good things about the oneSentencePerLine routine

The oneSentencePerLine routine does its job well, and relatively simply. It does its work after verbatim code blocks have been found, but before any other code blocks are found. I think text wrapping needs to take the same approach.

possible interface

The current YAML interface needs overhauling; I think something like the following would be a good starting point:

textWrapOptions:
        columns: 0
        huge: overflow    # forbid mid-word line breaks
        separator: ""
        multipleSpacesToSingle: 1            
        blocksFollow:
           par: 1
           blankLine: 1
           verbatim: 1
           commentOnPreviousLine: 1
           other: 0                           # regex
        blocksBeginWith:
           A-Z: 1
           a-z: 0
           0-9: 0
           other: 0                           # regex
        blocksEndBefore:
           verbatim: 1
           commentOnOwnLine: 1
           other: '\\begin\{|\\\[|\\end\{'    # regex

This would remove the need for removeParagraphLineBreaks, and would not happen on a per-code-block basis. It would be done immediately before or after the oneSentencePerLine routine.

testing

In addition to the 1000s of test cases that I use before each commit, I need to ensure that https://github.com/cmhughes/latexindent.pl/issues/337, https://github.com/cmhughes/latexindent.pl/issues/341, https://github.com/cmhughes/latexindent.pl/issues/344 are resolved.

time frame

Development and testing are slow at the moment. I'm not sure how or when I'll get to this, but the above ideas are helpful to me.

cmhughes commented 2 years ago

JSON schema to be updated

cmhughes commented 2 years ago

For anyone following this issue, I am progressing it. It's a massive piece of work, so is taking a long time.

Commits should start to flow soon, hopefully.

I'm looking forward to text wrapping being more straightforward and robust.

cmhughes commented 2 years ago

        multipleSpacesToSingle: 1

for one sentence per line as well.

cmhughes commented 2 years ago

As of https://github.com/cmhughes/latexindent.pl/commit/462ed6d5a710f14b9108494df43fda37cfda86ac this overhaul is implemented and fully documented.

new interface

modifyLineBreaks:
    textWrapOptions:
        columns: 0
        multipleSpacesToSingle: 1            
        blocksFollow:
           headings: 1
           commentOnPreviousLine: 1
           par: 1
           blankLine: 1
           verbatim: 1
           filecontents: 1
           other: '\\\]|\\item(?:\h|\[)'      # regex
        blocksBeginWith:
           A-Z: 1
           a-z: 1
           0-9: 0
           other: 0                           # regex
        blocksEndBefore:
           commentOnOwnLine: 1
           verbatim: 1
           filecontents: 1
           other: '\\begin\{|\\\[|\\end\{'    # regex
        huge: overflow                        # forbid mid-word line breaks
        separator: ""

See the documentation for full details and examples.
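As a rough illustration of how the interface above might be used, here is a minimal sketch of a user settings file. The file name `mysettings.yaml` is hypothetical. The sketch assumes that any key left out keeps the default value listed above, and that a `columns` value of 0 leaves text unwrapped.

```yaml
# mysettings.yaml (hypothetical file name)
# Only the keys that differ from the defaults listed above are given;
# unspecified keys are assumed to keep their default values.
modifyLineBreaks:
    textWrapOptions:
        columns: 80          # wrap text at 80 columns (0 is assumed to mean no wrapping)
        blocksBeginWith:
           0-9: 1            # also allow a text block to begin with a digit
```

Such a file would typically be loaded with something like `latexindent.pl -m -l mysettings.yaml myfile.tex`, where the `-m` switch activates the modifyLineBreaks settings. Check the documentation for the exact behaviour of each option.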

cmhughes commented 2 years ago

Implemented as of https://github.com/cmhughes/latexindent.pl/releases/tag/V3.16