cmhughes / latexindent.pl

Perl script to add indentation (leading horizontal space) to LaTeX files. It can modify line breaks before, during and after code blocks; it can perform text wrapping and paragraph line break removal. It can also perform string-based and regex-based substitutions/replacements. The script is customisable through its YAML interface.
GNU General Public License v3.0

Contextual `sentencesBeginWith`, semicolons, lowercase and `\paragraph` #527

Closed: remisphere closed this issue 3 months ago

remisphere commented 5 months ago

original .tex code

\paragraph{latexindent.pl}
latexindent.pl is a Perl script that indents .tex (and other) files according to an indentation
scheme that the user can modify to suit their taste. Environments, including those with alignment
delimiters (such as tabular), and commands, including those that can split braces and
brackets across lines, are usually handled correctly by the script. Options for verbatim-like environments
and commands, together with indentation after headings (such as chapter, section,
etc) are also available. The script also has the ability to modify line breaks, and to add comment
symbols and blank lines; furthermore, it permits string or regex-based substitutions. All user
options are customisable via the switches and the YAML interface.

yaml settings

modifyLineBreaks:
  oneSentencePerLine:
    manipulateSentences: 1
    removeSentenceLineBreaks: 1
    textWrapSentences: 1
    sentencesFollow:
      par: 1
      blankLine: 1
      fullStop: 1
      exclamationMark: 1
      questionMark: 1
      rightBrace: 1
      commentOnPreviousLine: 1
      other: ;
    sentencesBeginWith:
      A-Z: 1
      a-z: 1
      other: 0
    sentencesEndWith:
      basicFullStop: 0
      betterFullStop: 1
      exclamationMark: 1
      questionMark: 1
      other: ;

actual/given output

\par
agraph{latexindent.pl} latexindent.pl is a Perl script that indents .tex (and other) files according to an indentation scheme that the user can modify to suit their taste.
Environments, including those with alignment delimiters (such as tabular), and commands, including those that can split braces and brackets across lines, are usually handled correctly by the script.
Options for verbatim-like environments and commands, together with indentation after headings (such as chapter, section, etc) are also available.
The script also has the ability to modify line breaks, and to add comment symbols and blank lines;
furthermore, it permits string or regex-based substitutions.
All user options are customisable via the switches and the YAML interface.

desired output

\paragraph{latexindent.pl}
latexindent.pl is a Perl script that indents .tex (and other) files according to an indentation scheme that the user can modify to suit their taste.
Environments, including those with alignment delimiters (such as tabular), and commands, including those that can split braces and brackets across lines, are usually handled correctly by the script.
Options for verbatim-like environments and commands, together with indentation after headings (such as chapter, section, etc) are also available.
The script also has the ability to modify line breaks, and to add comment symbols and blank lines;
furthermore, it permits string or regex-based substitutions.
All user options are customisable via the switches and the YAML interface.

My point

Hello, I would like to break sentences at semicolons. Ideally, this would mean treating a lowercase letter as the start of a sentence only when it follows a semicolon. I don't think this is currently possible, so I would like to ask for a way to configure latexindent with different sentencesBeginWith rules depending on the sentencesFollow case, in the same spirit as poly-switches, where one can specify different values for each environment. For now, if I allow sentences to begin with a lowercase letter while the par option is also turned on, I run into the awkward situation shown above, where \paragraph{} commands are split. I suspect there may be a before/after option to detect commands before sentences that I didn't find in the docs; if not, that might also be something to add. Thanks for the tool!
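To make the requested rule concrete, here is a minimal Python sketch of a context-dependent break: a sentence may begin with an uppercase letter after a full stop, exclamation mark or question mark, but with a lowercase letter only after a semicolon. It illustrates the idea only and is not how latexindent.pl works internally; the names `BREAK_POINT` and `split_sentences` are hypothetical.

```python
import re

# Hypothetical illustration only: this is NOT latexindent.pl's implementation.
# Break after ".", "!" or "?" when the next word starts with an uppercase letter,
# and after ";" when the next word starts with a lowercase letter.
BREAK_POINT = re.compile(r"(?<=[.!?])\s+(?=[A-Z])|(?<=;)\s+(?=[a-z])")


def split_sentences(text: str) -> list[str]:
    """Join wrapped lines into one line, then split at the contextual break points."""
    joined = " ".join(text.split())
    return BREAK_POINT.split(joined)


sample = (
    "The script can modify line breaks, and add comment\n"
    "symbols and blank lines; furthermore, it permits substitutions. All user\n"
    "options are customisable."
)

for sentence in split_sentences(sample):
    print(sentence)
```

With this rule, a lowercase word after a full stop (as in "e.g. foo") does not start a new sentence, while a lowercase word after a semicolon does.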

cmhughes commented 5 months ago

Apologies for the delay; I'm hoping to get to this soon.

cmhughes commented 3 months ago

Thanks for this and apologies for the delay.

The \par issue is a bug, and I've fixed it as of https://github.com/cmhughes/latexindent.pl/commit/41614554264caf35803838eaf4325051d51bbd54

So, now if we use

modifyLineBreaks:
  oneSentencePerLine:
    manipulateSentences: 1
    removeSentenceLineBreaks: 1
    textWrapSentences: 1
    sentencesFollow:
      par: 1
      blankLine: 1
      fullStop: 1
      exclamationMark: 1
      questionMark: 1
      rightBrace: 1
      commentOnPreviousLine: 1
      other: ;
    sentencesBeginWith:
      A-Z: 1
      a-z: 1
    sentencesDoNOTcontain:
      other: \}                    # regex
    sentencesEndWith:
      basicFullStop: 0
      betterFullStop: 1
      exclamationMark: 1
      questionMark: 1
      other: ;

then we receive your desired output

\paragraph{latexindent.pl}
latexindent.pl is a Perl script that indents .tex (and other) files according to an indentation scheme that the user can modify to suit their taste.
Environments, including those with alignment delimiters (such as tabular), and commands, including those that can split braces and brackets across lines, are usually handled correctly by the script.
Options for verbatim-like environments and commands, together with indentation after headings (such as chapter, section, etc) are also available.
The script also has the ability to modify line breaks, and to add comment symbols and blank lines;
furthermore, it permits string or regex-based substitutions.
All user options are customisable via the switches and the YAML interface.

I'll get this released.