cmhughes / latexindent.pl

Perl script to add indentation (leading horizontal space) to LaTeX files. It can modify line breaks before, during and after code blocks; it can perform text wrapping and paragraph line break removal. It can also perform string-based and regex-based substitutions/replacements. The script is customisable through its YAML interface.
GNU General Public License v3.0

Substitution cannot proceed if "this" field, "that" field, and the text all contain "\cite" #503

Closed saxyx closed 7 months ago

saxyx commented 8 months ago

44.tex code

\cite

localSettings.yaml

replacements:
  -
    this: \cite
    that: ref \cite

actual/given output

E:\Personal\Desktop\rr>latexindent -w -r -sl -o=++ -l=localSettings.yaml 44.tex
INFO:  latexindent.pl version 3.23.4, 2023-11-19, a script to indent .tex files
       latexindent.pl lives here: C:/CTEX/MiKTeX/scripts/latexindent/
       Tue Dec 19 17:08:09 2023
       Filename: 44.tex
INFO:  Processing switches:
       -sl|--screenlog: log file will also be output to the screen
       -w|--overwrite: Overwrite mode active, will make a back up before overwriting
       -l|--localSettings: Read localSettings YAML file
       -o|--outputfile: output to file
       -r|--replacement: replacement mode
INFO:  Options check: -w and -o specified
       You have called latexindent.pl with both -o and -w
       The -o switch will take priority, and -w (overwrite) will be ignored
INFO:  Directory for backup files and indent.log:
       .
INFO:  Perl modules are being loaded from the following directories:
       E:/program/Strawberry/perl/lib/FindBin.pm
       E:/program/Strawberry/perl/vendor/lib/YAML/Tiny.pm
       E:/program/Strawberry/perl/lib/File/Copy.pm
       E:/program/Strawberry/perl/lib/File/Basename.pm
       E:/program/Strawberry/perl/lib/Getopt/Long.pm
       E:/program/Strawberry/perl/vendor/lib/File/HomeDir.pm
INFO:  LatexIndent perl modules are being loaded from, for example:
       C:/CTEX/MiKTeX/scripts/latexindent/LatexIndent/Document.pm
INFO:  YAML settings read: defaultSettings.yaml
       Reading defaultSettings.yaml from C:/CTEX/MiKTeX/scripts/latexindent/defaultSettings.yaml
INFO:  YAML reading settings
       Home directory is C:\Users\lenovo
       latexindent.pl didn't find indentconfig.yaml or .indentconfig.yaml
       see all possible locations: https://latexindentpl.readthedocs.io/en/latest/sec-appendices.html#indentconfig-options)
INFO:  YAML settings read: -l switch
       Adding ./localSettings.yaml to YAML read paths
INFO:  YAML settings, reading from the following files:
       Reading USER settings from ./localSettings.yaml

   replacements:
     -
       that: 'ref \cite'
       this: \cite

INFO:  -o switch active: output file check
       -o switch called with file name without extension: ++
       Updated to ++.tex as the file extension of the input file is tex
       -o switch called with file name ending with ++: ++.tex
       will search for existence and increment counter, starting with 440.tex
       440.tex exists, incrementing counter
       441.tex exists, incrementing counter
       442.tex exists, incrementing counter
       443.tex exists, incrementing counter
       444.tex does not exist, and will be the output file

desired or expected output

ref \cite

cmhughes commented 8 months ago

Sorry for the delay, everything is a bit slow at the moment but I hope to get to this soon.

saxyx commented 8 months ago

Take your time, there's no rush. I appreciate your attention to this issue and I'll be patiently waiting for your further updates.

cmhughes commented 8 months ago

Thanks for this. Yes, at present, your example highlights a known issue.

workaround

The workaround is to use

replacements:
  -
   substitution: s/\\cite/ref \\cite/sg
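
As a rough illustration (not latexindent's own code), the regex from the workaround can be tried in plain Perl. The variable names `$body` and `$out` are made up for this sketch, and it assumes 44.tex contains only `\cite`, as in the report:

```perl
use strict;
use warnings;

# Assumed contents of 44.tex from the report: a single \cite
my $body = "\\cite";

# Apply the same substitution as in the YAML workaround to a copy of the text
( my $out = $body ) =~ s/\\cite/ref \\cite/sg;

print "$out\n";
```

Because the backslash is escaped on both sides of the substitution, `\cite` is matched and written back literally.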

for future

As of https://github.com/cmhughes/latexindent.pl/commit/27a5bdd05904117f2ea4c4b930dd465d1519741b there's no need for the workaround. This will be part of the next release, which I hope to get released soon.

cmhughes commented 7 months ago

Released at https://github.com/cmhughes/latexindent.pl/releases/tag/V3.23.5 and uploaded to CTAN. Thanks again.

fengzyf commented 7 months ago

Updating to latexindent 3.23.5, I found a new problem. The details are as follows.

original .tex code

XXXaaa
XaXaXXX

yaml settings

replacements:
  -
    this: 'X'
    that: 'Y'

actual/given output

YYYaaa
YaYa 

desired or expected output

YYYaaa
YaYaYYY

cmhughes commented 7 months ago

I'm unable to produce your given output; I receive the desired output:

YYYaaa
YaYaYYY

fengzyf commented 7 months ago

My operating system is Win10, the system language is Chinese. I recorded a video and the problem I was talking about actually happened.

https://github.com/cmhughes/latexindent.pl/assets/154897680/82806c86-4f9e-4eec-827c-9eee4da58779

cmhughes commented 7 months ago

Your local settings are not being loaded, as detailed in your log file. This means you are not calling latexindent correctly.

cmhughes commented 7 months ago

My apologies, please ignore my previous message, I was incorrect.

cmhughes commented 7 months ago

Can you post your log file here please?

fengzyf commented 7 months ago
INFO:  latexindent.exe version 3.23.5, 2024-01-01, a script to indent .tex files
       latexindent.exe lives here: D:/rr/
       Tue Jan  2 19:33:43 2024
       Filename: 44.tex
INFO:  Processing switches:
       -s|--silent: Silent mode active (you have used either -s or --silent)
       -o|--outputfile: output to file
       -r|--replacement: replacement mode
INFO:  Directory for backup files and indent.log:
       .
INFO:  YAML settings read: defaultSettings.yaml
       Reading defaultSettings.yaml from D:/rr/defaultSettings.yaml
       Reading defaultSettings.yaml (2nd attempt) from D:/rr/../../texmf-dist/scripts/latexindent/defaultSettings.yaml
       and then, if necessary, D:/rr/LatexIndent/defaultSettings.yaml
INFO:  YAML reading settings
       Home directory is C:\Users\lenovo
       latexindent.pl didn't find indentconfig.yaml or .indentconfig.yaml
       see all possible locations: https://latexindentpl.readthedocs.io/en/latest/sec-appendices.html#indentconfig-options)
INFO:  -l switch used without filename, will search for the following files in turn:
       localSettings.yaml,latexindent.yaml,.localSettings.yaml,.latexindent.yaml
INFO:  YAML settings read: -l switch
       Multiple localSettings found, separated by commas:
       localSettings.yaml, latexindent.yaml, .localSettings.yaml, .latexindent.yaml
       Adding ./localSettings.yaml to YAML read paths
INFO:  YAML settings, reading from the following files:
       Reading USER settings from ./localSettings.yaml
       ---
       replacements:
         -
           that: Y
           this: X

INFO:  -o switch active: output file check
       -o switch called with file name without extension: ++
       Updated to ++.tex as the file extension of the input file is tex
       -o switch called with file name ending with ++: ++.tex
       will search for existence and increment counter, starting with 440.tex
       440.tex does not exist, and will be the output file
INFO:  Phase 1: searching for objects
       No objects found.
INFO:  Phase 2: finding surrounding indentation
INFO:  Phase 3: indenting objects
INFO:  Phase 4: final indentation check
INFO:  Output routine:
       Outputting to file 440.tex
       --------------
INFO:  Please direct all communication/issues to:
        https://github.com/cmhughes/latexindent.pl

cmhughes commented 7 months ago

How are you calling latexindent?

fengzyf commented 7 months ago

I run the following command in cmd:

latexindent -s -l -r -o=++ 44.tex

cmhughes commented 7 months ago

Can you run

latexindent -tt -s -l -r -o=++ 44.tex

and post the log file?

fengzyf commented 7 months ago
INFO:  latexindent.exe version 3.23.5, 2024-01-01, a script to indent .tex files
       latexindent.exe lives here: D:/rr/
       Wed Jan  3 20:43:48 2024
       Filename: 44.tex
INFO:  Processing switches:
       -tt|--ttrace: TTrace mode active (you have used either -tt or --ttrace)
       -s|--silent: Silent mode active (you have used either -s or --silent)
       -o|--outputfile: output to file
       -r|--replacement: replacement mode
INFO:  Directory for backup files and indent.log:
       .
INFO:  Perl modules are being loaded from the following directories:
       E:\Personal\Temp\par-6c656e6f766f\cache-df2935a61bebb4ec4b36991156a66d318a6f0b0e\inc\lib/FindBin.pm
       E:\Personal\Temp\par-6c656e6f766f\cache-df2935a61bebb4ec4b36991156a66d318a6f0b0e\inc\lib/YAML/Tiny.pm
       /loader/HASH(0x6dc1d8)/File/Copy.pm
       /loader/HASH(0x6dc148)/File/Basename.pm
       E:\Personal\Temp\par-6c656e6f766f\cache-df2935a61bebb4ec4b36991156a66d318a6f0b0e\inc\lib/Getopt/Long.pm
       E:\Personal\Temp\par-6c656e6f766f\cache-df2935a61bebb4ec4b36991156a66d318a6f0b0e\inc\lib/File/HomeDir.pm
INFO:  LatexIndent perl modules are being loaded from, for example:
       E:\Personal\Temp\par-6c656e6f766f\cache-df2935a61bebb4ec4b36991156a66d318a6f0b0e\inc\lib/LatexIndent/Document.pm
INFO:  YAML settings read: defaultSettings.yaml
       Reading defaultSettings.yaml from D:/rr/defaultSettings.yaml
       Reading defaultSettings.yaml (2nd attempt) from D:/rr/../../texmf-dist/scripts/latexindent/defaultSettings.yaml
       and then, if necessary, D:/rr/LatexIndent/defaultSettings.yaml
INFO:  YAML reading settings
       Home directory is C:\Users\lenovo
       latexindent.pl didn't find indentconfig.yaml or .indentconfig.yaml
       see all possible locations: https://latexindentpl.readthedocs.io/en/latest/sec-appendices.html#indentconfig-options)
INFO:  -l switch used without filename, will search for the following files in turn:
       localSettings.yaml,latexindent.yaml,.localSettings.yaml,.latexindent.yaml
INFO:  YAML settings read: -l switch
       Multiple localSettings found, separated by commas:
       localSettings.yaml, latexindent.yaml, .localSettings.yaml, .latexindent.yaml
       Adding ./localSettings.yaml to YAML read paths
INFO:  YAML settings, reading from the following files:
       Reading USER settings from ./localSettings.yaml
       ---
       replacements:
         -
           that: Y
           this: X

INFO:  -o switch active: output file check
       -o switch called with file name without extension: ++
       Updated to ++.tex as the file extension of the input file is tex
       -o switch called with file name ending with ++: ++.tex
       will search for existence and increment counter, starting with 440.tex
       440.tex does not exist, and will be the output file
TRACE: Token check
TRACE: Replacement mode *before* indentation: -r
       -
       this: X
       that: Y
       when: before
TRACE: List of items: item|myitem (see itemNames)
TRACE: Constructing specialBeginEnd regex (see specialBeginEnd)
TRACE: The special beginnings regexp is: (see specialBeginEnd)
       \$\$|(?<!\\)\\\[|(?<!\$)(?<!\\)\$(?!\$)
TRACE: The overall special regexp is: (see specialBeginEnd)
       (?^usx:
                                         \$\$
                                         (?^usx:(?:                        # cluster-only (), don't capture 
                                               (?!             
                                                   (?:\$\$|(?<!\\)\\\[|(?<!\$)(?<!\\)\$(?!\$)) # cluster-only (), don't capture
                                               ).                     # any character, but not anything in $specialBegins
                                         )*?) 
                                         \$\$
                                  )|(?^usx:
                                         (?<!\\)\\\[
                                         (?^usx:(?:                        # cluster-only (), don't capture 
                                               (?!             
                                                   (?:\$\$|(?<!\\)\\\[|(?<!\$)(?<!\\)\$(?!\$)) # cluster-only (), don't capture
                                               ).                     # any character, but not anything in $specialBegins
                                         )*?) 
                                         \\\]
                                  )|(?^usx:
                                         (?<!\$)(?<!\\)\$(?!\$)
                                         (?^u:[^$]*?) 
                                         (?<!\\)\$(?!\$)
                                  )
TRACE: Constructing headings reg exp for example, chapter, section, etc (see indentAfterThisHeading)
       Not indenting after paragraph (see indentAfterThisHeading)
       Not indenting after section (see indentAfterThisHeading)
       Not indenting after chapter (see indentAfterThisHeading)
       Not indenting after part (see indentAfterThisHeading)
       Not indenting after subparagraph (see indentAfterThisHeading)
       Not indenting after subsection* (see indentAfterThisHeading)
       Not indenting after subsection (see indentAfterThisHeading)
       Not indenting after subsubsection (see indentAfterThisHeading)
TRACE: Looping through array for commandCodeBlocks->stringsAllowedBetweenArguments
       node
       at
       to
       decoration
       \+\+
       \-\-
       \#\#\d
TRACE: Strings allowed between arguments: (?^u:node|at|to|decoration|\+\+|\-\-|\#\#\d) (see stringsAllowedBetweenArguments)
TRACE: Looping through array for commandCodeBlocks->stringsAllowedBetweenArguments
       node
       at
       to
       decoration
       \+\+
       \-\-
       \#\#\d
TRACE: Strings allowed between arguments: (?^u:node|at|to|decoration|\+\+|\-\-|\#\#\d) (see stringsAllowedBetweenArguments)
TRACE: Looping through array for commandCodeBlocks->commandNameSpecial
       @ifnextchar\[
TRACE: The special command names regexp is: (?^u:@ifnextchar\[) (see commandNameSpecial)
TRACE: Searching for NOINDENTBLOCK (see noIndentBlock)
       {
         cmhtest => 1,
         noindent => 1
       }

       looking for cmhtest:1 noIndentBlock
       looking for noindent:1 noIndentBlock
TRACE: Searching for VERBATIM commands (see verbatimCommands)
       {
         lstinline => 1,
         verb => 1
       }

       looking for verb:1 Commands
       looking for lstinline:1 Commands
TRACE: Storing trailing comments
       No trailing comments found
TRACE: Searching for VERBATIM environments (see verbatimEnvironments)
       {
         lstlisting => 1,
         minted => 1,
         verbatim => 1
       }

       looking for verbatim:1 environments
       looking for lstlisting:1 environments
       looking for minted:1 environments
TRACE: Verbatim storage:
       {}

TRACE: Horizontal space removal routine
TRACE: Searching for FILE CONTENTS environments (see fileContentsEnvironments)
       {
         filecontents => 1,
         "filecontents*" => 1
       }

       looking for filecontents* environments
       looking for filecontents environments
TRACE: Removing leading space from mainDocument (verbatim/noindentblock already accounted for)
INFO:  Phase 1: searching for objects
TRACE: looking for ENVIRONMENTS
TRACE: looking for IFELSEFI
TRACE: looking for HEADINGS (chapter, section, part, etc)
       No objects found.
INFO:  Phase 2: finding surrounding indentation
TRACE: FamilyTree before update:
       {}

TRACE: Updating FamilyTree...
TRACE: FamilyTree after update:
       {}

INFO:  Phase 3: indenting objects
       No child objects (mainDocument)
INFO:  Phase 4: final indentation check
TRACE: Horizontal space removal routine
       Removing trailing white space *after* the document is processed (see removeTrailingWhitespace: afterProcessing)
TRACE: Replacement mode *after* indentation: -r
INFO:  Output routine:
       Outputting to file 440.tex
       --------------
INFO:  Please direct all communication/issues to:
        https://github.com/cmhughes/latexindent.pl

cmhughes commented 7 months ago

Can you try

replacements:
  -
    this: X
    that: Y

fengzyf commented 7 months ago

The actual output is still

YYYaaa
YaYa

cmhughes commented 7 months ago

I believe this may be a Windows line break issue.

Try adding any character at the end of your file and run again.

fengzyf commented 7 months ago

Yes, that's true. After adding a space at the end of my file, the output is

YYYaaa
YaYaYYY
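
One possible way to see why a trailing character changes the result, assuming the replacement is built on Perl's split with the default LIMIT (the line from Replacement.pm quoted further down the thread suggests this): split drops trailing empty fields, and a trailing space makes the last field non-empty. The variable names below are made up for this sketch:

```perl
use strict;
use warnings;

my $without = "XXXaaa\nXaXaXXX";     # file ends directly after the last X
my $with    = "XXXaaa\nXaXaXXX ";    # same text plus one trailing space

# With the default LIMIT, split strips trailing empty fields only
my @a = split( /X/, $without );
my @b = split( /X/, $with );

printf "%d fields without a trailing character\n", scalar @a;
printf "%d fields with a trailing character\n",    scalar @b;

# Joining the fields again shows which separators survived
print join( 'Y', @a ), "\n---\n", join( 'Y', @b ), "\n";
```

In this sketch the first split returns 6 fields and the second 9, so only the second join restores the final `YYY`.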

fengzyf commented 7 months ago

I found the cause of the problem in the description of split in the official Perl documentation:

split /PATTERN/,EXPR,LIMIT

If LIMIT is negative, it is treated as if it were instead arbitrarily large; as many fields as possible are produced.

If LIMIT is omitted (or, equivalently, zero), then it is usually treated as if it were instead negative but with the exception that trailing empty fields are stripped (empty leading fields are always preserved); if all fields are empty, then all fields are considered to be trailing (and are thus stripped in this case). Thus, the following:

my @x = split(/,/, "a,b,c,,,"); # ("a", "b", "c")

produces only a three element list.

my @x = split(/,/, "a,b,c,,,", -1); # ("a", "b", "c", "", "", "")

produces a six element list.

As another special case, split emulates the default behavior of the command line tool awk when the PATTERN is either omitted or a string composed of a single space character (such as ' ' or "\x20", but not e.g. / /). In this case, any leading whitespace in EXPR is removed before splitting occurs, and the PATTERN is instead treated as if it were /\s+/; in particular, this means that any contiguous whitespace (not just a single space character) is used as a separator.

my @x = split(" ", "  Quick brown fox\n");
# ("Quick", "brown", "fox")

my @x = split(" ", "RED\tGREEN\tBLUE");
# ("RED", "GREEN", "BLUE")

Therefore, on line 74 of Replacement.pm

${$self}{body} = join( $that, split( $this, ${$self}{body} ) );

should be replaced with

${$self}{body} = join( $that, split( /$this/, ${$self}{body}, -1 ) );
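
To check the proposed change in isolation, here is a small sketch outside latexindent. It assumes the input ends without a trailing newline, and the names `$lossy` and `$fixed` are made up; the `\Q...\E` quoting is an addition for this sketch and is not part of the change quoted above:

```perl
use strict;
use warnings;

# Input from the report, assumed to end without a trailing newline
my $body = "XXXaaa\nXaXaXXX";
my ( $this, $that ) = ( 'X', 'Y' );

# Default LIMIT: trailing empty fields are stripped, so the final XXX is lost
my $lossy = join( $that, split( /\Q$this\E/, $body ) );

# LIMIT of -1: every field is kept, including trailing empty ones
my $fixed = join( $that, split( /\Q$this\E/, $body, -1 ) );

print "$lossy\n---\n$fixed\n";
```

The first result reproduces the reported `YaYa` line, and the second gives the expected `YaYaYYY`.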

cmhughes commented 7 months ago

Thanks, that's helpful. I've implemented your change as of https://github.com/cmhughes/latexindent.pl/commit/eaed6db975b9573fe26c071ca839bfaeb2d10e68 and I'll get it released soon. Thanks again.

fengzyf commented 7 months ago

I appreciate it. Glad I could help.

cmhughes commented 7 months ago

Released at https://github.com/cmhughes/latexindent.pl/releases/tag/V3.23.6. Thanks again.