cmhughes / latexindent.pl

Perl script to add indentation (leading horizontal space) to LaTeX files. It can modify line breaks before, during and after code blocks; it can perform text wrapping and paragraph line break removal. It can also perform string-based and regex-based substitutions/replacements. The script is customisable through its YAML interface.
GNU General Public License v3.0

Incorrect encoding (possibly due to #505) #547

Closed qiancy98 closed 2 months ago

qiancy98 commented 2 months ago

Please provide the following when posting an issue:

original .tex code

Please paste your .tex code here. Please note that in answering your issue, I may add the code you provide to the test-cases directory. Please state explicitly if you would prefer me not to do so.

yaml settings

encoding: GB2312
paths:

actual/given output

file D:\Google 锟狡讹拷硬锟斤拷\锟斤拷锟斤拷师锟斤拷锟斤拷锟斤拷\Hydra_QCY - partial (Edition 0821)\hydra-2 240525

INFO:  latexindent.pl version 3.24.1, 2024-05-12, a script to indent .tex files
       latexindent.pl lives here: C:/Users/qcy-5/AppData/Local/Programs/MiKTeX/scripts/latexindent/
       Fri Jun  7 22:53:27 2024
       Filename: d:/Google �ƶ�Ӳ��/����ʦ������/Hydra_QCY - partial (Edition 0821)/hydra-2 240525/__latexindent_temp_hydra2.tex
INFO:  Processing switches:
       -y|--yaml: YAML settings specified via command line
       -c|--cruft: cruft directory
INFO:  Directory for backup files and log file d:\Google �ƶ�Ӳ��\����ʦ������\Hydra_QCY - partial (Edition 0821)\hydra-2 240525\indent.log:
       d:\Google �ƶ�Ӳ��\����ʦ������\Hydra_QCY - partial (Edition 0821)\hydra-2 240525\
INFO:  Perl modules are being loaded from the following directories:
       C:/Strawberry/perl/lib/FindBin.pm
       C:/Strawberry/perl/vendor/lib/YAML/Tiny.pm
       C:/Strawberry/perl/lib/File/Copy.pm
       C:/Strawberry/perl/lib/File/Basename.pm
       C:/Strawberry/perl/lib/Getopt/Long.pm
       C:/Strawberry/perl/vendor/lib/File/HomeDir.pm
INFO:  LatexIndent perl modules are being loaded from, for example:
       C:/Users/qcy-5/AppData/Local/Programs/MiKTeX/scripts/latexindent/LatexIndent/Document.pm
INFO:  YAML settings read: defaultSettings.yaml
       Reading defaultSettings.yaml from C:/Users/qcy-5/AppData/Local/Programs/MiKTeX/scripts/latexindent/defaultSettings.yaml
INFO:  YAML reading settings
       The config file in "C:\Users\qcy-5/indentconfig.yaml" will be read
       Reading path information from C:\Users\qcy-5/indentconfig.yaml
       ---
       encoding: GB2312
       paths:
         - 'D:\Google 云端硬盘\资料\33 常用软件\其他脚本\安装\MiKTeX\latexindent.yaml'

INFO:  Encoding of the paths is GB2312
       Transform file encoding: D:\Google 云端硬盘\资料\33 常用软件\其他脚本\安装\MiKTeX\latexindent.yaml -> D:\Google ÔƶËÓ²ÅÌ\×ÊÁÏ\33 ³£ÓÃÈí¼þ\ÆäËû½Å±¾\°²×°\MiKTeX\latexindent.yaml
INFO:  YAML settings, reading from the following files:
       Reading USER settings from D:\Google ÔƶËÓ²ÅÌ\×ÊÁÏ\33 ³£ÓÃÈí¼þ\ÆäËû½Å±¾\°²×°\MiKTeX\latexindent.yaml
       ---
       indentAfterItems:
         case: '1'
       indentRules:
         item: \t
       lookForAlignDelims:
         align:
           alignDoubleBackSlash: '0'
           delims: '1'
           spacesBeforeAmpersand:
             leadingBlankColumn: '0'
         align*:
           alignDoubleBackSlash: '0'
           delims: '1'
           spacesBeforeAmpersand:
             leadingBlankColumn: '0'
         array:
           alignFinalDoubleBackSlash: '1'
           alignRowsWithoutMaxDelims: '0'
           delims: '1'
         bordermatrix: '1'
         bordmatrix: '1'
       modifyLineBreaks:
         environments:
           BeginStartsOnOwnLine: '1'
           BodyStartsOnOwnLine: '1'
           DBSFinishesWithLineBreak: '1'
           EndFinishesWithLineBreak: '1'
           EndStartsOnOwnLine: '1'
           equation*:
             BeginStartsOnOwnLine: '1'
             BodyStartsOnOwnLine: '1'
             EndFinishesWithLineBreak: '1'
             EndStartsOnOwnLine: '1'
         items:
           ItemStartsOnOwnLine: '1'
         mandatoryArguments:
           array:
             RCuBFinishesWithLineBreak: '1'
           bordermatrix:
             MandArgBodyStartsOnOwnLine: '1'
             RCuBFinishesWithLineBreak: '1'
             RCuBStartsOnOwnLine: '1'
           label:
             RCuBFinishesWithLineBreak: '1'
           tabular:
             RCuBFinishesWithLineBreak: '1'
         specialBeginEnd:
           displayMath:
             SpecialBeginStartsOnOwnLine: '1'
             SpecialBodyStartsOnOwnLine: '1'
             SpecialEndFinishesWithLineBreak: '1'
             SpecialEndStartsOnOwnLine: '1'
       verbatimEnvironments:
         tikzpicture: '1'

WARN:  modifyLineBreaks specified and m switch is *not* active
       perhaps you intended to call
            latexindent.pl -m -l D:\Google ÔƶËÓ²ÅÌ\×ÊÁÏ\33 ³£ÓÃÈí¼þ\ÆäËû½Å±¾\°²×°\MiKTeX\latexindent.yaml d:/Google �ƶ�Ӳ��/����ʦ������/Hydra_QCY - partial (Edition 0821)/hydra-2 240525/__latexindent_temp_hydra2.tex
INFO:  YAML settings read: -y switch
       YAML setting: defaultIndent:'\t'
       single-quoted string found in -y switch: '\t', substitute to     
       Updating mainSettings with defaultIndent:    
FATAL I couldn't find d:/Google �ƶ�Ӳ��/����ʦ������/Hydra_QCY - partial (Edition 0821)/hydra-2 240525/__latexindent_temp_hydra2.tex, are you sure it exists?
      Exiting, no indentation done.
       --------------
INFO:  Please direct all communication/issues to:
        https://github.com/cmhughes/latexindent.pl

desired or expected output

The file located at D:\Google 云端硬盘\吴老师的论文\Hydra_QCY - partial (Edition 0821)\hydra-2 240525\hydra2.tex should be indented.

anything else

I am not sure whether #505 broke my setup. Yesterday I found that my MiKTeX packages had not been updated since 2024/01; after I updated them, the indent program stopped working.

Also, this bug cannot be reproduced on Ubuntu, because Ubuntu encodes everything in UTF-8, whereas Windows encodes paths in GB2312.
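
The mismatch described here can be reproduced in a few lines. The following is a minimal Python sketch, not latexindent.pl code: it takes one directory name from the path above, encodes it as GB2312, and then reads the bytes back with the wrong encoding. The variable names are illustrative only.

```python
# Sketch: how a GB2312-encoded path component turns into garbage when it is
# read with a different encoding. Only the sample name comes from the report.

name = "云端硬盘"  # directory name component from the reported path

# Bytes as a system using a GB2312/GBK code page would store the name.
gb_bytes = name.encode("gb2312")

# 1) Interpreting those bytes as Latin-1 yields accented-letter garbage.
as_latin1 = gb_bytes.decode("latin-1")

# 2) Interpreting them as UTF-8 is invalid. With errors="replace", bytes that
#    cannot start or continue a valid sequence become U+FFFD; some byte pairs
#    may happen to form a valid but unrelated character.
as_utf8 = gb_bytes.decode("utf-8", errors="replace")

# 3) If replacement characters are later written out as UTF-8 and read back
#    as GBK, the well-known "锟斤拷" pattern appears.
double_garbled = "\ufffd\ufffd".encode("utf-8").decode("gbk")

# Decoding with the encoding the bytes were written in recovers the name.
round_trip = gb_bytes.decode("gb2312")

print(as_latin1)
print(as_utf8)
print(double_garbled)
print(round_trip)
```

The three garbled forms resemble the ones in the log above, which suggests the path was converted between encodings more than once before the file lookup.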

cmhughes commented 2 months ago

@fengzyf any ideas?

fengzyf commented 2 months ago

The latest commit, 3d1f44d, uses Win32::GetACP() to retrieve the code page of the Windows system and thereby obtain the system default encoding, so users no longer need to specify the encoding.
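
For readers unfamiliar with code pages, here is a rough Python analogue of that idea. It is a sketch, not the code in the commit: it asks Windows for the active code page through the Win32 GetACP function and maps the number to a codec name. The helper names `codepage_to_codec` and `system_path_encoding` are hypothetical, and the non-Windows fallback is an assumption made so the sketch runs everywhere.

```python
import codecs
import sys


def codepage_to_codec(codepage: int) -> str:
    """Map a Windows code page number to a Python codec name (hypothetical helper)."""
    if codepage == 65001:  # code page 65001 is UTF-8
        return "utf-8"
    return f"cp{codepage}"  # e.g. 936 -> "cp936", Python's alias for GBK


def system_path_encoding() -> str:
    """Return the encoding the operating system uses by default for paths."""
    if sys.platform == "win32":
        import ctypes  # imported here so the sketch also runs off Windows

        # GetACP returns the active ANSI code page as an integer.
        return codepage_to_codec(ctypes.windll.kernel32.GetACP())
    # Assumption: elsewhere, fall back to Python's filesystem encoding.
    return sys.getfilesystemencoding()


if __name__ == "__main__":
    enc = system_path_encoding()
    print(enc, "->", codecs.lookup(enc).name)
```

On a system whose active code page is 936, this would select GBK, a superset of GB2312, without any `encoding` entry in indentconfig.yaml.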

cmhughes commented 2 months ago

Implemented as of https://github.com/cmhughes/latexindent.pl/releases/tag/V3.24.2, thanks again to @fengzyf