ftilmann / latexdiff

Compares two latex files and marks up significant differences between them. Releases on www.ctan.org and mirrors
GNU General Public License v3.0

latexdiff-fast: Wide character in print #267

Open mstsirkin opened 2 years ago

mstsirkin commented 2 years ago

The following is printed when using latexdiff-fast.

Wide character in print at ./latexdiff-fast line 495.
Wide character in print at ./latexdiff-fast line 495.
Wide character in print at ./latexdiff-fast line 503.
Wide character in print at ./latexdiff-fast line 503.
Wide character in print at ./latexdiff-fast line 503.
Wide character in print at ./latexdiff-fast line 503.

Reproduced with commit 4f17f7b188fdbd8f6477aaea4118d0416babf370 (HEAD -> master, origin/master, origin/HEAD).

ftilmann commented 2 years ago

To diagnose this, I would need an MWE (old and new file). What does the output look like? If it looks fine, you can ignore the message. The message points to an encoding problem: latexdiff might be assuming a different encoding than the one the text actually uses. Have a look at the --encoding option.
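For readers unfamiliar with this warning, here is a minimal Perl sketch of how it typically arises, independent of latexdiff. It assumes the usual cause: a decoded string containing a code point above 255 is printed to a filehandle that has no encoding layer. The temp-file templates and variable names are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Collect warnings so they can be counted instead of going to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# A character string containing U+03B1 (Greek alpha), a code point above 255.
my $text = "alpha: \x{3b1}\n";

# Case 1: filehandle without an encoding layer.
my ($fh_raw, $fn_raw) = tempfile("WideRaw-XXXX", TMPDIR => 1, UNLINK => 1);
print $fh_raw $text;
close $fh_raw;
my $raw_count = scalar @warnings;

# Case 2: filehandle with an explicit UTF-8 encoding layer.
my ($fh_enc, $fn_enc) = tempfile("WideEnc-XXXX", TMPDIR => 1, UNLINK => 1);
binmode($fh_enc, ":encoding(UTF-8)");
print $fh_enc $text;
close $fh_enc;
my $enc_count = scalar(@warnings) - $raw_count;

print "warnings without layer: $raw_count\n";
print "warnings with layer:    $enc_count\n";
```

In the first case Perl still writes the string as UTF-8 bytes, which would be consistent with the output looking fine despite the warning.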

mstsirkin commented 2 years ago

Adding --encoding=utf8 did not help.

mstsirkin commented 2 years ago

do.zip

The attached zip file has the old and new files and the command line used to run latexdiff.

mstsirkin commented 2 years ago

As for the output, it appears fine superficially, but the file is huge, so I could be missing some issues. I wish the warnings listed specific line numbers in the input; then I could check with more confidence.

One interesting point is that the Perl version of latexdiff does not have the issue (it is much slower on such a huge file, which is why we are using latexdiff-fast).

mstsirkin commented 2 years ago

Sorry, the preamble file was missing. Updated: do.zip

mstsirkin commented 2 years ago

I worked some more to cut down the size: dosmall.zip

I can confirm that the outputs of latexdiff-fast (with the warning) and latexdiff (without the warning) are exactly the same.

mstsirkin commented 2 years ago

Okay, the following does seem to silence the warning:

      ($fha,$fna)=tempfile("DiffA-XXXX") or die "_longestCommonSubsequence: Cannot open tempfile for sequence A";
      ($fhb,$fnb)=tempfile("DiffB-XXXX") or die "_longestCommonSubsequence: Cannot open tempfile for sequence B";
    + binmode($fha, ":utf8");
    + binmode($fhb, ":utf8");
      # prepare sequence A

I have no idea whether this makes sense in this context.

ftilmann commented 2 years ago

Thanks for digging. I guess I can include the binmode commands (though it might need a bit of extra effort to make this work with other encodings). In any case, good to hear the output is fine.
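One way the "other encodings" case could be handled is sketched below. This is only an illustration, not latexdiff's actual code: the helper `open_diff_tempfile` is hypothetical, and the sketch assumes the encoding name (for example, the value passed to --encoding) is available as a plain string.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);
use File::Temp qw(tempfile);

# Hypothetical helper (not part of latexdiff): create a temp file whose
# handle encodes output according to the given encoding name.
sub open_diff_tempfile {
    my ($template, $encoding) = @_;

    # Validate the name first, so an unknown encoding fails loudly.
    my $enc = find_encoding($encoding)
        or die "Unknown encoding '$encoding'";

    my ($fh, $fn) = tempfile($template, TMPDIR => 1, UNLINK => 1);
    binmode($fh, ":encoding(" . $enc->name . ")")
        or die "Cannot set encoding layer on $fn: $!";
    return ($fh, $fn);
}

# Write a line with non-ASCII characters; no "Wide character" warning
# is expected because the handle now has an encoding layer.
my ($fha, $fna) = open_diff_tempfile("DiffA-XXXX", "utf8");
print $fha "caf\x{e9} \x{3b1}\n";
close $fha;

# Read the line back with a matching layer to check the round trip.
open(my $in, "<:encoding(UTF-8)", $fna) or die "Cannot reopen $fna: $!";
my $line = <$in>;
close $in;

print "characters read back: ", length($line), "\n";
```

Using `:encoding(...)` with a validated name rather than a hard-coded `:utf8` keeps the temp files consistent with whatever encoding the input was decoded from, under the assumption stated above.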