temken / comparxiv

Compare two versions of an arXiv preprint with a single command.
https://pypi.org/project/comparxiv/
MIT License

UnicodeDecodeError: 'utf-8' codec can't decode #13

Open lgl603 opened 4 years ago

lgl603 commented 4 years ago

When I run the command 'comparxiv 1710.10196v3', the following error occurs: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xd5 in position 19512: invalid continuation byte

temken commented 4 years ago

Wow, that's a big paper. 30MB.

I can reproduce the error on my machine. I will have to look into the temporary files to see what's going wrong. I can't guarantee that I can fix this particular paper: some papers will simply not work, since the problem might lie within latexdiff.

I'll get back to you.

rudjer commented 3 years ago

This looks like a very useful initiative. However, I get a similar error with: comparxiv 0904.2931v5

Malformed UTF-8 character: \x96 (unexpected continuation byte 0x96, with no preceding start byte) in pattern match (m//) at /Library/TeX/texbin/latexdiff line 1844.
Malformed UTF-8 character (fatal) at /Library/TeX/texbin/latexdiff line 1844.
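
Both reports involve a byte that is not valid UTF-8 at that position (0xd5 followed by a non-continuation byte, and a stray 0x96). One possible explanation, which is only an assumption here and is not confirmed anywhere in this thread, is that the paper's source files were saved in a legacy single-byte encoding such as Windows-1252. The sketch below is not comparxiv or latexdiff code. The helper name `decode_with_fallback` and the list of candidate encodings are hypothetical. It only illustrates how such bytes fail under UTF-8 and how a fallback decode might recover the text.

```python
def decode_with_fallback(raw, encodings=("utf-8", "cp1252", "latin-1")):
    """Return (text, encoding) for the first candidate encoding that works.

    Hypothetical helper: the candidate list is an assumption, not something
    comparxiv or latexdiff is known to use.
    """
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            # This encoding cannot represent the bytes; try the next one.
            continue
    raise ValueError("none of the candidate encodings could decode the input")


# 0x96 can only be a continuation byte in UTF-8, so it is invalid on its own.
# In Windows-1252 the same byte is an en dash.
dash_text, dash_enc = decode_with_fallback(b"pages 10\x9620")

# 0xd5 starts a two-byte UTF-8 sequence, but a space cannot continue it,
# which gives an "invalid continuation byte" error under UTF-8.
o_text, o_enc = decode_with_fallback(b"\xd5 x")

print(dash_enc, repr(dash_text))
print(o_enc, repr(o_text))
```

A fallback like this guesses the encoding rather than detecting it, so the decoded characters should be checked against the original paper before relying on them.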