Closed chrisgdt closed 3 months ago
cc @mzuenni
So it seems that latexmk encodes its terminal output in Latin-1 instead of utf-8. Weird/annoying.
A simple fix is probably to change read_text to read_binary (or so), and do the conversion to string 'manually' in python, where we can catch errors or try multiple encodings. Could you try that?
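To make the suggestion above concrete, here is a minimal sketch of reading the output as bytes and decoding it by hand. The function name `decode_with_fallback` and the list of encodings are assumptions for illustration, not what BAPCtools actually does.

```python
def decode_with_fallback(data: bytes, encodings=("utf-8", "latin-1")) -> str:
    """Try each encoding in order and return the first successful decode.

    Hypothetical helper: the name and the default encoding list are
    assumptions, not part of BAPCtools.
    """
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Only reached if every listed encoding failed (Latin-1 itself never
    # fails, since every byte maps to a character).
    return data.decode("utf-8", errors="replace")
```

With a pathlib path this would be used as something like `decode_with_fallback(path.read_bytes())`. Since Latin-1 accepts any byte sequence, it has to come last in the list, otherwise later encodings are never tried.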
What platform are you running on? Just some linux?
I am curious where the non-utf8 encoding comes from. You could try doing everything as utf-8 always: https://stackoverflow.com/a/1253024/2716069
Yes, I forgot to mention my configuration, my apologies:
It seems like the stuff printed by pdflatex is not actually encoded in Latin-1 but in whatever latex is using internally, see https://tex.stackexchange.com/questions/131238/what-controls-the-encoding-of-the-latex-log-file-and-how-to-change-it. Unfortunately, that depends on stuff like the latex font used at the place where the error occurred...
Anyway, back on topic. Why does Latin-1 seem to fix this? Well... the é or è likely appear in text and not in stuff like mathcal, and at such a place you likely use a T1 font, and luckily enough T1 is equal to Latin-1 for most stuff.
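For illustration, a small sketch of why the same character decodes fine as Latin-1 but breaks a strict utf-8 read. The helper `can_decode` is a made-up name, and this only shows the Latin-1 vs. utf-8 byte difference, not what pdflatex emits for any particular font.

```python
def can_decode(data: bytes, encoding: str) -> bool:
    """Return True if data is valid in the given encoding (hypothetical helper)."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# "\u00e9" is é, written as an escape so the snippet does not depend on
# the encoding of the source file.
latin1_bytes = "\u00e9".encode("latin-1")  # a single byte, 0xE9
utf8_bytes = "\u00e9".encode("utf-8")      # two bytes, 0xC3 0xA9

print(can_decode(latin1_bytes, "utf-8"))    # the lone 0xE9 byte is not valid utf-8
print(can_decode(latin1_bytes, "latin-1"))  # Latin-1 accepts any byte
```

So a log that contains a single 0xE9 byte for é makes a strict utf-8 read raise, while a Latin-1 read happens to give back the right character.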
However, it's not really the right encoding... People not using T1 fonts would need a different fix. And in fact the right encoding doesn't even exist, because stuff like mathcal does not even use a "real" encoding... and if multiple fonts are used the log file can contain errors for all of them. So I guess the fix here is to ignore encoding errors and live with weird looking error messages in such places.
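A rough sketch of what "ignore encoding errors" could look like in python, using the standard error handlers of `bytes.decode`. The function name `decode_log_lossy` and the sample bytes are made up for illustration; this is not necessarily what the actual fix does.

```python
def decode_log_lossy(data: bytes) -> str:
    """Decode as utf-8, replacing undecodable bytes with U+FFFD.

    Hypothetical helper, assuming that a slightly garbled error message
    is acceptable as long as reading the log never raises.
    """
    return data.decode("utf-8", errors="replace")


# Made-up log fragment: "caf" followed by a lone 0xE9 byte.
sample = b"caf\xe9"

print(decode_log_lossy(sample))                          # 'caf' plus a replacement character
print(sample.decode("utf-8", errors="ignore"))           # bad byte silently dropped
print(sample.decode("utf-8", errors="backslashreplace")) # bad byte kept as an escape
```

The `replace` handler at least shows where something went wrong, whereas `ignore` hides it; `backslashreplace` keeps the raw byte value visible, which might help when debugging which font produced it.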
Fixed with 9ca3e8b?
Hello !
After an update of the BAPCtools, I compiled my contest but encountered an error:
I did some investigating and found that this issue appeared around the commit "fix hang", two weeks ago. More precisely, it comes from latex.py, line 199.
The PDF does not compile and produces this error when there is an uncommon character in the problem contest, such as é or è (often used in French). I did find a way to fix it, which is to modify line 199 by setting another encoding instead of utf-8, e.g., but since I am not sure whether changing the encoding is a good idea, or whether the issue is rooted somewhere else, I prefer to open an issue instead of a PR to discuss it. Note that replacing every é by \'e does not work either.