chirun-ncl / chirun

A Python package providing the command line interface for building flexible and accessible content with Chirun.
https://chirun.org.uk/

Use 'backslashreplace' errors method when decoding the stdout/err from pdflatex #271

Open christianp opened 1 month ago

christianp commented 1 month ago

pdflatex is called using the subprocess module, which implicitly decodes the stdout and stderr streams as UTF-8. We should add the errors="backslashreplace" argument so that any invalid bytes are escaped instead of raising a UnicodeDecodeError.
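As a rough illustration of the proposed change (not the actual chirun code), the sketch below passes errors="backslashreplace" to subprocess.Popen. The child command is a stand-in for pdflatex that deliberately writes a byte which is not valid UTF-8; the names child_cmd, proc, out and err are assumptions made for this example only.

```python
import subprocess
import sys

# Stand-in for pdflatex (assumption): a child process that writes the byte
# 0xff, which can never appear in valid UTF-8, to its stdout.
child_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(b'bad byte: \\xff\\n')",
]

proc = subprocess.Popen(
    child_cmd,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    encoding="utf-8",            # decode both streams as UTF-8
    errors="backslashreplace",   # escape undecodable bytes instead of raising
)
out, err = proc.communicate()

# The invalid byte arrives as the four literal characters \xff.
print(repr(out))
```

With the default errors="strict", the same call would raise a UnicodeDecodeError inside communicate() rather than returning the output.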

We should also check where else subprocess is used, so the same error doesn't happen elsewhere.

Lycanic commented 1 month ago

Begun implementation at https://github.com/Lycanic/chirun/tree/backslashreplace-271. Confirmed this allows pdflatex to compile missing Unicode symbols in math mode, rather than erroring.

Other than the pdflatex compilation, subprocess is only used for the pdf2svg and epstopdf processes, and directly with subprocess.run rather than the more elaborate Popen.

Do we want this replacement in those imaging processes also? If so, what errors are we solving by doing so?

christianp commented 2 weeks ago

It's possible that any process could produce weird output, so I think you might as well use backslashreplace on them too.
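One way to apply the same handling everywhere would be a small shared wrapper around subprocess.run. This is only a sketch: run_tolerant is a hypothetical helper name, not something in chirun, and the demo command stands in for tools like pdf2svg or epstopdf.

```python
import subprocess
import sys


def run_tolerant(cmd, **kwargs):
    """Hypothetical helper: run cmd and capture its output as text,
    escaping any bytes that are not valid UTF-8."""
    return subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        encoding="utf-8",
        errors="backslashreplace",
        **kwargs,
    )


# Stand-in for an imaging tool (assumption): writes a Latin-1 encoded
# character (0xe9) to stderr, which is not valid UTF-8 on its own.
demo_cmd = [
    sys.executable,
    "-c",
    "import sys; sys.stderr.buffer.write(b'\\xe9 warning')",
]
result = run_tolerant(demo_cmd)
print(repr(result.stderr))
```

Routing every external call through one helper like this would keep the decoding behaviour consistent, so no individual call site could fall back to strict decoding by accident.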