plastex / plastex

plasTeX is a Python package that processes LaTeX documents into an XML-DOM-like object which can be used to generate various types of output.

UnicodeDecodeError: invalid start byte #106

Closed fvanmaele closed 4 years ago

fvanmaele commented 5 years ago

I receive the following errors on a longer .tex file:

  File "/home/nand/bin/plastex", line 142, in <module>
    main(sys.argv)
  File "/home/nand/bin/plastex", line 119, in main
    Renderer().render(document)
  File "/home/nand/.local/lib/python3.5/site-packages/plasTeX/Renderers/PageTemplate/__init__.py", line 380, in render
    BaseRenderer.render(self, document)
  File "/home/nand/.local/lib/python3.5/site-packages/plasTeX/Renderers/__init__.py", line 502, in render
    self.imager.close()
  File "/home/nand/.local/lib/python3.5/site-packages/plasTeX/Imagers/__init__.py", line 570, in close
    output = self.compileLatex(self.source.read())
  File "/home/nand/.local/lib/python3.5/site-packages/plasTeX/Imagers/__init__.py", line 621, in compileLatex
    line = p.stdout.readline()
  File "/usr/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa4 in position 725: invalid start byte

Input file: https://gist.github.com/3661f05bd02bac0ada8a47c09e70887a (Note: the file was concatenated with latexpand to avoid "WARNING: Could not find any file named <plasTeX.TeXFragment object at 0xb5c7723c>" errors.)

Complete output: https://gist.github.com/37d4cbe5764ded697bab58b285bee90c

pdflatex output: https://gist.github.com/fvanmaele/8518908fdf7be9d3faf504b94020012a

It's hard to identify where these errors originate. Removing "äquivalent" from https://gist.github.com/fvanmaele/3661f05bd02bac0ada8a47c09e70887a#file-alggeo1-flt-L1065 appears to make the run succeed, though "äquivalent" is also used in other places.

PatrickMassot commented 4 years ago

I think the presence of "äquivalent" is a red herring. The UnicodeDecodeError arises when plasTeX reads the output of LaTeX compiling your math, before plasTeX tries to turn it into images. The core reason this fails seems to be that plasTeX mangles the xypic code before calling LaTeX. Specifically, it doesn't seem to like \xymatrix@C and turns it into \xymatrix \C. I have no idea why, and probably only Kevin could answer.
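
To illustrate the decoding failure itself, here is a minimal Python sketch. It is not plasTeX code: the log line and the read_lines helper are made up for illustration, and io.TextIOWrapper stands in for the text-mode pipe that the traceback reads from. It shows that a lone 0xa4 byte raises the same "invalid start byte" error under strict UTF-8 decoding, while a tolerant error handler lets the read finish.

```python
import io

# Hypothetical LaTeX output: ASCII text followed by a lone 0xa4 byte.
# "ä" encoded as UTF-8 is b"\xc3\xa4", so 0xa4 without its lead byte
# is not a valid start of a UTF-8 sequence.
raw = b"Overfull hbox near \xa4quivalent\n"


def read_lines(data, errors):
    """Read bytes through a UTF-8 text wrapper, as a text-mode pipe would."""
    stream = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", errors=errors)
    return stream.readlines()


# Strict decoding (the default) fails on the stray byte.
try:
    read_lines(raw, errors="strict")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc

# A tolerant handler substitutes U+FFFD and keeps going.
tolerant = read_lines(raw, errors="replace")

print(strict_error)
print(tolerant)
```

Tolerant decoding only keeps the reader from crashing; it would not repair whatever made LaTeX emit the unexpected output in the first place.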

@kesmit13 a minimized example is:

\documentclass{article}

\usepackage[all]{xy}

\begin{document}
\[
  \xymatrix@C=9pc{X\ar[r]^{f}_{f_{1},\ldots,f_{n}\in k[T_{1},\ldots,T_{m}]} & Y\ar[r]^{g}_{g_{1},\ldots,g_{r}\in k[T_{1}',\ldots,T_{m}']} & Z}
\]
\end{document}

The generated images.tex contains

\[  \xymatrix \C =9pc{X\ar [r]^{f}_{f_{1},\ldots ,f_{n}\in k[T_{1},\ldots ,T_{m}]} &  Y\ar [r]^{g}_{g_{1},\ldots ,g_{r}\in k[T_{1}',\ldots ,T_{m}']} &  Z}  \]
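
One way to spot this kind of mangling in a generated images.tex is a simple pattern search. The sketch below is an assumption-laden illustration, not part of plasTeX: the find_mangled_xymatrix helper and its regular expression are hypothetical, and they only look for the specific form shown above, where \xymatrix is followed by whitespace and a control sequence such as \C.

```python
import re

# Matches "\xymatrix", whitespace, then a control sequence name.
# In the report above, the source "@C" shows up as "\C" in images.tex.
MANGLED = re.compile(r"\\xymatrix\s+\\([A-Za-z]+)")


def find_mangled_xymatrix(tex):
    """Return names of control sequences that directly follow xymatrix."""
    return MANGLED.findall(tex)


# Shortened versions of the source line and the generated line.
source = r"\xymatrix@C=9pc{X\ar[r]^{f} & Y\ar[r]^{g} & Z}"
generated = r"\xymatrix \C =9pc{X\ar [r]^{f} &  Y\ar [r]^{g} &  Z}"

print(find_mangled_xymatrix(source))
print(find_mangled_xymatrix(generated))
```

The check is deliberately narrow; other xypic modifiers could be rewritten in ways this pattern would miss.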

I'll try to improve the error reporting here (the current behavior is hardly enlightening). But I must say I don't see much future in continuing to use this obsolete package. You would have much more fun using tikzcd, which is also easier to use with plasTeX. I also don't see much future in the XHTML renderer and its image creation strategy, but that's more debatable, I guess.

PatrickMassot commented 4 years ago

Fixed in 89d0f26. This was actually the same issue as #36.