sphinx-contrib / plantuml

BSD 2-Clause "Simplified" License

sphinxcontrib-plantuml does not detect when plantuml.jar falls back to PNG, leading to an exception #54

Open oharboe opened 3 years ago

oharboe commented 3 years ago

This is with sphinxcontrib-plantuml 0.20.1.

Problem: plantuml.jar can silently fall back to PNG output even when SVG is requested, writing PNG data into a file with a .svg extension. This can cause exceptions in sphinxcontrib-plantuml.

Encoding error:
'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
The full traceback has been saved in /tmp/sphinx-err-7141u_el.log, if you want to report the issue to the developers.
  File ".tox/py3/lib/python3.8/site-packages/sphinxcontrib/plantuml.py", line 522, in html_visit_plantuml
    self.body.append(gettag(self, fnames, node))
  File ".tox/py3/lib/python3.8/site-packages/sphinxcontrib/plantuml.py", line 478, in _get_svg_obj_tag
    (self.encode(refname), _get_svg_style(outfname) or ''))
  File ".tox/py3/lib/python3.8/site-packages/sphinxcontrib/plantuml.py", line 434, in _get_svg_style
    for l in f:
  File "/usr/lib/python3.8/codecs.py", line 714, in __next__
    return next(self.reader)
  File "/usr/lib/python3.8/codecs.py", line 645, in __next__
    line = self.readline()
  File "/usr/lib/python3.8/codecs.py", line 558, in readline
    data = self.read(readsize, firstline=True)
  File "/usr/lib/python3.8/codecs.py", line 504, in read
    newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
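
The error message can be reproduced without Sphinx or PlantUML, which helps confirm the diagnosis: 0x89 is the first byte of the PNG file signature, and it is not a valid first byte of a UTF-8 sequence. The following is a minimal sketch; the helper name `decode_error_message` is made up for illustration and is not part of sphinxcontrib-plantuml.

```python
# The 8-byte signature that every PNG file starts with.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_error_message(data):
    """Return the UnicodeDecodeError text for `data`, or None if it decodes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


# Decoding the PNG signature as UTF-8 fails on the very first byte.
message = decode_error_message(PNG_SIGNATURE)
print(message)
```

Text that really is SVG decodes without error, so the helper returns None for it.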

Example of silent fallback to png in plantuml:

$ ls
foo.txt  plantuml.1.2021.4.jar
$ java -jar plantuml.1.2021.4.jar -tsvg foo.txt 
$ ls -lrt
total 8284
-rw-rw-r-- 1 oyvind oyvind     984 april 22 14:00 foo.txt
-rw-rw-r-- 1 oyvind oyvind 8459897 april 22 14:00 plantuml.1.2021.4.jar
-rw-rw-r-- 1 oyvind oyvind   12545 april 22 14:01 foo.svg
$ hexdump foo.svg  | head -n 5
0000000 5089 474e 0a0d 0a1a 0000 0d00 4849 5244
0000010 0000 3e03 0000 e000 0208 0000 4d00 2475
0000020 004d 3000 49c8 4144 7854 ed5e 7ddd 546c
0000030 fa55 f007 2943 a57d 2fa5 220b d950 bcb6
0000040 1468 a50a 69b8 2a41 bb2f 6318 254d 04bc
$ cat foo.txt
@startditaa

+----+--------+                                                 +----+--------+
| clang       |                                                 | x86         |
| c/C++       +------+                                 +------->+ backend     |
+------+------+      |                                 |        +------+------+
                     |           +----+-------+        |        +----+--------+
                     +---------->+ common     +--------+------->+ ARM         |
                     |           | optimizer  |        |        | backend     |
                     |           +------------+        |        +------+------+
+----+------------+  |                                 |        +----+--------+
| other frontend  +--+                                 +------->+ Aptos       |
| fortran/rust/...|                                             | backend     |
+------+----------+                                             +------+------+

oharboe commented 3 years ago

A quick and dirty workaround for poisoned .svg files that actually contain PNG data:

def _get_svg_obj_tag(self, fnames, node):
    refname, outfname = fnames['svg']
    # copy width/height style from <svg> tag, so that <object> area
    # has enough space.
    try:
        # read the style once and reuse it below
        style = _get_svg_style(outfname) or ''
    except Exception:
        # quick and dirty fallback to using image tag when a non-svg file
        # was generated with a .svg extension
        return _get_svg_img_tag(self, fnames, node)
    return ('<object data="%s" type="image/svg+xml" style="%s"></object>' %
            (self.encode(refname), style))
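
A narrower alternative to catching every exception would be to check the file's leading bytes before parsing it. The hexdump earlier in this thread shows foo.svg starting with the bytes 89 50 4E 47 0D 0A 1A 0A (hexdump prints them as byte-swapped 16-bit words), which is the PNG signature. The sketch below is only an illustration: `looks_like_png` and `_demo` are hypothetical names, not functions in sphinxcontrib-plantuml, and how such a check would be wired into the extension is left open.

```python
import os
import tempfile

# The 8-byte signature that every PNG file starts with.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def looks_like_png(path):
    """Return True if the file at `path` starts with the PNG signature."""
    with open(path, "rb") as f:
        return f.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE


def _demo():
    """Write one mislabeled PNG and one real SVG, then classify both."""
    with tempfile.TemporaryDirectory() as tmp:
        fake_svg = os.path.join(tmp, "foo.svg")
        with open(fake_svg, "wb") as f:
            # PNG data hiding behind a .svg extension
            f.write(PNG_SIGNATURE + b"\x00\x00\x00\rIHDR")
        real_svg = os.path.join(tmp, "bar.svg")
        with open(real_svg, "w", encoding="utf-8") as f:
            f.write('<svg xmlns="http://www.w3.org/2000/svg"></svg>')
        return looks_like_png(fake_svg), looks_like_png(real_svg)


result = _demo()
```

Because the file is opened in binary mode, the check itself cannot raise a UnicodeDecodeError, so any other failure while reading the SVG style would still surface normally.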

bavovanachte commented 1 year ago

Hi, I'm currently facing this issue as well, but I'm not able to apply this quick fix. Could this be fixed in the extension?

bavovanachte commented 1 year ago

What caused the issue specifically in my case was the inclusion of ditaa diagrams:

    .. uml::

        @startuml
        ditaa(--no-shadows,--no-separation)
           +-------------------------------------+
           |  +-------------------------------+  |
        o--+->| &BLABLA.123456790123456789012 |  |
              +-------------------------------+  |
              | BLABLABLABLA                  |  |
              +-------------------------------+  |
              | BLABLABLABLABLABLAB           |  |
              +-------------------------------+  |
              | BLA                           |  |
              +-------------------------------+  |
              | BLABLABLAB                    +--+
              +-------------------------------+
        @enduml

Removing all ditaa diagrams from the documentation resolved the issue for me.
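
For a larger documentation tree, the affected sources could be located with a small script before removing or converting anything. This is a rough sketch under stated assumptions: the function name `find_ditaa_sources`, the file suffixes, and the regular expression are all illustrative, and the pattern only covers the two ditaa forms shown in this thread (`@startditaa` and a line starting with `ditaa`).

```python
import os
import re
import tempfile

# Matches "@startditaa", or a line consisting of "ditaa" optionally
# followed by "(" as in the examples in this thread (assumed pattern).
DITAA_RE = re.compile(r"^\s*(@startditaa\b|ditaa\s*(\(|$))", re.MULTILINE)


def find_ditaa_sources(root, suffixes=(".rst", ".txt", ".puml")):
    """Return sorted paths under `root` whose text contains a ditaa diagram."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                if DITAA_RE.search(f.read()):
                    hits.append(path)
    return sorted(hits)


def _demo():
    """Create one source with a ditaa diagram and one without, then scan."""
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "a.rst"), "w", encoding="utf-8") as f:
            f.write(".. uml::\n\n    @startuml\n    ditaa(--no-shadows)\n"
                    "    +---+\n    @enduml\n")
        with open(os.path.join(tmp, "b.rst"), "w", encoding="utf-8") as f:
            f.write(".. uml::\n\n    @startuml\n    Alice -> Bob\n    @enduml\n")
        return [os.path.basename(p) for p in find_ditaa_sources(tmp)]


found = _demo()
```

Calling `find_ditaa_sources("docs")` on a real project (the directory name is an assumption) would list the files to inspect.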