mikitex70 / plantuml-markdown

PlantUML plugin for Python-Markdown
BSD 2-Clause "Simplified" License

ascii/unicode error while processing plantuml diagrams containing unicode characters to svg_inline format #21

Closed wilkolazki closed 5 years ago

wilkolazki commented 5 years ago

Hi,

I have stumbled upon an issue with non-ASCII characters in PlantUML diagrams embedded in Markdown files.

Error reading page 'README.md': 'ascii' codec can't encode character u'\u015b' in position 85: ordinal not in range(128)

The character in question is ś, but this happens with any Polish character.

Let's assume the following aaa.md file:

# some md file with diagram
```plantuml
@startuml
Alicja -> Łukasz: "Zażółć gęślą jaźń"
@enduml
```

And the following config:

plantuml-markdown:
  format: svg_inline

Executing markdown_py results in the following error:

$ markdown_py -x plantuml-markdown -c config.yml  aaa.md  > aaa.html
Successfuly imported extension module "plantuml-markdown".
Successfully loaded extension "plantuml-markdown.PlantUMLMarkdownExtension".
Traceback (most recent call last):
  File "/usr/local/bin/markdown_py", line 11, in <module>
    sys.exit(run())
  File "/usr/local/lib/python2.7/dist-packages/markdown/__main__.py", line 138, in run
    markdown.markdownFromFile(**options)
  File "/usr/local/lib/python2.7/dist-packages/markdown/core.py", line 411, in markdownFromFile
    kwargs.get('encoding', None))
  File "/usr/local/lib/python2.7/dist-packages/markdown/core.py", line 338, in convertFile
    html = self.convert(text)
  File "/usr/local/lib/python2.7/dist-packages/markdown/core.py", line 265, in convert
    self.lines = prep.run(self.lines)
  File "/usr/local/lib/python2.7/dist-packages/plantuml-markdown.py", line 113, in run
    text, did_replace = self._replace_block(text)
  File "/usr/local/lib/python2.7/dist-packages/plantuml-markdown.py", line 165, in _replace_block
    img = etree.fromstring(data)
  File "<string>", line 124, in XML
UnicodeEncodeError: 'ascii' codec can't encode character u'\u0141' in position 1714: ordinal not in range(128)

This error does not appear with the other formats (jpg, png, svg, svg_object). It occurs only with the svg_inline format.

The error appears regardless of whether the plantuml binary or a PlantUML server is used.
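
Judging from the traceback, a likely cause is that on Python 2 `etree.fromstring` receives a `unicode` string and implicitly encodes it with the `ascii` codec. The sketch below illustrates that failure mode and one possible workaround: encoding the SVG text to UTF-8 bytes before parsing. It is an assumption about the problem, not the actual code from plantuml-markdown or from the pull request; `parse_inline_svg` is a hypothetical helper name.

```python
# -*- coding: utf-8 -*-
import xml.etree.ElementTree as etree


def parse_inline_svg(data):
    """Hypothetical helper: parse SVG text that may contain non-ASCII characters."""
    # Encode text to UTF-8 bytes up front, so the XML parser never has to
    # fall back to an implicit 'ascii' encoding (the Python 2 failure mode).
    if not isinstance(data, bytes):
        data = data.encode("utf-8")
    return etree.fromstring(data)


svg = u'<svg xmlns="http://www.w3.org/2000/svg"><text>Zażółć gęślą jaźń</text></svg>'

# The same kind of error as in the traceback: these characters have no ASCII encoding.
try:
    svg.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

root = parse_inline_svg(svg)
label = root[0].text  # text of the first child element (<text>)
```

Encoding explicitly should behave the same on Python 2 and Python 3, since the parser accepts UTF-8 bytes on both.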

# pip show plantuml-markdown
Name: plantuml-markdown
Version: 2.0.1
...

# pip show plantuml
Name: plantuml
Version: 0.1.1
...

# pip show markdown
Name: Markdown
Version: 3.0.1
...

# python --version
Python 2.7.12
...

# uname -a
Linux LU-SWILKOLAZKI 4.15.0-43-generic #46~16.04.1-Ubuntu SMP Fri Dec 7 13:31:08 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Let me know if any other information is needed to resolve this issue.

mikitex70 commented 5 years ago

I'm unable to reproduce the error; are you using a Windows platform? Your pull request seems OK and causes no regression in the existing code, so I plan to merge it.

mikitex70 commented 5 years ago

Closing, as the bug is now fixed in the new release 2.0.2. Thanks, @wilkolazki.