marianoguerra / rst2html5

transform restructuredtext documents to html5 + twitter's bootstrap css, deck.js or reveal.js
http://marianoguerra.github.com/rst2html5
MIT License

Crash with non-ASCII characters in attributes #73

Closed torfsen closed 8 years ago

torfsen commented 8 years ago
$ python --version
Python 2.7.6

$ cat rst2html5_bug.rst 
.. figure:: image.png
    :alt: ö

$ rst2html5 --traceback rst2html5_bug.rst
Traceback (most recent call last):
  File "/home/torf/projects/statika/venv/bin/rst2html5", line 9, in <module>
    load_entry_point('rst2html5-tools==0.2.6', 'console_scripts', 'rst2html5')()
  File "/home/torf/projects/statika/venv/local/lib/python2.7/site-packages/html5css3/main.py", line 33, in main
    description=description)
  File "/home/torf/projects/statika/venv/local/lib/python2.7/site-packages/docutils/core.py", line 352, in publish_cmdline
    config_section=config_section, enable_exit_status=enable_exit_status)
  File "/home/torf/projects/statika/venv/local/lib/python2.7/site-packages/docutils/core.py", line 219, in publish
    output = self.writer.write(self.document, self.destination)
  File "/home/torf/projects/statika/venv/local/lib/python2.7/site-packages/docutils/writers/__init__.py", line 80, in write
    self.translate()
  File "/home/torf/projects/statika/venv/local/lib/python2.7/site-packages/html5css3/__init__.py", line 201, in translate
    tree = visitor.get_tree()
  File "/home/torf/projects/statika/venv/local/lib/python2.7/site-packages/html5css3/__init__.py", line 487, in get_tree
    return Html(self.head, self.root)
  File "/home/torf/projects/statika/venv/local/lib/python2.7/site-packages/html5css3/html.py", line 212, in __init__
    TagBase.__init__(self, childs, attrs)
  File "/home/torf/projects/statika/venv/local/lib/python2.7/site-packages/html5css3/html.py", line 60, in __init__
    escape_attrs(self)
  File "/home/torf/projects/statika/venv/local/lib/python2.7/site-packages/html5css3/html.py", line 43, in escape_attrs
    escape_attrs(child)
  File "/home/torf/projects/statika/venv/local/lib/python2.7/site-packages/html5css3/html.py", line 43, in escape_attrs
    escape_attrs(child)
  File "/home/torf/projects/statika/venv/local/lib/python2.7/site-packages/html5css3/html.py", line 43, in escape_attrs
    escape_attrs(child)
  File "/home/torf/projects/statika/venv/local/lib/python2.7/site-packages/html5css3/html.py", line 40, in escape_attrs
    for (key, val) in node.attrib.items()])
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 0: ordinal not in range(128)

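The problem seems to be that html5css3.html.escape_attrs uses str instead of unicode to convert attribute values into strings. On Python 2, calling str() on a unicode value implicitly encodes it as ASCII, which fails for characters like ö. If I replace the str(value) with unicode(value), things work again, although I haven't run many tests.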
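
For illustration, here is a minimal sketch of the failure and of the kind of fix described above. It is not the actual code from html5css3/html.py: the names to_ascii_bytes, escape_attrs_sketch and text_type are hypothetical, and the escaping rules are an assumption made only to keep the example self-contained. It runs on both Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
import sys

# Text type: `unicode` on Python 2, `str` on Python 3.
# (The conditional never evaluates `unicode` on Python 3.)
text_type = unicode if sys.version_info[0] == 2 else str  # noqa: F821


def to_ascii_bytes(val):
    """Do explicitly what Python 2's str() does implicitly to a unicode value."""
    return val.encode("ascii")


def escape_attrs_sketch(attrib):
    """Hypothetical stand-in for the attribute-escaping step.

    Converts every value with the text type instead of the byte-string
    type, so non-ASCII characters never pass through the ASCII codec.
    """
    replacements = (
        (u"&", u"&amp;"),  # must come first so entities are not re-escaped
        (u"<", u"&lt;"),
        (u">", u"&gt;"),
        (u'"', u"&quot;"),
    )
    escaped = {}
    for key, val in attrib.items():
        text = text_type(val)
        for char, entity in replacements:
            text = text.replace(char, entity)
        escaped[key] = text
    return escaped


# The ASCII conversion fails on the character from the report (u'\xf6').
try:
    to_ascii_bytes(u"\xf6")
    failed = False
except UnicodeEncodeError:
    failed = True

# The text-type conversion handles the same value, and non-string values too.
result = escape_attrs_sketch({"alt": u"\xf6", "width": 100})
```

Here failed ends up True, while result keeps the ö intact and turns the integer width into text. Whether the real escape_attrs needs more than the str to unicode swap (for example on Python 3) is not something this sketch can confirm.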

marianoguerra commented 8 years ago

hi, thanks for the report!

can you submit a pull request with the proposed change so I can merge it?

if it breaks something else we fix it forward :)

torfsen commented 8 years ago

I'll prepare a PR, although it may take me some time since things are pretty busy here at the moment.