fedora-static-analysis / firehose

Interchange format for results for static analysis tools

XML <----> Unicode issues #12

Closed paultag closed 11 years ago

paultag commented 11 years ago

Filing this issue as a placeholder so I don't forget.

Traceback (most recent call last):
  File "/home/tag/dev/local/storz/utils/../wrappers/storz-buildd-log-parser", line 75, in <module>
    print obj.to_xml_str()
  File "/home/tag/dev/local/firehose/firehose/report.py", line 90, in to_xml_str
    xml.write(output, encoding='utf-8')
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 821, in write
    serialize(write, self._root, encoding, qnames, namespaces)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 940, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 940, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 940, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 938, in _serialize_xml
    write(_escape_cdata(text, encoding))
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1074, in _escape_cdata
    return text.encode(encoding, "xmlcharrefreplace")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 0: ordinal not in range(128)

The source of the bad byte is unknown; it may be coming from the upstream xml module. Attempted to work around by setting encoding to UTF-8. Log uploaded to http://public.pault.ag/m68k_1180772182_log.bz2 .

Filing against Firehose for now, but the issue may be elsewhere.
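One plausible mechanism, not confirmed anywhere in this thread: the traceback shows a UnicodeDecodeError raised from an `.encode()` call, which in Python 2 typically means the element text was a byte string rather than a unicode object. Python 2 implicitly decodes such a string as ASCII before encoding it, and 0xe2 is the lead byte of many UTF-8 sequences (curly quotes, for instance). The sketch below is written for Python 3, where that implicit step has to be spelled out. `RAW`, `implicit_ascii_decode`, `to_text`, and `serialize` are hypothetical names for illustration and are not part of firehose.

```python
import io
import xml.etree.ElementTree as ET

# Assumed sample input: UTF-8 bytes for curly quotes, which start with 0xe2.
RAW = b"\xe2\x80\x98quoted\xe2\x80\x99"


def implicit_ascii_decode(raw):
    """Spell out the ASCII decode that Python 2 performs implicitly when
    .encode() is called on a byte string. Returns the error text, or None."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


def to_text(value, encoding="utf-8"):
    """Decode bytes to text before they reach the XML tree (hypothetical helper)."""
    if isinstance(value, bytes):
        return value.decode(encoding, "replace")
    return value


def serialize(text):
    """Build a one-element tree and serialize it as UTF-8 bytes."""
    root = ET.Element("message")
    root.text = to_text(text)
    buf = io.BytesIO()
    ET.ElementTree(root).write(buf, encoding="utf-8")
    return buf.getvalue()


if __name__ == "__main__":
    print(implicit_ascii_decode(RAW))
    print(serialize(RAW))
```

If this guess is right, the fix belongs wherever the log text enters the report (decode it once, up front), not in the `encoding` argument passed to `write()`.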

davidmalcolm commented 11 years ago

Looks like you have a local change:

File "/home/tag/dev/local/firehose/firehose/report.py", line 90, in to_xml_str
    xml.write(output, encoding='utf-8')

In my firehose/report.py, line 90 is just:

    xml.write(output)

paultag commented 11 years ago

Yep, that change was from trying to get around the issue. I made an offhand note of that in my bug report :)

Attempted to work around by setting encoding to UTF-8

(perhaps a bit unclear)

Happens with pristine firehose too. I think it might be upstream.
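For what it's worth, a small sketch of why the `encoding` argument may not be the relevant variable: when the element text is proper unicode text, ElementTree serializes it both with the default encoding (us-ascii, using character references) and with an explicit UTF-8 encoding. That would be consistent with the failure appearing with and without the local change, though the thread does not establish this. `render` is a hypothetical helper, not firehose code.

```python
import io
import xml.etree.ElementTree as ET


def render(text, encoding=None):
    """Serialize a single <message> element; encoding=None uses the
    ElementTree default (us-ascii with character references)."""
    root = ET.Element("message")
    root.text = text
    buf = io.BytesIO()
    ET.ElementTree(root).write(buf, encoding=encoding)
    return buf.getvalue()


# Text input containing curly quotes (assumed sample data).
default_out = render(u"\u2018x\u2019")
utf8_out = render(u"\u2018x\u2019", encoding="utf-8")

if __name__ == "__main__":
    print(default_out)
    print(utf8_out)
```

Both calls succeed on text input, so a failure on either path points at the type of the text stored in the tree, not at the output encoding.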

davidmalcolm commented 11 years ago

Does it work with the latest code? (i.e. after all our Python 3 fixes)

davidmalcolm commented 11 years ago

Paul, should I close this out, or is this still an issue?

paultag commented 11 years ago

Uch. Not sure. I'm still getting odd Unicode failures in some chroots with the Python 3 test suite, but I can't pinpoint it yet.

I'm going to close this until I get a real bug. Sorry :)