walidazizi / rdflib

Automatically exported from code.google.com/p/rdflib

RDFXML Parser breaks when handed a unicode object #108

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
With this error: 

UnicodeEncodeError: 'ascii' codec can't encode character u'\xf4' in position 328: ordinal not in range(128)

See http://code.google.com/p/rdflib/source/browse/trunk/test/test_issue084.py

(named after the issue for N3 parsing, which is fixed). 

The question is whether we want to allow passing a unicode object containing
an XML string to the parser.

Passing UTF-8 encoded strings works fine.
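
A minimal sketch of the behaviour described above, written for Python 3 (where the `unicode` type from this report is `str`). It uses only the standard library's `xml.sax` rather than rdflib, and the sample document, the `ElementCounter` handler and the `ascii_encode_fails` helper are made up for illustration. It shows that encoding the text as ASCII fails with the same kind of error, while an explicit UTF-8 encode produces bytes the SAX parser accepts.

```python
import xml.sax

# Hypothetical sample document containing a non-ASCII character (U+00F4).
rdfxml = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
    ' xmlns:ex="http://example.org/">'
    '<rdf:Description rdf:about="http://example.org/a">'
    '<ex:label>h\u00f4tel</ex:label>'
    '</rdf:Description>'
    '</rdf:RDF>'
)


class ElementCounter(xml.sax.ContentHandler):
    """Illustrative handler: counts start tags and collects character data."""

    def __init__(self):
        super().__init__()
        self.elements = 0
        self.text = []

    def startElement(self, name, attrs):
        self.elements += 1

    def characters(self, content):
        self.text.append(content)


def ascii_encode_fails(text):
    """Return True if encoding `text` as ASCII raises UnicodeEncodeError."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError:
        return True
    return False


# Encoding explicitly as UTF-8 yields bytes that the parser can consume.
rdfxml_utf8 = rdfxml.encode("utf-8")
handler = ElementCounter()
xml.sax.parseString(rdfxml_utf8, handler)
```

Whether rdflib's own RDF/XML parser behaves identically depends on the rdflib and Python versions in use; the sketch only demonstrates the encoding step itself.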

Original issue reported on code.google.com by gromgull on 2 Feb 2010 at 9:39

GoogleCodeExporter commented 8 years ago
Where can I find the test you linked? Is this issue fixed?

Original comment by danielmr...@gmail.com on 30 Apr 2011 at 8:24

GoogleCodeExporter commented 8 years ago
How to test this?

assert str(BNode(u"ü")) == '\\xc3\\x83\\xc2\\xbc'
assert str(BNode(u"ü")) == u"ü"

The problem is still unsolved.

Original comment by danielmr...@gmail.com on 9 May 2011 at 8:25

GoogleCodeExporter commented 8 years ago
The test is:
http://code.google.com/p/rdflib/source/browse/test/test_issue084.py

I have to double-check whether this is fixed now. I think I was undecided
about what to do in the end.

Original comment by gromgull on 9 May 2011 at 8:29

GoogleCodeExporter commented 8 years ago
The decision in test_issue084.py is to encode the file before parsing:

rdfxml_utf8 = rdfxml.encode('utf-8')

This works fine if you pass a string to the parser, because
rdflib.parser.create_input_source performs this check on "data":
        if isinstance(data, unicode):
            data = data.encode('utf-8')
But is the same check applied to file or URL input?
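
The check quoted above can be sketched in isolation as follows. This is a Python 3 illustration, not rdflib's actual code: `to_byte_stream` is a hypothetical helper name, and `str` stands in for the Python 2 `unicode` type.

```python
import io


def to_byte_stream(data):
    """Hypothetical helper: return a binary stream for str or bytes input."""
    # Text is encoded to UTF-8 first, mirroring the isinstance check above.
    if isinstance(data, str):
        data = data.encode("utf-8")
    return io.BytesIO(data)


# Text and already-encoded input end up as the same bytes.
from_text = to_byte_stream("h\u00f4tel").read()
from_bytes = to_byte_stream("h\u00f4tel".encode("utf-8")).read()
```

Normalising at a single entry point like this means the parser itself only ever sees bytes, whichever form the caller passed in.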

Original comment by danielmr...@gmail.com on 9 May 2011 at 8:42

GoogleCodeExporter commented 8 years ago
test_issue084.py documents the agreement that was reached.

File or URL input cannot be unicode, since bits crossing the wire have to be
encoded somehow.
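
To illustrate the point about files, here is a small Python 3 sketch using only the standard library (the file content and the `label` element are invented for the example). A file opened in binary mode yields bytes rather than text, and the reader has to be told how to decode them, or has to discover it from the XML declaration.

```python
import os
import tempfile

text = '<?xml version="1.0" encoding="utf-8"?><label>h\u00f4tel</label>'

# Writing text to disk forces a choice of encoding.
with tempfile.NamedTemporaryFile("wb", suffix=".xml", delete=False) as f:
    f.write(text.encode("utf-8"))
    path = f.name

try:
    # Reading in binary mode returns the raw bytes that were stored.
    with open(path, "rb") as f:
        raw = f.read()
finally:
    os.remove(path)

# Decoding with the right codec recovers the text; the wrong one garbles it.
decoded_ok = raw.decode("utf-8")
decoded_wrong = raw.decode("latin-1")
```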

Original comment by gromgull on 13 Jan 2012 at 10:10