Letractively / rdflib

Automatically exported from code.google.com/p/rdflib

UnicodeEncodeError: 'ascii' codec can't encode character - N3Parser breaks when parsing unicode #84

Closed: GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
What steps will reproduce the problem?

from codecs import getreader
from rdflib.Graph import Graph
from StringIO import StringIO

rdf = getreader('utf-8')(StringIO(u"""@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix : <http://www.test.org/#> .

:world rdf:type skos:Concept;
    skos:prefLabel "World"@en.
:africa rdf:type skos:Concept;
    skos:prefLabel "Africa"@en;
    skos:broaderTransitive :world.
:CI rdf:type skos:Concept;
    skos:prefLabel "C\u00f4te d'Ivoire"@en;
    skos:broaderTransitive :africa.    
""".encode('utf-8')))

g = Graph()
g.parse(rdf, 'http://www.test.org', 'n3')
print len(g)

What is the expected output? What do you see instead?

Expected: 3
Actual: UnicodeEncodeError: 'ascii' codec can't encode character u'\xf4' in position 1: ordinal not in range(128)
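
The error message above is what Python raises whenever text containing a non-ASCII character is encoded with the `ascii` codec. The sketch below reproduces only that mechanism, not rdflib's own code path. It is written for Python 3 (the report used Python 2.4), and the variable names are illustrative.

```python
# Illustrative only: reproduces the codec error, not the rdflib parser itself.
label = "C\u00f4te d'Ivoire"  # same literal as in the report; \u00f4 is "ô"

try:
    label.encode("ascii")  # the ascii codec only covers code points 0..127
    error = None
except UnicodeEncodeError as exc:
    error = exc

# The failing character and its position match the message in the report.
failing_char = label[error.start]
print(error.encoding, error.start, repr(failing_char))
```

Any implicit conversion of such text to ASCII, for example when writing it to a byte-oriented buffer, would fail at the same character.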

What version of the product are you using? On what operating system?

rdflib 2.4.2 / Windows Vista 64 bit / Python 2.4.4

Please provide any additional information below.

The parse fails in the same way if you pass in a Unicode file opened with
codecs.open('filename', 'r', 'utf-8').
It parses without error if you pass in a file opened with
open('filename', 'r'); however, characters outside ASCII then get double
escaped, e.g. "\\xf4".
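
The two ways of opening the file hand the parser different kinds of data: a decoding reader yields Unicode text, while a plain byte stream yields undecoded UTF-8 bytes. The following is a rough sketch in Python 3 using only the standard library. An in-memory `io.BytesIO` stands in for the file, the sample line is hypothetical, and how rdflib 2.4.2 treated each case internally is not shown here.

```python
import codecs
import io

# In-memory stand-in for a UTF-8 encoded N3 file (hypothetical content).
raw = 'skos:prefLabel "C\u00f4te d\'Ivoire"@en .'.encode("utf-8")

# Like codecs.open(..., 'r', 'utf-8'): reads return decoded text.
decoded = codecs.getreader("utf-8")(io.BytesIO(raw)).read()

# Like a plain byte-oriented open(): reads return raw bytes.
undecoded = io.BytesIO(raw).read()

# What "double escaped" looks like: the four characters \ x f 4
# appear in the text instead of the single character "ô".
escaped = "\u00f4".encode("unicode_escape").decode("ascii")

print(type(decoded).__name__, type(undecoded).__name__, escaped)
```

The `unicode_escape` step is only one way such escaped text can arise; it is shown to make the symptom concrete, not as the confirmed cause.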

Original issue reported on code.google.com by anthonyg...@gmail.com on 24 Sep 2009 at 2:12

GoogleCodeExporter commented 9 years ago
Added test case in r1714.

Original comment by eik...@gmail.com on 24 Sep 2009 at 9:06

GoogleCodeExporter commented 9 years ago

Original comment by eik...@gmail.com on 1 Feb 2010 at 7:39

GoogleCodeExporter commented 9 years ago

Original comment by eik...@gmail.com on 1 Feb 2010 at 8:04

GoogleCodeExporter commented 9 years ago

Original comment by gromgull on 1 Feb 2010 at 11:54

GoogleCodeExporter commented 9 years ago
This is fixed for N3 in r1755, but see issue 108.

Original comment by gromgull on 2 Feb 2010 at 9:39

GoogleCodeExporter commented 9 years ago
This issue was updated by revision 71ece3659599.

Original comment by eik...@gmail.com on 30 Mar 2011 at 9:06

GoogleCodeExporter commented 9 years ago
This issue was updated by revision ff1300f36d53.

...

Original comment by eik...@gmail.com on 30 Mar 2011 at 9:06