Letractively / rdflib

Automatically exported from code.google.com/p/rdflib

n3 parse error on \u043d\u04... #23

Closed. GoogleCodeExporter closed this issue 9 years ago

GoogleCodeExporter commented 9 years ago
Input file anna.n3 (based on a DBpedia entry):

<http://dbpedia.org/resource/Anna_Kournikova>
<http://dbpedia.org/property/abstract> "Anna Sergeyevna Kournikova (Russian: \u0410\u043d\u043d\u0430 \u0421\u0435\u0440\u0433\u0435\u0435\u0432\u043d\u0430 \u041a\u0443\u0440\u043d\u0438\u043a\u043e\u0432\u0430 " .

% python -c "from rdflib.Graph import ConjunctiveGraph as C;
C().parse('anna.n3', format='n3')"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "build/bdist.linux-i686/egg/rdflib/Graph.py", line 828, in parse
  File "build/bdist.linux-i686/egg/rdflib/Graph.py", line 661, in parse
  File "build/bdist.linux-i686/egg/rdflib/syntax/parsers/N3Parser.py", line 32, in parse
  File "build/bdist.linux-i686/egg/rdflib/syntax/parsers/n3p/n3proc.py", line 119, in parse
  File "build/bdist.linux-i686/egg/rdflib/syntax/parsers/n3p/n3p.py", line 117, in parse
  File "build/bdist.linux-i686/egg/rdflib/syntax/parsers/n3p/n3proc.py", line 131, in onFinish
  File "build/bdist.linux-i686/egg/rdflib/syntax/parsers/n3p/n3proc.py", line 397, in literalFinish
  File "build/bdist.linux-i686/egg/rdflib/syntax/parsers/n3p/n3proc.py", line 477, in literal
  File "build/bdist.linux-i686/egg/rdflib/syntax/parsers/n3p/n3proc.py", line 81, in unquote
rdflib.syntax.parsers.n3p.n3proc.ParseError: Illegal escape at: \u043d\u04...

cwm is able to read the file.

Original issue reported on code.google.com by drewpca on 24 Apr 2008 at 7:45

GoogleCodeExporter commented 9 years ago

Original comment by eik...@gmail.com on 10 Feb 2009 at 8:33

GoogleCodeExporter commented 9 years ago
This simpler n3 file has the same issue:

<http://example.com/article1> <http://example.com/title> "this word is in \u201cquotes\u201d".

The nt parser can read it, but the n3 parser can't. I was using the n3 format
because I need relative URLs.

Original comment by drewpca on 11 Jun 2009 at 5:55

GoogleCodeExporter commented 9 years ago
Oops, n3 is supposed to have uppercase hex in \u and \U escapes, so this error
is correct. Although I wouldn't mind if we matched the cwm behavior and
permitted lowercase hex too :)
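
In the meantime, a possible workaround is to uppercase the hex digits before handing the text to the n3 parser. This is only a minimal sketch, not code from rdflib: the function name uppercase_unicode_escapes is made up for illustration, and it rewrites \u and \U escapes anywhere in the input, skipping ones whose backslash is itself escaped.

```python
import re

# Hypothetical helper, not part of rdflib.
# Matches \uXXXX or \UXXXXXXXX where the backslash is not itself escaped:
# group 1 holds any preceding pairs of backslashes (escaped backslashes),
# group 2 holds the 'u'/'U' marker plus the hex digits.
_ESCAPE = re.compile(r'(?<!\\)((?:\\\\)*)\\(u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})')


def uppercase_unicode_escapes(text):
    """Return text with the hex digits of \\u and \\U escapes uppercased."""
    def fix(match):
        esc = match.group(2)
        # Keep the u/U marker as written; uppercase only the hex digits.
        return match.group(1) + '\\' + esc[0] + esc[1:].upper()
    return _ESCAPE.sub(fix, text)


if __name__ == '__main__':
    line = ('<http://example.com/article1> <http://example.com/title> '
            '"this word is in \\u201cquotes\\u201d".')
    print(uppercase_unicode_escapes(line))
```

The normalized text can then be written to a temporary file and parsed with format='n3' as in the original report.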

Original comment by drewpca on 11 Jun 2009 at 6:19

GoogleCodeExporter commented 9 years ago

Original comment by eik...@gmail.com on 1 Feb 2010 at 8:34

GoogleCodeExporter commented 9 years ago
Fixed in r1755

Original comment by gromgull on 2 Feb 2010 at 9:46