RDFLib / rdflib

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information.
https://rdflib.readthedocs.org
BSD 3-Clause "New" or "Revised" License

ParseError: does not recognize predicate as URI #909

Closed AlexisPister closed 4 years ago

AlexisPister commented 5 years ago

Hello,

I am trying to parse an .nq file, but I get the following error when calling the parse function:

ParseError: Invalid line (Predicate must be uriref): 'soroosh@cs.toronto.edu http://xmlns.com/foaf/0.1/name "Soroosh Nalchigar" https://lov.linkeddata.es/dataset/lov .'

The statement causing the error does not look invalid to me. Do you know where the error could come from, and how I could parse this kind of statement?

Thanks

GillesVandewiele commented 5 years ago

Hmmm, I count 4 chunks if I split on whitespace? It seems to be trying to parse "Soroosh Nalchigar" as a predicate, which must always be a URI.

AlexisPister commented 5 years ago

It is an N-Quads file that I am trying to parse, which is made up of quads ("[subject] [predicate] [object] [graph]"). All the lines of my file have this form, and most of them parse without any problem.

GillesVandewiele commented 5 years ago

Hmmm of course, sorry about that... I once had to parse a file with quads as well, which had to be done with a ConjunctiveGraph:

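import rdflib

g = rdflib.ConjunctiveGraph()  # a graph that can hold named graphs (quads)
g.parse(data=r.text, format='trig')  # r: an HTTP response whose body is TriG text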

Source: https://medium.com/@gillesvandewiele/the-new-smart-flanders-api-a-demo-9921d8a05abd

parthrohilla commented 5 years ago

I am also running into a problem with an N-Quads dataset. I get the following error when I try to load the data with RDFLib:

g.load('C://Users//Parth//Downloads//eventkg_1.1.tar//data//output//events.nq',format="nquads")

Traceback (most recent call last):
  File "", line 1, in
    g.load('C://Users//Parth//Downloads//eventkg_1.1.tar//data//output//events.nq',format="nquads")
  File "C:\Users\Parth\Anaconda3\lib\site-packages\rdflib\graph.py", line 1050, in load
    self.parse(source, publicID, format)
  File "C:\Users\Parth\Anaconda3\lib\site-packages\rdflib\graph.py", line 1043, in parse
    parser.parse(source, self, **args)
  File "C:\Users\Parth\Anaconda3\lib\site-packages\rdflib\plugins\parsers\nquads.py", line 66, in parse
    raise ParseError("Invalid line (%s):\n%r" % (msg, __line))
ParseError: Invalid line (Subject must be uriref or nodeID): '@Prefix rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns# .'
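The rejected line in this traceback is an @prefix directive, which suggests the file is not plain N-Quads. As a rough pre-check, the standard-library sketch below scans a file-like object for directive lines before handing it to a parser. The helper name `find_directives` and the example.org sample data are hypothetical, made up for illustration; this is not part of RDFLib.

```python
import io


def find_directives(lines):
    """Return (line_number, text) for lines that look like Turtle/TriG directives."""
    hits = []
    for number, line in enumerate(lines, start=1):
        stripped = line.strip()
        # N-Quads has no directives; @prefix/@base (or SPARQL-style PREFIX/BASE)
        # point to Turtle or TriG syntax instead.
        first = stripped.split(None, 1)[0].lower() if stripped else ""
        if first in ("@prefix", "@base", "prefix", "base"):
            hits.append((number, stripped))
    return hits


# Hypothetical two-line sample: one directive, one well-formed quad.
sample = io.StringIO(
    "@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .\n"
    "<http://example.org/s> <http://example.org/p> "
    "<http://example.org/o> <http://example.org/g> .\n"
)
directives = find_directives(sample)
print(directives)
```

If the list is non-empty, the file is probably better read with a format that allows prefixes, or the directive lines need to be expanded away first.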

jpmccu commented 5 years ago

The quad you gave:

soroosh@cs.toronto.edu http://xmlns.com/foaf/0.1/name "Soroosh Nalchigar" https://lov.linkeddata.es/dataset/lov .

is not a valid quad: in N-Quads, every IRI must be absolute and enclosed in angle brackets. It should be:

<mailto:soroosh@cs.toronto.edu> <http://xmlns.com/foaf/0.1/name> "Soroosh Nalchigar" <https://lov.linkeddata.es/dataset/lov> .
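To make the difference between the two lines concrete, here is a minimal standard-library sketch that classifies each term on a line and flags the ones that are not written in N-Quads syntax. The regular expression is a deliberate simplification of the N-Quads grammar, and the names `classify_terms` and `check_quad` are hypothetical. It does not reproduce RDFLib's parser or its error messages.

```python
import re

# Rough token pattern: <IRI>, _:blank, "literal" (with optional ^^<datatype>
# or @lang), or any other run of non-space characters. A simplification of
# the N-Quads grammar, not a full parser.
TOKEN = re.compile(
    r'(?P<iri><[^<>\s]*>)'
    r'|(?P<bnode>_:\S+)'
    r'|(?P<literal>"(?:[^"\\]|\\.)*"(?:\^\^<[^<>\s]*>|@[A-Za-z0-9-]+)?)'
    r'|(?P<other>\S+)'
)


def classify_terms(line):
    """Return (kind, text) pairs for the terms on one line, minus the final dot."""
    body = line.strip()
    if body.endswith("."):
        body = body[:-1]
    return [(m.lastgroup, m.group()) for m in TOKEN.finditer(body)]


def check_quad(line):
    """Return a list of problems; an empty list means the line looks fine."""
    terms = classify_terms(line)
    if len(terms) not in (3, 4):
        return [f"expected 3 or 4 terms, found {len(terms)}"]
    allowed = [
        ("subject", {"iri", "bnode"}),
        ("predicate", {"iri"}),
        ("object", {"iri", "bnode", "literal"}),
        ("graph", {"iri", "bnode"}),
    ]
    problems = []
    for (role, kinds), (kind, text) in zip(allowed, terms):
        if kind not in kinds:
            problems.append(f"{role} {text!r} is not valid here")
    return problems


bad = ('soroosh@cs.toronto.edu http://xmlns.com/foaf/0.1/name '
       '"Soroosh Nalchigar" https://lov.linkeddata.es/dataset/lov .')
good = ('<mailto:soroosh@cs.toronto.edu> <http://xmlns.com/foaf/0.1/name> '
        '"Soroosh Nalchigar" <https://lov.linkeddata.es/dataset/lov> .')

bad_problems = check_quad(bad)
good_problems = check_quad(good)
print(bad_problems)
print(good_problems)
```

In this sketch the bare tokens fall into the `other` category, so the subject, predicate, and graph of the first line are all flagged, while the bracketed version passes.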

Additionally, @prefix declarations are not supported in N-Quads, only in formats such as TriG.

nicholascar commented 4 years ago

This is not an RDFLib bug but a user error in the N-Quads input, as indicated by @jimmccusker, so closing.