RDFLib / sparqlwrapper

A wrapper for a remote SPARQL endpoint
https://sparqlwrapper.readthedocs.io/

Variable order in rdflib.query.QueryResult.serialize() #169

Open chiarcos opened 3 years ago

chiarcos commented 3 years ago

I might be missing something, but in general, I would expect the order of variables from a SELECT statement to be preserved in, say, format="txt". Instead, they seem to be ordered lexicographically. Can we recover the original order somehow?

    import rdflib

    def query_roles(ttl):
        # Parse the Turtle sample data (given below) into an in-memory graph.
        g = rdflib.Graph()
        g.parse(data=ttl, format="ttl")
        # The SELECT clause lists the variables in the order expected in the output.
        qres = g.query(
            """SELECT ?p ?pred ?role ?a ?type 
            WHERE { 
                { ?p a ?pred. FILTER(strstarts(str(?pred),"http://purl.org/acoli/open-ie/props:"))
                  OPTIONAL { ?p ?role ?a. FILTER(strstarts(str(?role),"http://purl.org/acoli/open-ie/roles:")) 
                    OPTIONAL { ?a a ?type. FILTER(strstarts(str(?role),"http://purl.org/acoli/open-ie/")) }
                  }
                } UNION {
                  ?a a ?type. FILTER(strstarts(str(?role),"http://purl.org/acoli/open-ie/"))
                  MINUS { [] ?role ?a. FILTER(strstarts(str(?pred),"http://purl.org/acoli/open-ie/roles:")) }
                  OPTIONAL { ?a a ?type. FILTER(strstarts(str(?role),"http://purl.org/acoli/open-ie/")) }
                }
            } ORDER BY ?p ?role ?a ?type """)
        # Serialize as a plain-text table, abbreviating IRIs with the graph's prefixes.
        return qres.serialize(format="txt",namespace_manager=g.namespace_manager).decode("utf-8")

Sample output:

  a   |  p  |      pred      |        role        |type
-----------------------------------------------------
:s1_11|:s1_4|terms:props:_-01|terms:roles:ARGM-LOC|-
:s1_2 |:s1_4|terms:props:_-01|terms:roles:ARGM-TMP|-
:s1_3 |:s1_4|terms:props:_-01|terms:roles:ARG1    |-
:s1_7 |:s1_4|terms:props:_-01|terms:roles:ARGM-LOC|-

Expected output:

  p  |      pred      |        role        |  a   |type
-----------------------------------------------------
:s1_4|terms:props:_-01|terms:roles:ARGM-LOC|:s1_11|-
:s1_4|terms:props:_-01|terms:roles:ARGM-TMP|:s1_2 |-
:s1_4|terms:props:_-01|terms:roles:ARG1    |:s1_3 |-
:s1_4|terms:props:_-01|terms:roles:ARGM-LOC|:s1_7 |-

Sample data:

 # The 2010 Excavation Season at the Site of the Vicus ad Martis Tudertium (PG)
 @prefix : <file:///home/chiarcos/semantic-parsing/#> .
 @prefix powla: <http://purl.org/powla/powla.owl#> .
 @prefix conll: <http://ufal.mff.cuni.cz/conll2009-st/task-description.html#> .
 @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
 @prefix terms: <http://purl.org/acoli/open-ie/> .
 @prefix x: <http://purl.org/acoli/conll-rdf/xml#> .
 @prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
 @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

 :s1_2 terms:full_label "2010" ; terms:ids "2" ; terms:label "2010" ; terms:score 0.9296875 .
 :s1_3 a terms:types:Excavation ; terms:full_label "Excavation" ; terms:ids "3" ; terms:label "Excavation" ; terms:score 0.92578125 .
 :s1_4 a terms:props:_-01 ; terms:full_label "The 2010 Excavation Season at the Site of the Vicus ad Martis Tudertium PG" ; terms:roles:ARG1 :s1_3 ; terms:roles:ARGM-LOC :s1_11 , :s1_7 ; terms:roles:ARGM-TMP :s1_2 ; terms:score 0.984375 .
 :s1_7 a terms:props:_-01 ; terms:full_label "at the Site of the Vicus ad Martis Tudertium PG" ; terms:roles:ARG1 :s1_11 ; terms:score 0.91015625 .
 :s1_10 terms:full_label "Vicus" ; terms:ids "10" ; terms:label "Vicus" ; terms:score 0.7109375 .
 :s1_11 a terms:props:ad-01 ; terms:full_label "Vicus ad" ; terms:roles:ARG1 :s1_10 ; terms:score 0.76953125 .

(NB: I know there are workarounds; it just seems unnecessary that such elementary information gets lost, so I wonder whether this was intentional and, if so, what the objective was.)
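
One such workaround could look roughly like the sketch below. It assumes that the result object exposes the projected variables in SELECT order as `vars`, and that each row can be indexed by variable name, as rdflib's result rows allow. The helper name `table_in_select_order` and the `missing` parameter are made up for illustration and are not part of rdflib. Cells are rendered with `str()`, so IRIs would appear in full rather than as prefixed names.

```python
def table_in_select_order(result, missing="-"):
    """Render SELECT results as a text table, keeping the projection order.

    Assumption: `result.vars` lists the variables in SELECT order, and each
    row supports lookup by variable name (row["p"]), with None for unbound.
    """
    names = [str(v) for v in result.vars]

    # Build the cells column by column in the order given by `names`.
    rows = [
        [missing if row[n] is None else str(row[n]) for n in names]
        for row in result
    ]

    # Each column is as wide as its longest cell (or its header).
    widths = [
        max([len(n)] + [len(r[i]) for r in rows])
        for i, n in enumerate(names)
    ]

    header = "|".join(n.center(w) for n, w in zip(names, widths))
    lines = [header, "-" * len(header)]
    for r in rows:
        lines.append("|".join(c.ljust(w) for c, w in zip(r, widths)))
    return "\n".join(lines)
```

With the query above, this would be called as `table_in_select_order(qres)` in place of `qres.serialize(format="txt", ...)`.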

PritishWadhwa commented 2 years ago

Hi, I tried to replicate your issue, but I only got the desired output. Could you please help me reproduce it so that I can try to fix it?