Closed LorenzBuehmann closed 10 months ago
Thanks for creating the issue. PR #9136 should fix this.
I initially focused on RDF in the tests; it would be sensible to extend them to RDFS and OWL, though. Would you like to give this a go, e.g., based on the example OWL file you used? If yes, I am happy to support/review.
System Info
langchain = 0.0.251 Python = 3.10.11
Who can help?
No response
Information
Related Components
Reproduction
dbpedia_sample.ttl
with the following:
:Actor a owl:Class ; rdfs:comment "An actor or actress is a person who acts in a dramatic production and who works in film, television, theatre, or radio in that capacity."@en ; rdfs:label "actor"@en ; rdfs:subClassOf :Artist ; owl:equivalentClass wikidata:Q33999 ; prov:wasDerivedFrom <http://mappings.dbpedia.org/index.php/OntologyClass:Actor> .
:AdministrativeRegion a owl:Class ; rdfs:comment "A PopulatedPlace under the jurisdiction of an administrative body. This body may administer either a whole region or one or more adjacent Settlements (town administration)"@en ; rdfs:label "administrative region"@en ; rdfs:subClassOf :Region ; owl:equivalentClass <http://schema.org/AdministrativeArea>, wikidata:Q3455524 ; prov:wasDerivedFrom <http://mappings.dbpedia.org/index.php/OntologyClass:AdministrativeRegion> .
:birthPlace a rdf:Property, owl:ObjectProperty ; rdfs:comment "where the person was born"@en ; rdfs:domain :Animal ; rdfs:label "birth place"@en ; rdfs:range :Place ; rdfs:subPropertyOf dul:hasLocation ; owl:equivalentProperty <http://schema.org/birthPlace>, wikidata:P19 ; prov:wasDerivedFrom <http://mappings.dbpedia.org/index.php/OntologyProperty:birthPlace> .
Expected behavior
The issue is that in the SPARQL queries that fetch the properties, the rdfs:comment triple pattern always refers to the variable ?cls, which obviously comes from copy/paste code. For example, in the query for getting the RDFS properties, you can see that the OPTIONAL clause refers to ?cls, but it should be ?rel. The same holds for all other queries regarding properties.
The current state leads to a Cartesian product of the properties and all rdfs:comment values in the dataset, which can be horribly large and of course leads to misleading and huge prompts (see the output of my sample in the "reproduction" part).