neo4j-labs / neosemantics

Graph+Semantics: Import/Export RDF from Neo4j. SHACL Validation, Model mapping and more.... If you like it, please ★ ⇧
https://neo4j.com/labs/neosemantics/
Apache License 2.0

Importing ChEBI .owl ontology with n10s.rdf.import.fetch generates incorrect neo4j graph #324

Open penguinoctopus opened 3 months ago

penguinoctopus commented 3 months ago

I am trying to import the .owl file from the official ChEBI website, which can be downloaded with:

curl -O https://ftp.ebi.ac.uk/pub/databases/chebi/ontology/chebi.owl

I am using Neo4j 5.20.0 and the neosemantics .jar found at:

https://github.com/neo4j-labs/neosemantics/releases

Logging seems to be set up correctly (I am using slf4j-api-2.0.7.jar and slf4j-simple-2.0.7.jar), and I see no errors when starting Neo4j with neo4j start. I then ran the following commands:

CALL n10s.graphconfig.init();
CALL n10s.rdf.import.fetch("file:///C:/Users/nikit/Downloads/chebi.owl", "RDF/XML");

In Neo4j, the ontology imports without errors, but when I inspect the nodes and relationships they are wrong: many relationships and node properties that should be there are missing. I asked ChEBI support and inspected their .owl file, and everything looks fine on their side, so I think this is a neosemantics issue. As an example, the nodes have no :ChEBI labels (but they should) and there are no has_role relationships:

https://www.ebi.ac.uk/chebi/searchId.do?chebiId=CHEBI:46195
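To see what the import actually produced, a quick inventory of node labels and relationship types can help. This is a minimal sketch: it assumes the default n10s behaviour of tagging every imported node with the Resource label, and the aliases (label, nodeCount, relType, relCount) are only illustrative.

```cypher
// Count imported nodes per label (n10s adds :Resource to every imported node)
MATCH (n:Resource)
UNWIND labels(n) AS label
RETURN label, count(*) AS nodeCount
ORDER BY nodeCount DESC;

// Count relationships per type
MATCH ()-[r]->()
RETURN type(r) AS relType, count(*) AS relCount
ORDER BY relCount DESC;
```

If has_role never shows up as a relationship type here, the information is probably stored in some other shape rather than lost.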

I have also asked the question on neo4j forum:

https://community.neo4j.com/t/import-of-chebi-owl-seems-not-correct/68480

penguinoctopus commented 3 months ago

I found that the has role relationship is encoded as a node with the label owl__ObjectProperty. I then wrote a query that uses this node to traverse the has role relationship in a different way, given the imported graph:

MATCH (r:owl__Restriction)-[:owl__onProperty]->(n:owl__ObjectProperty {rdfs__label: 'has role'})
MATCH (r)-[:owl__someValuesFrom]->(c:owl__Class {rdfs__label: 'sedative'})
MATCH (c2)-[:rdfs__subClassOf]->(r:owl__Restriction)
RETURN c2

It seems to work: the results agree with the ChEBI website search when I check them.
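If a direct relationship is more convenient, the same restriction pattern could be used to materialize one between the classes. This is only a sketch under assumptions: the has_role relationship type is a name picked here for illustration (n10s does not create it), and rdfs__label is assumed to be stored as a single string, as in the query above.

```cypher
// Find every class whose superclass is a "has role some X" restriction
MATCH (c2:owl__Class)-[:rdfs__subClassOf]->(r:owl__Restriction)
MATCH (r)-[:owl__onProperty]->(:owl__ObjectProperty {rdfs__label: 'has role'})
MATCH (r)-[:owl__someValuesFrom]->(c:owl__Class)
// Create a direct relationship; MERGE keeps reruns from adding duplicates
MERGE (c2)-[:has_role]->(c);

// The sedative lookup then becomes a single hop
MATCH (c2)-[:has_role]->(:owl__Class {rdfs__label: 'sedative'})
RETURN c2;
```

On a graph the size of ChEBI, the MERGE step may need to be run in batches.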