geneontology / go-graphstore

Gene Ontology RDF GraphStore

Evaluate materialized inference strategy #6

Closed cmungall closed 7 years ago

cmungall commented 8 years ago

This ticket is to evaluate:

https://github.com/geneontology/go-graphstore/pull/2

cmungall commented 8 years ago

make rdfox.ttl
export JAVA_OPTS="-Xmx32G" && mkdir -p tmp && mv rdf/go-lego-merged.owl tmp/ && rdfox-cli --ontology=tmp/go-lego-merged.owl --data=rdf --threads=24 --reason --export=rdfox.ttl --inferred-only && mv tmp/go-lego-merged.owl rdf/
Loaded ontology from file in 84.594s
Imported ontology into RDFox rules in 26.284s
Set number of threads in 53.114s
Imported data files in 8.021s
Applied reasoning in 30.989s
Exported data to Turtle in 80.478s

balhoff commented 8 years ago

Here's a run which includes all the Noctua models in addition to the converted GAFs:

make rdfox.ttl
export JAVA_OPTS="-Xmx32G" && mkdir -p tmp && mv rdf/go-lego-merged.owl tmp/ && rdfox-cli --ontology=tmp/go-lego-merged.owl --data=rdf --threads=24 --reason --export=rdfox.ttl --inferred-only && mv tmp/go-lego-merged.owl rdf/
Loaded ontology from file in 84.21s
Imported ontology into RDFox rules in 26.872s
Set number of threads in 55.796s
Imported data files in 7.863s
Applied reasoning in 29.544s
Exported data to Turtle in 75.591s

Negligible difference.
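
For anyone who wants to check the two runs phase by phase, here is a minimal sketch that parses the log lines quoted in this thread and prints the per-phase deltas and totals. The function name `parse_timings` and the constants `RUN_1` and `RUN_2` are made up for illustration; the only assumption is that each log line ends in `in <seconds>s`, as in the output above.

```python
import re

# Matches log lines such as "Applied reasoning in 30.989s".
TIMING = re.compile(r"^(?P<phase>.+?) in (?P<seconds>\d+(?:\.\d+)?)s$")


def parse_timings(log: str) -> dict[str, float]:
    """Map each phase description to its duration in seconds."""
    timings = {}
    for line in log.splitlines():
        match = TIMING.match(line.strip())
        if match:
            timings[match["phase"]] = float(match["seconds"])
    return timings


# First run, copied from the earlier comment in this thread.
RUN_1 = """\
Loaded ontology from file in 84.594s
Imported ontology into RDFox rules in 26.284s
Set number of threads in 53.114s
Imported data files in 8.021s
Applied reasoning in 30.989s
Exported data to Turtle in 80.478s
"""

# Second run, which also includes the Noctua models.
RUN_2 = """\
Loaded ontology from file in 84.21s
Imported ontology into RDFox rules in 26.872s
Set number of threads in 55.796s
Imported data files in 7.863s
Applied reasoning in 29.544s
Exported data to Turtle in 75.591s
"""

first = parse_timings(RUN_1)
second = parse_timings(RUN_2)

# Positive delta means the second run was slower for that phase.
for phase in first:
    print(f"{phase}: {second[phase] - first[phase]:+.3f}s")

print(f"total: {sum(first.values()):.3f}s vs {sum(second.values()):.3f}s")
```

The totals come out within a few seconds of each other (roughly 283 s versus 280 s), and the reasoning step itself differs by under two seconds.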

cmungall commented 8 years ago

Can we also use owltools plus EMR to saturate the tbox?

balhoff commented 8 years ago

Some results from adding further ontologies to the tbox:

There are about 22 million asserted triples currently.

Using go-lego.owl (as before), about 16 million triples are inferred.

Adding the full RO to this, about 52 million triples are inferred.

Adding both Uberon and PO (in addition to RO), the number of inferred triples jumps to 470 million! I was a little surprised by this. It seems like this would mainly increase inferred superclasses. I suppose I could look into which predicates are used in the new triples.
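
One way to look into the predicates would be a simple tally. The sketch below is only an illustration, not what was actually run here: it assumes the inferred triples have first been converted to N-Triples (one triple per line), whereas the export above is Turtle, which cannot be split line by line this way. The function name `count_predicates` and the sample triples with `example.org` IRIs are made up.

```python
from collections import Counter


def count_predicates(lines):
    """Count predicate terms in an iterable of N-Triples lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # In N-Triples the subject and predicate never contain whitespace,
        # so the second whitespace-separated token is the predicate.
        _subject, predicate, _rest = line.split(None, 2)
        counts[predicate] += 1
    return counts


# Made-up sample data, standing in for the real inferred triples.
SAMPLE = [
    "<http://example.org/a> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.org/C> .",
    "<http://example.org/b> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://example.org/D> .",
    "<http://example.org/C> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://example.org/D> .",
]

# Print predicates from most to least frequent.
for predicate, n in count_predicates(SAMPLE).most_common():
    print(n, predicate)
```

Because the function takes any iterable of lines, an open file handle can be passed directly, so a large file is streamed rather than read into memory.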

In all cases, reasoning is still a matter of a few minutes. It does take a little over an hour to load 470 million statements into Blazegraph.

cmungall commented 7 years ago

Closed with our documented strategy: https://docs.google.com/document/d/1sQnNoCmneLjZPsUc6iBgbEkuqhxihFn09u9RrpHcUc8/edit#