pre-reasoning performance improvements

stuckyb / ontopilot

pre-reasoning performance improvements #58

Closed stuckyb closed 7 years ago

stuckyb commented 7 years ago

Optimize OntoPilot's performance when it is used in pre-reasoning pipelines.

The current benchmark, using the test data from @jdeck88 in https://github.com/PlantPhenoOntology/ppo_pre_reasoner, is 272 seconds.

I have at least two changes in mind:

1. Add an option to skip consistency checking.
2. Execute axiom cleanup algorithms based on the kinds of pre-reasoning that were performed.

stuckyb commented 7 years ago

After some testing, I now realize that option 1 won't help much, at least not for reasoning over the ABox. It appears that if you do not explicitly check an ontology for consistency, the reasoner performs the check anyway as part of inferencing, so skipping the explicit check saves almost no time. This is at least true with HermiT and individual type inferences.
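To illustrate the observation, here is a toy model in Python. It is not HermiT's or OntoPilot's actual code; `ToyReasoner` and all of its methods are hypothetical. It only assumes that a reasoner runs its expensive consistency check lazily, on the first call that needs it, and caches the result.

```python
class ToyReasoner:
    """Hypothetical stand-in for a reasoner; not a real reasoner API."""

    def __init__(self):
        self._consistent = None   # cached result of the consistency check
        self.expensive_calls = 0  # counts how often the costly check runs

    def _check_consistency(self):
        # Stands in for the expensive work a real reasoner would do.
        self.expensive_calls += 1
        return True

    def is_consistent(self):
        # Run the costly check once, then reuse the cached answer.
        if self._consistent is None:
            self._consistent = self._check_consistency()
        return self._consistent

    def infer_types(self, individual):
        # Type inference needs a consistent ontology, so it triggers
        # the check itself if nobody asked for it explicitly.
        if not self.is_consistent():
            raise ValueError("ontology is inconsistent")
        return {"Thing"}


# With an explicit consistency check first.
explicit = ToyReasoner()
explicit.is_consistent()
explicit.infer_types("plant_1")

# Skipping the explicit check.
skipped = ToyReasoner()
skipped.infer_types("plant_1")

# The expensive check runs exactly once in both cases.
print(explicit.expensive_calls, skipped.expensive_calls)
```

Under that assumption, the explicit check only moves the cost earlier; it does not add to it, so removing it saves nothing.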

stuckyb commented 7 years ago

I implemented suggested change 2 by having OntoPilot do subclass axiom redundancy cleanup only if class subsumption inferencing was performed. Using @jdeck88's test data, this cut the total inference time from 272 seconds to about 119 seconds (roughly 44% of the original time).
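A minimal sketch of the idea in Python, for illustration only. It is not OntoPilot's implementation: the function names, the `"subclasses"` label, and the representation of axioms as `(subclass, superclass)` string pairs are all assumptions, and the cleanup assumes an acyclic class hierarchy.

```python
from typing import Iterable, Set, Tuple

SubclassAxiom = Tuple[str, str]  # (subclass, superclass); hypothetical encoding


def remove_redundant_subclass_axioms(
    axioms: Iterable[SubclassAxiom],
) -> Set[SubclassAxiom]:
    """Drop (a, c) when c is still reachable from a via other axioms.

    Assumes the hierarchy is acyclic (no equivalent-class cycles).
    """
    axioms = set(axioms)
    parents = {}
    for sub, sup in axioms:
        parents.setdefault(sub, set()).add(sup)

    def reachable_without_direct_edge(sub: str, sup: str) -> bool:
        # Depth-first search from sub to sup that ignores the edge sub -> sup.
        stack = [p for p in parents.get(sub, ()) if p != sup]
        seen = set()
        while stack:
            node = stack.pop()
            if node == sup:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(parents.get(node, ()))
        return False

    return {(a, c) for (a, c) in axioms if not reachable_without_direct_edge(a, c)}


def clean_up(
    axioms: Iterable[SubclassAxiom], inference_types: Set[str]
) -> Set[SubclassAxiom]:
    """Run the subclass cleanup only if subsumption inferencing was performed."""
    if "subclasses" in inference_types:
        return remove_redundant_subclass_axioms(axioms)
    # No new subclass axioms were inferred, so skip the costly pass.
    return set(axioms)


axioms = {("A", "B"), ("B", "C"), ("A", "C")}
print(sorted(clean_up(axioms, {"subclasses"})))  # [('A', 'B'), ('B', 'C')]
print(sorted(clean_up(axioms, {"types"})))       # all three axioms kept
```

The saving comes from the second branch: when only individual types (or other non-subsumption inferences) were materialized, the redundancy pass has nothing new to remove, so it is skipped entirely.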

stuckyb commented 7 years ago

I'm considering this done. With the changes above and additional features such as pre-reasoning inverse property assertions, OntoPilot's reasoning performance is now fast and suitable for ingesting large data sets.