The-Sequence-Ontology / MSO

Molecular Sequence Ontology

slow classification through ROBOT #9

Open mikebada opened 5 years ago

mikebada commented 5 years ago

Hi @cmungall -- @msinclair2 and I have been trying to figure out why classification invoked through ROBOT is so much slower than classification invoked, e.g., within Protege. For example, MSO-SO_merged.owl, in the repository, takes 4-5 minutes to classify with HermiT within Protege but a whopping ~50 minutes to classify with HermiT invoked through ROBOT. We've been asking around, including Rebecca Tauber, but no one seems to be able to come up with much of an explanation. Have you observed this, and do you have any ideas?

Thanks, Mike
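For anyone who wants to reproduce the comparison on the ROBOT side, here is a minimal timing sketch. It assumes `robot` is on the PATH and that MSO-SO_merged.owl is in the working directory. The output file names are hypothetical. ELK is included only as a point of comparison; it typically ignores constructs it does not support, so its results may be incomplete for this ontology.

```python
import os
import shutil
import subprocess
import time


def build_reason_command(reasoner, input_path, output_path):
    """Build the argument list for a single `robot reason` call."""
    return [
        "robot", "reason",
        "--reasoner", reasoner,
        "--input", input_path,
        "--output", output_path,
    ]


def time_reasoning(reasoner, input_path, output_path):
    """Run ROBOT once and return the wall-clock time in seconds."""
    command = build_reason_command(reasoner, input_path, output_path)
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    ontology = "MSO-SO_merged.owl"  # file named in this issue
    if shutil.which("robot") is None or not os.path.exists(ontology):
        # Assumption not met: skip instead of failing.
        print("robot or the ontology file was not found; nothing was run")
    else:
        for reasoner in ("hermit", "ELK"):
            # Output names are hypothetical, chosen for this sketch.
            seconds = time_reasoning(reasoner, ontology, f"reasoned_{reasoner}.owl")
            print(f"{reasoner}: {seconds:.1f} s")
```

This measures whole-process wall-clock time, so it includes JVM startup, loading, and writing the output, not just classification.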

msinclair2 commented 5 years ago

@cmungall It takes longer than 4-5 minutes in Protege but the time is still much less than ROBOT. Nevertheless, it's too long using either program.

Take SO_unreasoned.owl, import MSO_unreasoned.owl, and then reason. Why so long and is there anything we can do?

Thanks in advance.

cmungall commented 5 years ago

I believe this was answered on the robot tracker, or somewhere. Dynamic vs. Materialized?

msinclair2 commented 5 years ago

Yes but I don't really know what that means.

What is Protege doing under the hood vs. ROBOT? Aren't both just calling the reasoner through the OWL API? Does it depend on configuration settings? The types of inferences precomputed?

What can we do to reduce reasoning time in general, or is that not possible given the number and type of axioms we have?

cmungall commented 5 years ago

Let's follow up on the original ticket, rather than retreading here: https://github.com/ontodev/robot/issues/368

msinclair2 commented 5 years ago

I asked at ontodev/robot#368 if there is a way to reduce reasoning time for SO_unreasoned.owl with MSO_unreasoned.owl imported (the merged file). @cmungall said:

you know my opinions here: staying in EL is good for machine reasoning and good for human reasoning

And @dosumis said:

judging from your screenshot, the patterns themselves don't look very scalable and I wonder if they are capturing what you really need.

I would expect to see universal restriction combined with existential in a closure pattern, otherwise the genus (sequence molecular entity extent) doesn't have to have any parts (of the specified kind) in order to fulfil the restriction. Universal restriction doesn't really make sense with a transitive object property. In this case, all parts of parts would need to be of the specified type. Only, OR and NOT are all outside of EL, so scaling will be worst-case intractable.
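To make the point about EL concrete, here is a toy sketch that flags the three constructors named above (only, or, not) in a class expression. The nested-tuple representation and the names `extent`, `has_part` and `nucleotide` are hypothetical stand-ins, not MSO terms. It is not a full OWL 2 EL profile check, since EL excludes other constructs as well, such as cardinality restrictions.

```python
# Constructors named in the comment above as lying outside EL.
NON_EL = {"only", "or", "not"}


def non_el_constructors(expr):
    """Return the set of non-EL constructors used anywhere in expr.

    expr is either a class name (str) or a tuple:
      ("some", property, filler), ("only", property, filler),
      ("and", a, b, ...), ("or", a, b, ...), ("not", a)
    """
    if isinstance(expr, str):
        return set()  # a named class uses no constructor
    op = expr[0]
    found = {op} if op in NON_EL else set()
    # For restrictions, element 1 is the property name, not a class expression.
    operands = expr[2:] if op in ("some", "only") else expr[1:]
    for operand in operands:
        found |= non_el_constructors(operand)
    return found


# Existential restriction only: stays within the checked constructors.
existential = ("and", "extent", ("some", "has_part", "nucleotide"))

# Closure pattern: existential plus universal restriction on the same property.
closure = (
    "and",
    "extent",
    ("some", "has_part", "nucleotide"),
    ("only", "has_part", "nucleotide"),
)

print(non_el_constructors(existential))
print(non_el_constructors(closure))
```

The closure pattern is flagged because of its universal restriction, which is why adding closure moves a definition outside what ELK can classify.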

I'll also report that @lschriml at DiseaseOntology has said that MSO has to classify properly with ELK before she decides to use it in DO.

Mike, I'd like to bring you back in here as far as the architecture of the ontology is concerned.

dosumis commented 5 years ago

For reference - here's the screenshot discussed:

image

dosumis commented 5 years ago

My points here are also made in the discussion on ticket #6. Looks like some productive discussion is happening there.

msinclair2 commented 5 years ago

Yes, Mike had done a bunch of work on the nucleotide extent classes. And when he did that, it did reduce reasoning time.

cmungall commented 5 years ago

On 14 Dec 2018, at 9:15, Michael Sinclair wrote:

I'll also report that @lschriml at DiseaseOntology has said that MSO has to classify properly with ELK before she decides to use it in DO.

Note that in OBO the release versions of all ontologies should be reasoned in advance, so it shouldn't matter if the profile of an import

profile of importing.

However, there is some potential issue here with the base module strategy. This is moot for MSO as it doesn't provide a base. And I think here just pre-classify to axioms not entailable with >EL. [sorry if this is a bit abstract; you don't need to worry here, but flagging for others]

I am a very strong supporter of DL axioms where they are required. But I feel they are often deployed where they don't need to be, and the end result can be worse as it's harder for humans to reason about intent. Where they do need to be deployed we can often isolate them into modules and pre-reason over these (EL-shunt pattern, I will write this up at some stage). Note that GO, RO, Uberon and other ontologies make some use of non-EL axioms and use this strategy.