aehrc / snorocket

The Snorocket Description Logic classifier for EL++ with concrete domains support
Apache License 2.0

Reasoner taking a very long time to classify #7

Open davetrig opened 7 years ago

davetrig commented 7 years ago

Hi all, we've got an issue: we're classifying a client's extension of SNOMED CT, and the classify step is taking a very long time.

My process is: classify SCT and save the reasoner. Then, when classifying the extension, load the saved SCT reasoner, add the extension's axioms, and perform an incremental classification.

We've used this process on many other extensions of SCT and other ontologies, and performance is always excellent. However, in this case, classification is taking 30 to 60 minutes. The extension includes about 11,000 concepts, about 13,000 defining concepts, and about 8,700 defining roles. There do not appear to be any cycles in the ontology.

I'm wondering if you are aware of any modeling conditions that might cause classification to take so long? I'm completely stumped at this point, because I'm not sure what I need to be looking for.

Thanks, Dave

davetrig commented 7 years ago

I should also mention that there's plenty of heap space. The Java max heap size is set to 12 GB, and usage seems to top out at about 7-8 GB.

ametke commented 7 years ago

Hi Dave,

That seems way too long. We are not currently aware of any specific modelling conditions that might cause this, although we have seen some performance decrease in the presence of many equivalence axioms.

Have you tried doing a full classification of the extension without using incremental classification? It would be great if you could try that and let us know if it is also taking around the same time, so we can determine if it might be an issue with the incremental classification implementation.

Cheers,

Alejandro

lawley commented 7 years ago

Hi Dave,

To follow up further, can you clarify the 8,700 roles part? Unless I'm misunderstanding you, that seems like an unusually large number.

Michael

davetrig commented 7 years ago

I'll try a classification without the incremental as you suggest. Indeed, about 2,800 of the 11,000 concepts are "defined"; that is, they use equivalency axioms.

Is 8,700 defining roles a whole lot for 11,000 concepts? SCT has about 275k defining roles on about 425k concepts, right? It could be I'm just phrasing it differently. I'm referring to each individual role relating one concept to another - most concepts have multiple defining roles. As I mentioned, it's our client's model, so I can't be too sure. I do know that our old classifier finished in just a minute or two.
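To put the two figures on the same scale, here is a small sketch that computes defining roles per concept from the approximate counts quoted above. The class and method names (`RoleDensity`, `rolesPerConcept`) are made up for illustration, and the SCT counts are the rough numbers from this comment, not verified release statistics.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; not part of Snorocket.
class RoleDensity {

    // Average number of defining roles per concept.
    static double rolesPerConcept(long definingRoles, long concepts) {
        if (concepts <= 0) {
            throw new IllegalArgumentException("concepts must be positive");
        }
        return (double) definingRoles / concepts;
    }

    public static void main(String[] args) {
        // Approximate counts as quoted in this thread.
        double extension = rolesPerConcept(8_700, 11_000);
        double sct = rolesPerConcept(275_000, 425_000);

        // Locale.ROOT keeps the decimal separator a dot on every machine.
        System.out.printf(Locale.ROOT, "extension: %.2f roles per concept%n", extension); // 0.79
        System.out.printf(Locale.ROOT, "SCT: %.2f roles per concept%n", sct);             // 0.65
    }
}
```

By this measure the extension is in the same range as SCT itself (roughly 0.79 versus 0.65 roles per concept), assuming the counts above are right.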

If you'd like to look at it, I could try to get clearance to send you a copy of the reasoner with all the axioms loaded.

Thanks for your help, Dave

lawley commented 7 years ago

Ah yes, we're using different terminology :)

With a delta that big, incremental classification will not buy you anything in terms of speed; the initialisation process has quite a big overhead per extra concept/role.

Michael

davetrig commented 7 years ago

Hi guys, I did a full classification of SCT + extension, without the incremental classification as suggested.
reasoner.classify() completes in about 46 seconds, and reasoner.getClassifiedOntology() takes about 6 seconds.

When performing an incremental classification of the extension added to a previously classified reasoner containing SCT, reasoner.classify() takes about 7 seconds, but reasoner.getClassifiedOntology() takes anywhere from 30-60 minutes.

So, the problem does seem to be somewhere in the incremental classification implementation.
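For anyone wanting to reproduce per-call timings like these, a minimal sketch of a phase timer follows. `PhaseTimer`, `timeRun`, `timeCall` and `busyWork` are hypothetical names, not Snorocket API; the `busyWork` calls are stand-ins for where `reasoner.classify()` and `reasoner.getClassifiedOntology()` would go in a real run.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical helper, not part of Snorocket: records wall-clock time per named phase.
class PhaseTimer {
    private final Map<String, Long> elapsedNanos = new LinkedHashMap<>();

    // Times a phase that returns a value and passes the value through.
    <T> T timeCall(String phase, Supplier<T> work) {
        long start = System.nanoTime();
        try {
            return work.get();
        } finally {
            // Accumulate, so a phase that runs more than once is summed.
            elapsedNanos.merge(phase, System.nanoTime() - start, Long::sum);
        }
    }

    // Times a phase that returns nothing.
    void timeRun(String phase, Runnable work) {
        timeCall(phase, () -> { work.run(); return null; });
    }

    // Read-only view of the recorded timings, in first-seen phase order.
    Map<String, Long> elapsedNanos() {
        return Collections.unmodifiableMap(elapsedNanos);
    }

    // Stand-in workload so the sketch runs without a reasoner on the classpath.
    static long busyWork(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i % 7;
        }
        return sum;
    }

    public static void main(String[] args) {
        PhaseTimer timer = new PhaseTimer();
        // In a real run: timer.timeRun("classify", () -> reasoner.classify());
        timer.timeRun("classify", () -> busyWork(5_000_000));
        // In a real run: timer.timeCall("getClassifiedOntology", () -> reasoner.getClassifiedOntology());
        Long checksum = timer.timeCall("getClassifiedOntology", () -> busyWork(10_000_000));

        timer.elapsedNanos().forEach((phase, nanos) ->
                System.out.printf("%s: %d ms%n", phase, nanos / 1_000_000));
        System.out.println("stand-in result: " + checksum);
    }
}
```

Timing the two calls separately is what shows that the slow step is retrieving the classified ontology rather than the classification itself.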

I can provide the axioms that I'm using, if you have a suggested method of getting them to you.

Thanks, Dave