EnvironmentOntology / gaz

An open source gazetteer constructed on ontological principles

GAZ is inconsistent #35

Open matentzn opened 3 years ago

matentzn commented 3 years ago

In an attempt to set up a uniform set of quality control checks across OBO ontologies, we noticed that GAZ is currently inconsistent. Due to its size, it's a bit hard to run the reasoner in Protégé, so here is the explanation for the inconsistency:

Thing SubClassOf Nothing

Reason for inconsistency:

grassland area is ultimately classified as an immaterial entity:

undersea feature is ultimately classified as a material entity

Nothing can be both material and immaterial

Tualatin Mountains are instances of both of the above.

This seems to come from a bad interaction between ENVO and GAZ. Removing the type assertions on Tualatin Mountains may not solve the deeper modelling issue, but it could at least drastically reduce the severity of the error.
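To make the clash concrete, here is a minimal Python sketch of the reasoning described above. It is a toy model, not the actual GAZ/ENVO axioms: the class names are taken from the issue text, the hierarchy is collapsed to one step, and the helper names `ancestors` and `clashes` are made up for illustration.

```python
# Toy model of the clash; class names mirror the issue text, the hierarchy is simplified.
SUPERCLASS = {
    "grassland area": "immaterial entity",
    "undersea feature": "material entity",
}
# Pairs of classes declared disjoint (nothing can be an instance of both).
DISJOINT = {frozenset({"material entity", "immaterial entity"})}
# Asserted types per individual.
TYPES = {"Tualatin Mountains": {"grassland area", "undersea feature"}}


def ancestors(cls):
    """Return cls together with all of its (transitive) superclasses."""
    seen = set()
    while cls is not None and cls not in seen:
        seen.add(cls)
        cls = SUPERCLASS.get(cls)
    return seen


def clashes(individual, types=TYPES):
    """List the disjoint class pairs that the individual is inferred to belong to."""
    inferred = set()
    for cls in types.get(individual, ()):
        inferred |= ancestors(cls)
    return [sorted(pair) for pair in DISJOINT if pair <= inferred]


print(clashes("Tualatin Mountains"))
# With the type assertions dropped, as proposed, this individual no longer clashes.
print(clashes("Tualatin Mountains", types={}))
```

A single clashing individual is enough to make the whole ontology inconsistent, which is why removing its type assertions reduces the severity without fixing the underlying modelling.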

cmungall commented 3 years ago

Well this is incoherent regardless of material vs immaterial. There are no undersea grassland areas.

GAZ should be marked obsolete or at least inactive. We have various plans to fix it but no resources. We can't in good conscience be recommending people use it over Wikidata. It needs a health warning on the README and the OBO page.

cmungall commented 3 years ago

I bet anything with "range" in the definition is similarly misclassified.

matentzn commented 3 years ago

Ouuf, sorry, you are right; I was a bit premature in only looking at 1 explanation.

The results using 3 explanations suggest that more individuals are causing the inconsistency:

Tualatin Mountains

Municipality of Heinola (1)

Municipality of Heinola (2)

Axiom Impact

Axioms used 3 times

Axioms used 2 times

Axioms used 1 times

Ontologies used:
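One way to read the "Axiom Impact" grouping above is as a count of how many explanations each axiom appears in: an axiom that occurs in every explanation is the most promising one to fix or remove. The following Python sketch shows that counting under this assumption. The axiom labels `A1` to `A5` are placeholders, not axioms from GAZ, and `axiom_impact` is a hypothetical helper.

```python
from collections import Counter

# Hypothetical explanations: each is the set of axioms one explanation relies on.
explanations = [
    {"A1", "A2", "A3"},
    {"A1", "A2", "A4"},
    {"A1", "A5"},
]


def axiom_impact(explanations):
    """Group axioms by the number of explanations they occur in, highest first."""
    counts = Counter(axiom for expl in explanations for axiom in expl)
    grouped = {}
    for axiom, n in counts.items():
        grouped.setdefault(n, []).append(axiom)
    return {n: sorted(grouped[n]) for n in sorted(grouped, reverse=True)}


print(axiom_impact(explanations))
```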

Who are the main stakeholders of GAZ? Could they at least contemplate migrating the useful parts of GAZ to Wikidata, and then maybe replacing it with an "OBO Wikidata module" or something like that?

cmungall commented 3 years ago

I think obsoletion is too strong. GAZ is mentioned in standards like MIxS. If we obsolete it, it will be unavailable in some browsers, confusing people. (although behavior is not consistent across browsers here)

I think that MIxS6 should recommend Wikidata over GAZ. I made a ticket: https://github.com/GenomicsStandardsConsortium/mixs/issues/116

I don't think we should obsolete before MIxS ceases to recommend it. More generally, I think we need a policy analogous to term obsoletion that is user-focused: an ontology O SHOULD NOT be obsoleted IF there exists a non-obsolete standard S that references O. Maybe FAIRsharing can help?
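The proposed rule is simple enough to state as a check. The Python sketch below is only an illustration of that rule: the `Standard` record and the `may_obsolete` function are hypothetical, and the example registry contains just the one relationship mentioned in this thread (MIxS referencing GAZ).

```python
from dataclasses import dataclass, field


@dataclass
class Standard:
    """A standard that may reference ontologies (hypothetical record type)."""
    name: str
    obsolete: bool
    references: set = field(default_factory=set)


def may_obsolete(ontology, standards):
    """Return True only if no non-obsolete standard references the ontology."""
    return not any(
        ontology in s.references and not s.obsolete for s in standards
    )


# Illustrative registry: MIxS is active and references GAZ.
standards = [Standard("MIxS", obsolete=False, references={"GAZ"})]
print(may_obsolete("GAZ", standards))  # False: MIxS still references GAZ

# If the referencing standard were itself obsolete, the rule would no longer block.
retired = [Standard("MIxS", obsolete=True, references={"GAZ"})]
print(may_obsolete("GAZ", retired))  # True
```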

However, GAZ should absolutely be marked inactive in OBO. It currently says it is active which is definitely not true!

One day I hope to enrich Wikidata with GAZ https://github.com/cmungall/environments2wikidata... but meanwhile we should still recommend Wikidata over GAZ.

matentzn commented 3 years ago

Ok! Sounds good! Happy with inactive as a compromise for now, but I would like to make it explicit somewhere that active ontologies in the OBO Foundry should be logically consistent. It's just so annoying that I can't process it. Would you at the very least agree if I made a PR to get rid of all the disjointness constraints?
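As a rough illustration of what dropping the disjointness constraints would mean, here is a Python sketch that filters `DisjointClasses` axioms out of a list of axioms written in OWL functional syntax. It assumes one axiom per string and uses made-up identifiers; it is not how the real GAZ file would be processed, and `drop_disjointness` is a hypothetical helper.

```python
def drop_disjointness(axioms):
    """Keep every axiom except DisjointClasses declarations."""
    return [ax for ax in axioms if not ax.strip().startswith("DisjointClasses(")]


# Made-up axioms in OWL functional syntax, one axiom per string.
axioms = [
    "SubClassOf(:grassland_area :immaterial_entity)",
    "SubClassOf(:undersea_feature :material_entity)",
    "DisjointClasses(:material_entity :immaterial_entity)",
    "ClassAssertion(:grassland_area :Tualatin_Mountains)",
    "ClassAssertion(:undersea_feature :Tualatin_Mountains)",
]

for axiom in drop_disjointness(axioms):
    print(axiom)
```

Both type assertions on the individual survive the filter, so the result would be logically consistent while still claiming that the Tualatin Mountains are an undersea grassland area. This makes the ontology processable but leaves the modelling error in place.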

lschriml commented 3 years ago

Hello Nico, unfortunately there is no active development on GAZ, although a number of us did work on it together a couple of years ago. The GSC will not be removing the GAZ specification for the MIxS standard, and Wikidata is not interested in importing GAZ. Instead, I would like to make the default file a country-only file. Becky @beckyjackson made a start on this, creating country subsets of GAZ.

I would be interested in collaborating on this with you Nico, to get GAZ up and running again. Let's have a chat and discuss plans.

Cheers, Lynn
