OBOFoundry / OBOFoundry.github.io

Metadata and website for the Open Bio Ontologies Foundry Ontology Registry
http://obofoundry.org

Principle #12 "Naming conventions" #958

Open nataled opened 5 years ago

nataled commented 5 years ago

Current wording:

NOTE

The content of this page is scheduled to be reviewed. Improved wording will be posted as it becomes available.

Details

For full details, see this paper: http://www.biomedcentral.com/1471-2105/10/125

Briefly, some important things to remember:

nataled commented 5 years ago

Wording suggestions and discussion:

Also see http://msi-ontology.sourceforge.net/namingconventions/Naming_Conv_v14.htm

Sez Melissa: Naming is complicated as per the referenced documentation. Do you think it is possible to provide a high level summary here so that we can use this for the purposes of review? Basically a short list of requirements and consistency checks. We agreed.

Name: Naming conventions

Summary and Purpose: the names for classes in an ontology should be intelligible to scientists and be amenable to natural language processing.

Implementation:

[todo: put MUST and SHOULD in caps.]

Here is a list of the headline points from the above links.

Names of classes should as far as possible reflect usage in ordinary scientific body text. They are not titles. This means that:

Words within them should only be capitalized if they are proper names (for example Golgi).

Multiword names should have the words separated by spaces (not underscores or CamelCase).

Non-deprecated class names should not contain information relevant only to the ontology, for example “protocol (definition incomplete)”. Reasons for allowing this for deprecated terms include avoiding clashes between classes which would otherwise have the same name, and providing a readily-usable audit trail.

Short, commonly-used names should appear as they do in ordinary running text, in the normal order of words, and should not contain commas, hence “hoses for the use of firemen”, not “hoses, firemen, for the use of”. Commas may be used where they are part of, say, chemical names, or in long complicated terms (more than four or five words) such as in the Gene Ontology.

Names should be singular: “plant cell” rather than “plant cells”. If you are referring to a collective or aggregate, then say so. Plurals may be added as synonyms.
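
For review purposes, the spacing point above (spaces rather than underscores or CamelCase) lends itself to a mechanical check. The following is only a rough sketch in Python: the function name `format_problems` and the CamelCase heuristic are assumptions made for illustration and are not part of any existing OBO tooling. The heuristic looks for two or more lowercase letters followed by a capitalized word start, so that labels such as "mRNA" are not flagged; it will still miss or misjudge some cases.

```python
import re

# Hypothetical heuristic: 2+ lowercase letters, then an uppercase letter that
# starts a new lowercase run (e.g. "PlantCell", "plantCell"). "mRNA" and "pH"
# do not match.
CAMEL_CASE = re.compile(r"[a-z]{2,}[A-Z][a-z]")


def format_problems(label: str) -> list[str]:
    """Return the spacing problems found in a class label (empty if none)."""
    problems = []
    if "_" in label:
        problems.append("underscore")
    if CAMEL_CASE.search(label):
        problems.append("CamelCase")
    return problems
```

With this sketch, `format_problems("plant_cell")` returns `["underscore"]`, `format_problems("PlantCell")` returns `["CamelCase"]`, and `format_problems("plant cell")` returns an empty list.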

Names of classes should be short and memorable, for example “wall of esophagus” rather than “the wall of the esophagus”.

Non-obvious abbreviations should be spelt out, for example “nuclear magnetic resonance” rather than “NMR”. DNA and laser are clear exceptions to this. Abbreviations should be added as synonyms.
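
A possible consistency check for the abbreviation point, sketched in Python. It flags all-caps tokens that are not on an allow-list, so a reviewer can decide whether each one is "non-obvious". The allow-list contents and the name `unexpanded_abbreviations` are assumptions for illustration; which abbreviations count as obvious is a judgment call this code cannot make.

```python
import re

# Assumption: only DNA is allow-listed here; a real review would maintain
# its own list. "laser" is lowercase in running text, so it is never flagged.
ALLOWED_ABBREVIATIONS = {"DNA"}


def unexpanded_abbreviations(label: str) -> list[str]:
    """Return all-caps tokens (two or more letters) not on the allow-list."""
    tokens = re.findall(r"\b[A-Z]{2,}\b", label)
    return [token for token in tokens if token not in ALLOWED_ABBREVIATIONS]
```

For example, `unexpanded_abbreviations("NMR magnet")` returns `["NMR"]`, while `unexpanded_abbreviations("DNA binding")` returns an empty list.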

MERGE WITH PROPOSED ‘UNIQUE LABEL’ PRINCIPLE: Names should be univocal within a given ontology, for example “nuclear magnetic resonance magnet” rather than “magnet”.
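
One mechanical reading of this point is that no two classes in the same ontology share a label. That is narrower than "univocal" (it cannot tell that “magnet” is too vague), but it is easy to check. A minimal Python sketch, assuming labels are available as a mapping from term ID to label; the function name and the `EX:` identifiers are hypothetical, and comparing labels case-insensitively is an assumption as well.

```python
from collections import defaultdict


def duplicate_labels(labels_by_id: dict[str, str]) -> dict[str, list[str]]:
    """Map each label used by more than one term to the IDs that use it."""
    ids_by_label = defaultdict(list)
    for term_id, label in labels_by_id.items():
        # Assumption: labels differing only in case or outer whitespace clash.
        ids_by_label[label.strip().lower()].append(term_id)
    return {label: ids for label, ids in ids_by_label.items() if len(ids) > 1}


# Made-up example data.
example = {
    "EX:1": "magnet",
    "EX:2": "Magnet",
    "EX:3": "nuclear magnetic resonance magnet",
}
clashes = duplicate_labels(example)
```

Here `clashes` is `{"magnet": ["EX:1", "EX:2"]}`.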

Discussion point: what about names of relations?

cmungall commented 5 years ago

I do not think that linked MSI doc is useful. Way too much detail. On scanning it I see stuff that is not irrelevant that I disagree with. E.g. out-of-date recommendations to avoid multiple inheritance. my opinions on this.

cmungall commented 5 years ago

"be amenable to natural language processing" --> don't necessarily agree with this. I think the key point is that it should reflect normal scientific terminology. NLP should follow naturally from that.

cmungall commented 5 years ago

I think the text in the 2nd comment is largely redundant. The existing wording is quite clear and succinct IMHO. I see a few useful additional suggestions in the text that can be added as new bullets:

nataled commented 5 years ago

Add bullet point stating there should be no additional leading or trailing whitespace; see #257.
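
The whitespace point is simple to check automatically. A minimal Python sketch, where the function name `has_stray_whitespace` is made up for illustration:

```python
def has_stray_whitespace(label: str) -> bool:
    """True if the label has leading or trailing whitespace."""
    return label != label.strip()
```

So `has_stray_whitespace(" plant cell")` is true and `has_stray_whitespace("plant cell")` is false. Note that `str.strip()` with no arguments also removes tabs and newlines, which seems to be the intent here.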

nlharris commented 4 years ago

what's the status of this principle?