Closed kaiiam closed 2 years ago
Porting this over from @cmungall
@kaiiam - can you provide more details? (IMHO it's good to do this via tickets on the respective ontology tracker)
It would be good to get a definitive answer with scientific justification for the choice. We need to clearly document this across multiple OBO ontologies that need to represent things at the level of elements (@diatomsRcool @matentzn @bpeters42)
As it happens, I think the molecular entity choice is a good one, because it groups ions, and sometimes these forms are more physiologically relevant.
However, it would be good if the choice were made scientifically rather than just 'this seems to give us the inferences we need'
This part of chebi has always confused me: we have a has-part between the molentity and the atom. This implies to me that the molentity is a molecule with multiple atoms, but this is not what we want here... see
@cmungall Here is my exchange with Adnan Malik (posted with permission)
ME:
Hello CHEBI curation team,
My name is Kai Blumberg. I’m a developer for the OBO ontology ENVO. We are making use of CHEBI within our ontology to create terms representing concentrations of chemical entities within environmental materials, e.g., concentration of ammonium in water.
I have a question regarding the intended use of the CHEBI molecular entity and atom hierarchies, specifically in regard to the distinction between an atom, e.g., cadmium atom, and the elemental form of that same atom, e.g., elemental cadmium. Which term would be the correct one to use when referring to a measurement of Cd? I would suspect elemental cadmium would be more appropriate, but I’m not sure. In this and similar cases, the elemental terms do not always have annotation properties such as molecular weight, whereas the atom terms do. Do terms like elemental cadmium represent a portion of Cd atoms which don’t have a fixed formula, net charge, average mass, etc., and therefore don't have annotation properties?
Much appreciated if you could help us sort this out. Cheers, Kai
Adnan
Hi Kai, Thank you for your recent e-mail. I'm not quite sure what ENVO are measuring. I guess that the total amount of cadmium (in an organism, soil or water sample) is being measured. This may be present as elemental cadmium, or as one or more cadmium salts. But presumably whatever it is, the results are converted to elemental cadmium equivalent; otherwise, reporting 0.1 g/litre of cadmium iodide in one study and 0.1 g/litre of cadmium chloride in another study would be misleading, since iodide makes up a much greater proportion of the total mass than chloride does (and hence cadmium makes up a much smaller proportion).
So on that basis, I would suggest that ENVO use elemental cadmium (CHEBI:37249). I have added a structure, definition and some more information to the entry. However, you are right that a lot of the elements in ChEBI do not have properties (such as mass, monoisotopic mass etc.) associated with them. Let's take elemental carbon (CHEBI:33415) as an example: there are several different forms of elemental carbon, such as diamond (CHEBI:33417) or graphene (CHEBI:36973). The mass and monoisotopic mass of these different forms of carbon will vary, hence it would be misleading to assign a mass or monoisotopic mass to this entry. Best Regards, Adnan
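The "elemental cadmium equivalent" conversion described in the reply above can be sketched as a mass-fraction calculation. This is only an illustration: the atomic masses are rounded standard values, and the function and variable names are made up for this example rather than taken from ChEBI or ENVO.

```python
# Rounded standard atomic masses in g/mol (approximate values).
ATOMIC_MASS = {"Cd": 112.414, "I": 126.904, "Cl": 35.45}


def molar_mass(formula):
    """Molar mass of a compound given as {element: atom count}."""
    return sum(ATOMIC_MASS[el] * n for el, n in formula.items())


def element_equivalent(conc, formula, element):
    """Convert a compound concentration into the equivalent concentration of one element."""
    fraction = ATOMIC_MASS[element] * formula[element] / molar_mass(formula)
    return conc * fraction


# Hypothetical inputs matching the example in the email: 0.1 g/L of each salt.
cadmium_iodide = {"Cd": 1, "I": 2}
cadmium_chloride = {"Cd": 1, "Cl": 2}

cd_from_iodide = element_equivalent(0.1, cadmium_iodide, "Cd")
cd_from_chloride = element_equivalent(0.1, cadmium_chloride, "Cd")

print(f"{cd_from_iodide:.3f} g/L Cd from 0.1 g/L cadmium iodide")
print(f"{cd_from_chloride:.3f} g/L Cd from 0.1 g/L cadmium chloride")
```

With these rounded masses, the same 0.1 g/L of salt corresponds to roughly twice as much cadmium for the chloride as for the iodide, which is why the two raw figures are not directly comparable.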
ME:
Thank you very much Adnan for the clarification. As I understand it, the measurements we're trying to represent are the results of processes converting an element like Cd to its elemental form and not measuring the associated salts. My question was more in regard to the use of the atom vs the element term from CHEBI, but it sounds like elements from the molecular entity hierarchy are what we should be using.
Thank you also for responding about the CHEBI properties regarding elements, which as I suspected could take on different forms, hence not assigning masses.
Subsequent to this conversation, they updated elemental cadmium to include additional annotation properties such as average mass.
Let me know if you guys think we should commit to just using terms from the CHEBI molecular entity hierarchy.
I think it just depends on what we're talking about. Both are valid.
I guess there might be some cases where we might need to refer to atoms like in the case of stable isotope measurements e.g., carbon-14 atom
Yes, and I'm sure there will be more cases like this.
CHEBI's treatment of "molecular" is bizarre to me - molecules have two or more atoms, and an ion can be monoatomic.
As I understand it, the measurements we're trying to represent are the results of processes converting an element like Cd to its elemental form and not measuring the associated salts.
That's not the case - many (even most) measurement processes do not include a step to convert stuff into elemental forms, but you can calculate the mass (and thus concentration) from the concentrations of the molecules bearing the atom of interest.
Just my 2 cents.....from the ECTO perspective, the use cases we've encountered at the moment call for elemental, rather than atomic. I can see wanting to represent atoms when talking about molecular reactions, but we haven't encountered that need yet. HOWEVER when building the environmental qualities classes you may need to use the atom IF the analysis is measuring the concentration of the atom in seawater, for example.
The Arctic Data Center's use cases were primarily elemental analyses, so good catch. @mpsaloha might want to weigh in on this issue.
cadmium and friends:
note the ions are not connected to the molecular entity / atom branch. But the ion form may be more physiologically relevant?
@kaiiam you mentioned ammonia, here is ammonia in the context of nitrogen, so following the proposed pattern nitrogen-in-soil superClassOf ammonia-in-soil:
@cmungall isn't multiple inheritance under the ion and molecular entity hierarchies desirable? Ionic forms are physiologically relevant and often measured, e.g. concentration of ammonium in soil, where ammonium is subsumed under both the ion and molecular entity hierarchies, same with phosphate(3-).
so following the proposed pattern nitrogen-in-soil superClassOf ammonia-in-soil:
Wouldn't it be better to have nitrogen molecular entity in soil? Because that would actually be a super class of ammonia-in-soil, whereas nitrogen atom in soil wouldn't, again coming back to that has part relation between the atom and molecular entity.
isn't multiple inheritance under the ion and molecular entity hierarchies desirable?
Yes, MI is usually a good thing. Some ontologists have muddied the waters here.
Ionic forms are physiologically relevant and often measured, e.g. concentration of ammonium in soil, where ammonium is subsumed under both the ion and molecular entity hierarchies, same with phosphate(3-).
Yes, I didn't make my point clearly, this is desired
Wouldn't it be better to have nitrogen molecular entity in soil
Yes, I was proposing to use the NME class. I don't know what should be used in the label since NME is not very intuitive
@pbuttigieg
CHEBI's treatment of "molecular" is bizarre to me - molecules have two or more atoms, and an ion can be monoatomic.
I agree
Chiming in as Chris called me out: For OBI, we have avoided importing logical axioms from Chebi, because the inferences coming from that don't match our understanding of science / chemistry / physics. It would be great to have a way to use the wealth of knowledge in Chebi in a compatible way with OBO ontologies, and to coordinate developing that.
-- Bjoern Peters Professor La Jolla Institute for Allergy and Immunology 9420 Athena Circle La Jolla, CA 92037, USA Tel: 858/752-6914 Fax: 858/752-6987 http://www.liai.org/pages/faculty-peters
Porting over @pbuttigieg 's comment from here
Following our discussion of using atom vs molecular entity terms, I emailed the CHEBI team and they suggested we use the molecular entity terms. Hence I think we should be consistent and not use these atom terms here.
@kaiiam I'm not sold on this yet
@pbuttigieg Due to the parallel atom and molecular entity CHEBI hierarchies joined via a has part relation, I'm worried that if we allow ourselves to create concentration terms from both, it could hinder rather than help with interoperability. E.g. curators annotating their data would have to pick between terms like concentration of cadmium atom and concentration of cadmium molecular entity. If we allow ourselves to create both, then we won't be helping to make disparate datasets interoperable; instead we would be separating data by the curation choice of confusingly similar concentration terms. Hence my advocacy for us only using one CHEBI hierarchy when possible.
CHEBI's treatment of "molecular" is bizarre to me - molecules have two or more atoms, and an ion can be monoatomic.
Perhaps atom terms are preferable to molecular entity terms when describing measurements of elemental forms like in @diatomsRcool's, @stevenchong's, and the UA-SRC use-cases. I'm not sure what's "better" I just want us to be consistent.
The argument the other way is that since ions are subsumed under the molecular entity hierarchy, a recursive subclass query for all subclasses of a molecular entity term, e.g. aluminum molecular entity, would give us the ions as well:
However, due to inconsistencies in CHEBI, this doesn't always seem to hold, e.g. with cadmium where the linkages that would enable us to query and discover cadmium cations are missing.
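To make the query argument concrete, here is a minimal sketch of a recursive subclass lookup over a toy is_a graph. The edges below are invented to mirror the situation described above (ions linked under the aluminum molecular entity, but not under the cadmium one); they are an assumption for illustration and not a copy of the actual ChEBI hierarchy.

```python
# Toy is_a edges (child -> parents). Illustrative only, NOT real ChEBI content.
IS_A = {
    "aluminum cation": ["aluminum molecular entity", "ion"],
    "aluminum(3+)": ["aluminum cation"],
    "elemental aluminum": ["aluminum molecular entity"],
    "elemental cadmium": ["cadmium molecular entity"],
    # Assumed missing linkage: the cadmium cation sits only under "ion".
    "cadmium cation": ["ion"],
    "cadmium(2+)": ["cadmium cation"],
}


def descendants(term, is_a=IS_A):
    """Return every term that reaches `term` by following is_a edges upwards."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for child, parents in is_a.items():
            if current in parents and child not in found:
                found.add(child)
                frontier.append(child)
    return found


aluminum_terms = descendants("aluminum molecular entity")
cadmium_terms = descendants("cadmium molecular entity")

print(sorted(aluminum_terms))
print(sorted(cadmium_terms))
```

In this toy graph the aluminum query returns the cation and aluminum(3+), while the cadmium query returns only the elemental form, so a concentration term built on the cadmium molecular entity would not group cadmium ion measurements until the linkage is added.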
In conversation with @pbuttigieg and @wdduncan, we're thinking of favoring the use of the molecular entity branch over the atom branch for the majority of cases. Although both are correct, using terms from the molecular entity hierarchy seems more pragmatic, as it contains the various valence states and ions which people require; see the sulfur molecular entity, for example:
In contrast, the atom branch contains the various isotopes the element can have.
Thus a potential solution would seem to be:
1) Use terms from the molecular entity hierarchy for the majority of cases, and be as specific as possible, e.g. aluminum(3+) instead of aluminum cation.
2) Make use of the atom hierarchy when specifically describing measurements of isotopes, e.g. carbon-14 atom.
I would be cautious about using overly specific ion forms. Again it comes down to do you get the inferences you expect? You might want to write up some competency questions
for GO we use the pH7.3 form ion subtype, which represents "normal" physiology in a kind of metazoa-biased way. No idea if that translates to e.g. soil, seawater. But if you use the ME class it should give a lot of what you need
@cmungall
I would be cautious about using overly specific ion forms. Again it comes down to do you get the inferences you expect? You might want to write up some competency questions
If a method reports on the concentration of a specific ionic form, I think we should use that class regardless of what inferences come out. The ontology should be driven by reality. However, as @kaiiam and @ramonawalls will be using this branch for their work, they may want to explore this via competency questions more closely.
for GO we use the pH7.3 form ion subtype, which represents "normal" physiology in a kind of metazoa-biased way. No idea if that translates to e.g. soil, seawater.
It could translate to other environmental materials other than tissues/cells, but we wouldn't really know. What we would know is that a specific method is reporting on, e.g., the concentration of nitrate (or nitrite, or sulphate, etc). That's enough to build a corresponding class I think.
But if you use the ME class it should give a lot of what you need
Yes, we're not going to be able to resolve CHEBI's ambiguity on ME:
Any constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer etc., identifiable as a separately distinguishable entity.
Atom should be an ME, if that definition is right, and the "etc." isn't really helpful.
So, as @kaiiam notes above:
I think the latter is more "correct" but I'm not sure it's worth the complexity given the other issues with these hierarchies.
@stevenchong
The Arctic Data Center's use cases were primarily elemental analyses, so good catch. @mpsaloha might want to weigh in on this issue.
Please be sure about that - make sure the methods are actually reporting on a quantity of matter where all the atoms have the same atomic number.
@diatomsRcool
Just my 2 cents.....from the ECTO perspective, the use cases we've encountered at the moment call for elemental, rather than atomic.
This should work out then - the atomic forms are linked via parthood to the MEs.
I can see wanting to represent atoms when talking about molecular reactions, but we haven't encountered that need yet.
That should likely be in a separate ontology. I know there were some that dealt with rxns, but not sure if they're maintained.
HOWEVER when building the environmental qualities classes you may need to use the atom IF the analysis is measuring the concentration of the atom in seawater, for example.
This is the confusing bit - I think that saying "oxygen molecular entity" would cover most forms of atoms themselves (because of CHEBI's very inclusive ME def and the presence of the "elemental" classes under ME) and also allow a bit of fuzziness in case the method is measuring different forms of the chemical entity.
Do you know how ChEBI formally defines 'part of'? Sometimes, 'part of' is defined as being reflexive, so every oxygen atom is part of itself. If ChEBI defines part of in this way, I could see how atoms are subsumed under molecular entity.
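The effect of a reflexive reading can be sketched with a toy model. It assumes, purely for illustration, that the molecular entity class is defined as "has part some oxygen atom"; the individuals and the function name are hypothetical and say nothing about how ChEBI actually axiomatizes the relation.

```python
# Toy individuals (hypothetical names): two oxygen atoms and one O2 molecule.
ATOMS = {"o1", "o2"}
HAS_PART = {("o2_molecule", "o1"), ("o2_molecule", "o2")}


def has_oxygen_part(entities, has_part, reflexive):
    """Members of the class 'has part some oxygen atom' under the chosen reading."""
    pairs = set(has_part)
    if reflexive:
        # Reflexive reading: every entity is a part of itself.
        pairs |= {(e, e) for e in entities}
    return {whole for whole, part in pairs if part in ATOMS}


entities = ATOMS | {"o2_molecule"}
proper_reading = has_oxygen_part(entities, HAS_PART, reflexive=False)
reflexive_reading = has_oxygen_part(entities, HAS_PART, reflexive=True)

print(sorted(proper_reading))
print(sorted(reflexive_reading))
```

Under the non-reflexive reading only the molecule qualifies; under the reflexive reading the atoms qualify as well, which is the scenario in which atoms would be subsumed under the molecular entity class.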
@wdduncan
Looking at oxygen molecular entity it looks like they use the BFO:'has part'.
Looking at oxygen atom, they don't use a 'part of' relation, as the 'has part' does the work.
So they're quite reasonably (pun) linked
@pbuttigieg I should have been more careful in my language. 'has part' is just the inverse of 'part of'. See here: http://www.ontobee.org/ontology/RO?iri=http://purl.obolibrary.org/obo/BFO_0000051
On the ontobee page, it only specifies that has part is transitive. The OWL may be different.
@wdduncan I'd post on their tracker with this question, cross-linking to this one.
@pbuttigieg
Done. See https://github.com/ebi-chebi/ChEBI/issues/3813
Any constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer etc., identifiable as a separately distinguishable entity. Atom should be an ME, if that definition is right, and the "etc." isn't really helpful.
FWIW, although CHEBI do not provide provenance for their definitions, this comes from IUPAC:
https://goldbook.iupac.org/terms/view/M03986
Although IUPAC is authoritative, this does not mean it is a good source of ontology definitions. I think IUPAC should be used to define CHEBI metaclasses, not classes. This class/metaclass confusion has persisted throughout chebi leading to problems such as the ones pointed out in this tracker. I have been pointing this out in the chebi tracker since 2007 to no avail.
This thread is so long already, I almost hate to add to it, but...
As I understand it, the measurements we're trying to represent are the results of processes converting an element like Cd to its elemental form and not measuring the associated salts.
That's not the case - many (even most) measurement processes do not include a step to convert stuff into elemental forms, but you can calculate the mass (and thus concentration) from the concentrations of the molecules bearing the atom of interest.
As far as I understand, that is exactly what the environmental scientists do when reporting metal concentrations. I think for metals as contaminants, it is fairly standard.
Overall, I am very happy with where this thread is converging. I have also asked a colleague from Dartmouth who is processing their environmental data to comment.
@ramonawalls
As far as I understand, that is exactly what the environmental scientists do when reporting metal concentrations. I think for metals as contaminants, it is fairly standard.
I didn't realise / missed that your use case is restricted to metals. I was referring to compounds in general.
However, I still think we should work with molecular entity, as the actual quality we're talking about may not inhere in (only) the elemental form of the metal in the soil/water/etc, even if the method of measurement converts things into elemental forms.
Overall, I am very happy with where this thread is converging. I have also asked a colleague from Dartmouth who is processing their environmental data to comment.
Cool, many thanks
xref to this commit from Chris's new chemistry-ontology, which he intended to tag to this issue.
@ramonawalls suggested I be really clear about the use of the CHEBI molecular entity and atom hierarchies, in regard to which would be the correct one to use when referring to measurements in our concentration terms. @cmungall mentioned it would be better to be consistent and just use one.
Hence I asked the CHEBI team and they responded saying we should use molecular entity terms (e.g. elemental cadmium) instead of atom terms (e.g. cadmium atom). As such I think we should be consistent when importing CHEBI terms for use in our DOSDPs.
I noticed this issue in some of the concentration terms @stevenchong and I had made to address #721, e.g. ENVO:3200027, which is set to include lanthanum atom.
I also noticed this in this ongoing PR @pbuttigieg is working on, where I notice the addition of terms like
ENVO:3100043,,CHEBI:27594,carbon atom,ENVO:00002149,sea water
Finally I also noticed the use of atom terms in the entity_attribute_location pattern, e.g. solubility of nitrogen atom in water.
Let me know if you guys think we should commit to just using terms from the CHEBI molecular entity hierarchy.