oborel / obo-relations

RO is an ontology of relations for use with biological ontologies
http://oborel.github.io/

NTR "is matched small molecular pair with" #696

Closed cthoyt closed 1 year ago

cthoyt commented 1 year ago

Chemicals A and B are a matched (small) molecular pair (MMP) if their chemical structures differ by a single, relatively small, well-defined structural modification. In the following example from Wikipedia, the two molecules are an MMP, with the yellow part being the variable region.

A depiction of a matched molecular pair

Why this is useful

Tools like mmpdb can be used to generate MMP relationships on datasets like ChEMBL in bulk. These can be included in knowledge graphs useful for drug discovery, as MMPs are a discrete alternative to including relationships based on chemical similarity (e.g., cosine similarity) using some featurization (e.g., MACCS keys, ECFP-4) and cutoff.
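To make the contrast concrete, here is a minimal sketch in Python. Everything in it is illustrative: the molecule names, the toy fingerprints (sets of "on" bits), and the list of MMP pairs are hypothetical stand-ins for what a featurizer or an MMP tool would produce, and no real mmpdb API is used. It shows that similarity edges appear or disappear depending on the chosen cutoff, while MMP edges are discrete.

```python
from itertools import combinations
from math import sqrt

# Hypothetical binary fingerprints: each molecule maps to its set of "on" bits.
fingerprints = {
    "mol_A": {1, 2, 3, 4},
    "mol_B": {1, 2, 3, 5},
    "mol_C": {7, 8, 9},
}


def cosine(a, b):
    """Cosine similarity between two binary fingerprints given as sets of on-bits."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def similarity_edges(fps, cutoff):
    """Return (x, 'similar to', y, score) for every pair scoring at or above the cutoff."""
    edges = []
    for x, y in combinations(sorted(fps), 2):
        score = cosine(fps[x], fps[y])
        if score >= cutoff:
            edges.append((x, "similar to", y, round(score, 3)))
    return edges


# Hypothetical MMP pairs, standing in for the bulk output of an MMP tool.
mmp_pairs = [("mol_A", "mol_B")]


def mmp_edges(pairs):
    """Return one discrete edge per matched pair; no score or cutoff is involved."""
    return [(x, "is matched small molecular pair with", y) for x, y in pairs]


sim_loose = similarity_edges(fingerprints, cutoff=0.7)   # keeps the mol_A/mol_B edge
sim_strict = similarity_edges(fingerprints, cutoff=0.8)  # drops it
mmp = mmp_edges(mmp_pairs)
```

The same pair of molecules is connected at a cutoff of 0.7 and disconnected at 0.8, whereas the MMP edge does not depend on any tunable threshold.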

Properties of this relationship

Summary of Discussion

On the RO call on March 28th, 2023, there was the following feedback:

  1. The label might be misleading in contexts where "molecule" can refer to peptides or other macromolecular entities like proteins. While it's standard in the cheminformatics domain that MMP refers to small molecules, it is helpful to update the label to be more specific. This was done in 1daf181. However, we can also keep in mind that this relation falls in the "chemical relationship" hierarchy and is therefore implicitly only for small molecules.
  2. There was some discussion about additional metadata models that might accompany a relationship in a KG like "A is an MMP with B" such as "A and B share the parent structure C".

    These could appear as annotations in an ontology setting or as edge metadata in a KG scenario. For example, "A is an MMP with B" implies that there is some C such that A RO:0018040 (has parent hydride) C and B RO:0018040 C. This relation is part of the proposal in #698.
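As a rough sketch of the edge-metadata idea in point 2, the snippet below stores the shared parent structure C on the MMP edge and expands it into the triples it implies. The `MMPEdge` class, the molecule names, and the use of a plain label for the MMP predicate are assumptions made for illustration; only the CURIE `RO:0018040` is taken from the discussion above.

```python
from dataclasses import dataclass

# Label of the relation requested in this issue (used here as a plain string).
MMP = "is matched small molecular pair with"
# Parent relation as cited in the discussion above.
HAS_PARENT = "RO:0018040"


@dataclass(frozen=True)
class MMPEdge:
    """Hypothetical KG edge: an MMP relationship annotated with the shared parent."""

    subject: str
    object: str
    shared_parent: str  # edge metadata: the structure C shared by both molecules


def implied_triples(edge):
    """Expand an annotated MMP edge into the plain triples it implies."""
    return [
        (edge.subject, MMP, edge.object),
        (edge.subject, HAS_PARENT, edge.shared_parent),
        (edge.object, HAS_PARENT, edge.shared_parent),
    ]


def shares_parent(triples, a, b):
    """Check that a and b have at least one parent structure in common."""

    def parents(x):
        return {o for s, p, o in triples if s == x and p == HAS_PARENT}

    return bool(parents(a) & parents(b))


edge = MMPEdge(subject="mol_A", object="mol_B", shared_parent="mol_C")
triples = implied_triples(edge)
```

Note that the implication only runs one way in this sketch: an MMP edge yields a shared parent, but two molecules sharing a parent are not necessarily an MMP.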

wdduncan commented 1 year ago

Suggestion on RO call: the label may need some adjusting to make it less ambiguous.

maybe label as 'matched chemo-molecular-pair'?

might need to consider how information about the difference between the two molecules is captured.

handemcginty commented 1 year ago

Just to clarify, we talked about whether we need to look at some modeling-related questions regarding the differences in molecular structure and how it could be captured.

cthoyt commented 1 year ago

> Just to clarify, we talked about whether we need to look at some modeling-related questions regarding the differences in molecular structure and how it could be captured.

@handemcginty I think if you want to make a knowledge graph, you can use this MMP relationship and potentially qualify it with a reference to the functional parent that they share. We could also in the future go into a direction where there is a relationship between these three things explicitly, but for now I want to keep this PR focused on building up KGs. It's not usually the job of RO to define the data model, though. Usually that shows up in something like LinkML.

For me, the main advantage is using MMPs as an alternative to similarity-based edges (with some cutoff) between small molecules.