mapping-commons / mh_mapping_initiative

Repo to organise the mouse-human phenotype mapping initiative and reconcile resources.

Capture curation rules for MP-HP mapping decisions #14

Open matentzn opened 2 years ago

matentzn commented 2 years ago

Can we get a set of (1) rules (under what conditions do you apply a mapping relation) and (2) examples (can you give some examples for when you curate broad/narrow/close etc) for when you are curating:

Some notes from our meetings

sbello commented 2 years ago

Usage of SSSOM terms typically relies on text definition, placement, definition cross-references, and logical definitions to determine relationships.

  1. exact

    • entities referenced are the same even if the species is not
    • qualifiers in the definition are equivalent
    • example: decreased alertness (MP:0020224) and Decreased vigilance (HP:0032044)
  2. narrow

    • in essence the narrower term would be a direct child or grandchild of the other term
    • for MP to HP, we ignore that the HP term is limited to human in these cases
    • a common example is where the narrow term specifies a location that the parent does not
    • example: increased circulating gamma-glutamyl transferase level (MP:0031114) and Elevated gamma-glutamyltransferase level (HP:0030948)
  3. broad

    • in essence the broad term would be a direct parent or at most grandparent to the other term
    • a common example of this in MP to HP has been that the HP term has human-specific requirements for when the phenotype would be diagnosed that are not present in the MP term and are more restrictive than the common usage of the MP term
    • additional examples are the inverse of the narrow example where the broad term specifies a more general location or no location compared to the other term
    • example: abnormal vocalization (MP:0001529) and Dysphonia (HP:0001618)
  4. close

    • in essence the close term is almost the same as the other term but there are some slight differences
    • calling close vs related vs exact is tricky; for exact vs close, we went with the guideline that a pair is close when the two terms might turn out to be exact if you dug into individual annotations, but this is unclear from the ontologies alone
    • example: decreased circulating citrulline level (MP:0020837) and Low plasma citrulline (HP:0003572). Called close because it is unclear to us whether citrulline in circulation is always measured in plasma, whether measurements in whole blood might show a difference, and whether citrulline is found in blood components other than plasma. The expectation is that most of the time these are the same, but since there is potential for differences we chose close rather than exact
    • example: decreased hemoglobin content (MP:0002874) and Decreased hemoglobin concentration (HP:0020062) - is the distinction between content and concentration meaningful in this context?
  5. related

    • terms are similar but clearly not the same, likely to be closely related siblings in a shared ontology
    • example: decreased blood oxygen saturation level (MP:0031068) and Hypoxemia (HP:0012418). The MP term refers to oxygen bound to hemoglobin as a percentage of maximal capacity; this is clearly related to hypoxemia (low level of blood oxygen) but is not quite the same
    • example: increased anxiety-related response (MP:0001363) and Anxiety (HP:0000739). These are not exactly the same thing, but if you are looking for models of anxiety you would want to find annotations to anxiety-related response
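
The categories and examples above can be captured as structured records. The following is a minimal Python sketch, not the initiative's actual tooling: the `CuratedMapping` class, its field names, and the `to_tsv` helper are hypothetical. It keeps the category names exactly as used in this thread. Translating "narrow" and "broad" into SKOS-style predicates depends on which term is treated as the subject, so that step is deliberately left out.

```python
import csv
import io
from collections import Counter
from dataclasses import asdict, dataclass, fields

# Category names as used in this thread (assumption: these are curation
# labels, not predicate CURIEs).
CATEGORIES = ("exact", "narrow", "broad", "close", "related")


@dataclass(frozen=True)
class CuratedMapping:
    """One curated MP-to-HP decision (hypothetical record layout)."""
    mp_id: str
    mp_label: str
    category: str
    hp_id: str
    hp_label: str

    def __post_init__(self):
        # Reject anything outside the five categories discussed above.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


# The examples listed in the comment above, one record per example.
EXAMPLES = [
    CuratedMapping("MP:0020224", "decreased alertness",
                   "exact", "HP:0032044", "Decreased vigilance"),
    CuratedMapping("MP:0031114",
                   "increased circulating gamma-glutamyl transferase level",
                   "narrow", "HP:0030948",
                   "Elevated gamma-glutamyltransferase level"),
    CuratedMapping("MP:0001529", "abnormal vocalization",
                   "broad", "HP:0001618", "Dysphonia"),
    CuratedMapping("MP:0020837", "decreased circulating citrulline level",
                   "close", "HP:0003572", "Low plasma citrulline"),
    CuratedMapping("MP:0002874", "decreased hemoglobin content",
                   "close", "HP:0020062", "Decreased hemoglobin concentration"),
    CuratedMapping("MP:0031068", "decreased blood oxygen saturation level",
                   "related", "HP:0012418", "Hypoxemia"),
    CuratedMapping("MP:0001363", "increased anxiety-related response",
                   "related", "HP:0000739", "Anxiety"),
]


def to_tsv(mappings):
    """Serialize mappings as tab-separated text with a header row."""
    buf = io.StringIO()
    names = [f.name for f in fields(CuratedMapping)]
    writer = csv.DictWriter(buf, fieldnames=names, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    for m in mappings:
        writer.writerow(asdict(m))
    return buf.getvalue()


# Tally how many examples fall into each category.
counts = Counter(m.category for m in EXAMPLES)
```

Calling `to_tsv(EXAMPLES)` gives a small table that can be reviewed alongside the rules, and `counts` shows how the examples are spread over the categories.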
sbello commented 2 years ago

When is broad too broad, i.e., when do you decide not to map a term rather than map it to a broader term?

We mapped the term Abnormal ventriculoarterial connection (HP:0011563) to abnormal heart and great artery attachment (MP:0010426) as the MP term could be the direct parent to the HP term but the MP does not have a specific term for that connection. Basically the terms differ on only one parameter: both refer to the same type of abnormality in the same type of anatomical structure, but the HP term has been split into more specific structure terms.

Single umbilical artery (HP:0001195) could be mapped to abnormal umbilical artery morphology (MP:0003230) but we may be better served creating a new term since there are so many other ways the umbilical arteries could be abnormal. The broad mapping is likely to often lead to incorrect or misleading associations so we may be better off skipping the association and just making a new term that is a close match.

Basic rule of thumb: if the broad term is so much broader that many of the annotations found by that mapping would not be useful, then skip calling it broad and create a new term.

We also consider creating new terms for other broad mappings. The typical rule of thumb is: do we think the new term would be useful to the users of the ontology? If yes, then we go ahead and make a new term.

anna-anagnostop commented 2 years ago

Examples of HPO terms that are so human-centric that it does not make sense to create terms for non-human mammals. These include some abnormalities of higher mental function and self-reported, location-specific pain terms/symptoms such as:

Headache HP:0002315, Brain fog HP:0033630, Confusion HP:0001289, Delirium HP:0031258, Loss of Speech HP:0002371, Chest pain HP:0100749, Arthralgia HP:0002829, Abdominal pain HP:0002027 and Neuralgia HP:0033345
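
One way such a list could be used during curation is as an exclusion set, so that these HPO terms are not flagged as needing an MP counterpart. This is only a sketch in Python: the names `HUMAN_CENTRIC_HP_TERMS` and `needs_mp_counterpart` are hypothetical, and the set contains only the terms listed above.

```python
# HPO terms listed above as too human-centric to model in non-human mammals.
HUMAN_CENTRIC_HP_TERMS = {
    "HP:0002315": "Headache",
    "HP:0033630": "Brain fog",
    "HP:0001289": "Confusion",
    "HP:0031258": "Delirium",
    "HP:0002371": "Loss of Speech",
    "HP:0100749": "Chest pain",
    "HP:0002829": "Arthralgia",
    "HP:0002027": "Abdominal pain",
    "HP:0033345": "Neuralgia",
}


def needs_mp_counterpart(hp_id: str) -> bool:
    """Return False for HPO terms on the human-centric exclusion list."""
    return hp_id not in HUMAN_CENTRIC_HP_TERMS


# Illustrative candidate list: two excluded terms plus Single umbilical
# artery (HP:0001195), which is discussed earlier in this thread.
candidates = ["HP:0002315", "HP:0001195", "HP:0002027"]
to_review = [hp_id for hp_id in candidates if needs_mp_counterpart(hp_id)]
```

Here only `HP:0001195` remains in `to_review`, since Headache and Abdominal pain are on the exclusion list.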

anna-anagnostop commented 2 years ago

More "related" examples:

increased facial hemangioma incidence MP:0031292 is related to Facial hemangioma HP:0000329.

Note: In general, the MP deals with most tumor terms as increased tumor type incidence ("greater than the expected number of tumor X, occurring in a specific population in a given time period"), not just bare tumor type terms like the HPO.
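
Because this label pattern is regular, candidate pairs of this kind could be surfaced automatically for a curator to review. The sketch below is a hypothetical Python helper, not part of any existing mapping tool: it assumes MP tumor labels follow the form "increased X incidence" and compares X with the HPO label case-insensitively. It only proposes candidates; choosing the mapping category remains a curation decision.

```python
import re

# Assumed MP label pattern for tumor terms: "increased <tumor type> incidence".
_INCIDENCE = re.compile(r"^increased\s+(?P<tumor>.+?)\s+incidence$",
                        re.IGNORECASE)


def tumor_from_incidence_label(mp_label: str):
    """Return the lower-cased tumor type from an incidence label, else None."""
    match = _INCIDENCE.match(mp_label.strip())
    return match.group("tumor").lower() if match else None


def is_incidence_counterpart(mp_label: str, hp_label: str) -> bool:
    """True if the MP label is the 'increased ... incidence' form of hp_label."""
    tumor = tumor_from_incidence_label(mp_label)
    return tumor is not None and tumor == hp_label.strip().lower()


# The pair from the example above.
pair_matches = is_incidence_counterpart(
    "increased facial hemangioma incidence", "Facial hemangioma"
)
```

For the facial hemangioma pair above, `pair_matches` is `True`; a label that does not follow the pattern, such as "decreased food intake", yields `None` from `tumor_from_incidence_label`.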

decreased food intake MP:0011940 and Anorexia HP:0002039 are related but not quite the same, as the MP term specifically refers to “a reduction in the total number of calories or food amount taken in over time when compared to the normal state” whereas the HPO term refers to “lack or loss of appetite for food”. If you are looking for anorexic mouse models you are likely to find annotations to decreased food intake MP:0011940, as a mouse/rat cannot verbally “express” a loss of appetite for food.

impaired balance MP:0001525 and Vertigo HP:0002321 (exact syn: dizzy spell, related syn: dizziness) are related, as the MP term refers to “reduced ability of an animal to maintain equilibrium”, whereas the HPO term specifies “an abnormal sensation of spinning while the body is actually stationary”. Again, a mouse/rat cannot communicate/verbalize the sensation of a dizzy spell, but this may be reported as impaired balance of the animal in a behavioral assay.

Note: In some cases, the parentage/placement of related terms within each ontology hierarchy may be quite different. For example, decreased food intake MP:0011940 is strictly a behavioral phenotype whereas Anorexia HP:0002039 is an abnormality of the digestive system. Similarly, impaired balance MP:0001525 is under abnormal motor coordination/balance (and thus a behavioral phenotype), whereas Vertigo HP:0002321 is under vestibular dysfunction (and thus an Abnormality of the ear).

nervous MP:0008912 (exact synonyms: easily agitated, skittish) is related to HPO term Agitation HP:0000713 via its synonym. However, the MP term definition specifies “increased skittish behavior induced by stimulation such as handling, touching or noise” whereas the HPO term does not include the stimulation requirement and refers to “a state of exceeding restlessness and excessive motor activity associated with mental distress or a feeling of inner tension”.

anna-anagnostop commented 2 years ago

More "close" examples:

weakness MP:0000746 and Asthenia HP:0025406 are sufficiently close but not identical as the MP term refers to a “state of being infirm or less strong than normal” while the HPO term uses weakness as an exact synonym but incorporates a “feeling” component in its definition (“state characterized by a feeling of weakness and loss of strength leading to a generalized weakness of the body”).

increased pruritus MP:0010072 and Pruritus HP:0000989 are sufficiently close as the HPO term includes “an abnormally increased disposition to experience pruritus” as part of its definition, but only the MP term uses the “increased” qualifier in its label to represent the “abnormal presence or increased intensity” of an itch (relative to a control).

anna-anagnostop commented 2 years ago

More "narrow" examples:

cardiac muscle necrosis MP:0010632 is a narrow match to Muscle fiber necrosis HP:0003713, as the MP “child” term specifies an additional (cardiac) location that the “parental” HPO term does not.

dry gangrene MP:0031127 is a narrow match to Gangrene HP:0100758, as the MP “child” term specifically refers to gangrene “in the absence of superimposed microbial infection”, includes possible causes/consequences as part of its definition, and clarifies usage in the Comment field (“the term dry is used only when referring to a limb or to the gut”).

abnormal facial morphology MP:0003743 is a narrow match to Abnormality of the face HP:0000271, as the HPO term is broader and has descendants that cover physiological abnormalities (e.g. Rhinitis HP:0012384 under Abnormality of the nose, Oral bleeding HP:0040184 under Abnormal oral physiology) in addition to morphological abnormalities.

Note: There are other cases of a known mismatch between the upper level structure of the MP and HPO. The HPO has terms for abnormal X that may cover both morphology and physiology. The MP has X phenotype, then splits into abnormal X morphology and abnormal X physiology branches. The MP does not have abnormal X that covers both.

Example of "narrow" match, where for MP to HPO, we ignore that the HPO term is limited to human: behavioral despair MP:0002573 is a narrow match to Depression HP:0000716, as the MP term refers to “depression assayed by reduced escape attempts and/or immobility when placed in a stressful situation such as a forced swim test or a suspension test; or failure to seek pleasurable stimuli” describing use of rodent-specific behavioral tests to evaluate depressive-like behavior. The HPO term refers to human-specific feelings, moods and thoughts.

anna-anagnostop commented 2 years ago

More "exact" examples:

impaired olfaction MP:0008544 (exact syn: hyposmia) is an exact match to Hyposmia HP:0004409, as the MP refers to “reduced ability to detect odors”, representing the same concept as the HPO term which is defined as “decreased sensitivity to odorants (that is, a decreased ability to perceive odors)”.

dysgeusia MP:0031092 (exact syn: parageusia) is an exact match to Parageusia HP:0031249 (exact syn: Dysgeusia) as both terms essentially refer to altered or distorted taste sensation.

Note: exact matches do not have to share the same primary term label as long as there is a high degree of confidence that the two concepts can be used interchangeably.

anna-anagnostop commented 2 years ago

Examples where review of existing MP annotations indicated that it would be useful to create new terms:

Following review of HPO terms created for the KidsFirst project, the MP added bilateral cryptorchism MP:0031287 and unilateral cryptorchism MP:0031286 as new laterality-specific child terms of cryptorchism MP:0002286 as we determined that this distinction might be useful to MP ontology users. These are now “exact” matches to Bilateral cryptorchidism HP:0008689 and Unilateral cryptorchidism HP:0012741, respectively.

Likewise, the MP added new terms for ambiguous external genitalia MP:0031377 and sex-specific child terms ambiguous female external genitalia MP:0031379 and ambiguous male external genitalia MP:0031378 as exact matches to HPO terms Ambiguous genitalia HP:0000062, Ambiguous genitalia, female HP:0000061 and Ambiguous genitalia, male HP:0000033, respectively.

Also, the MP recently added abdominal situs abnormality MP:0031284 (“an abnormality of the abdominal situs, i.e., of the sidedness of the abdomen and its organs”) as a new parent "grouping" term for existing terms abdominal situs ambiguous MP:0011250 and abdominal situs inversus MP:0011249. The new MP parent term is now an exact match to HPO term Abnormality of abdominal situs HP:0011620.

matentzn commented 2 years ago

From the meeting: we need to get a sense of whether "increased incidence of cancer A" and "cancer A" are better reflected as a closeMatch or a relatedMatch.

matentzn commented 2 years ago

We should get feedback from users on what they would expect.