TheJacksonLaboratory / ExperimentalModelSchema

Experimental Model Schema
https://thejacksonlaboratory.github.io/ExperimentalModelSchema/
MIT License

EMS element for treatments undefined #17

Open hansenp opened 1 year ago

hansenp commented 1 year ago

To interpret a measurement correctly, it is crucial to know whether any treatments were applied at the time of measurement and, if so, which treatments and over what period of time. Currently, treatments are insufficiently represented in the MPD schema (Issue #16), and an EMS element for treatments has not yet been defined.

The counterpart of a Treatments element for mice would be the MedicalAction element in the GA4GH Phenopacket schema. Since the MedicalAction element is focused on modelling patients in a clinical context, it is only partially suitable for modelling experimental mouse data. I therefore propose introducing a new element, Treatments, into the EMS. A message for a high-fat, high-sucrose diet applied from 4 to 26 weeks of age might look like this:

treatments:
- Diet:
   ageRange:
     start:
       iso8601duration: "P4W"
     end:
       iso8601duration: "P26W"
   type:
     id: "ONS:1000041" # Ontology for Nutritional Studies (ONS)
     label: "High fat diet" # No term for 'High fat high sucrose diet' in ONS
   schedule:
   - quantity:
       unit:
         id: "UO:0000021"
         label: "gram"
       value: 2.9072 # Average food intake per day
     frequency:
       id: "ONS:1000059"
       label: "Daily food intake"
   - composition:
       fat: 23.2
       carbohydrate: 47.6
       protein: 17.3
       kCalPerG: 4.7
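
As a minimal sketch of how such a message could answer the question of whether a treatment was active at the time of a measurement, the Python below checks an age against the `ageRange` of the Diet entry. The helper names `parse_duration` and `active_at` are hypothetical and not part of the EMS. The parser only handles the week and day forms of ISO 8601 durations, and treating both ends of the range as inclusive is an assumption.

```python
import re
from datetime import timedelta

# Only the subset of ISO 8601 durations used in the example (weeks or days).
_DURATION = re.compile(r"^P(?:(?P<weeks>\d+)W|(?P<days>\d+)D)$")


def parse_duration(text: str) -> timedelta:
    """Parse durations such as "P4W" or "P28D" into a timedelta."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError(f"unsupported ISO 8601 duration: {text!r}")
    if match.group("weeks") is not None:
        return timedelta(weeks=int(match.group("weeks")))
    return timedelta(days=int(match.group("days")))


def active_at(treatment: dict, age: str) -> bool:
    """Return True if the treatment's ageRange covers the given age.

    Both ends of the range are treated as inclusive (an assumption).
    """
    age_range = treatment["ageRange"]
    start = parse_duration(age_range["start"]["iso8601duration"])
    end = parse_duration(age_range["end"]["iso8601duration"])
    return start <= parse_duration(age) <= end


# The Diet entry from the message above, reduced to the fields used here.
diet = {
    "ageRange": {
        "start": {"iso8601duration": "P4W"},
        "end": {"iso8601duration": "P26W"},
    },
    "type": {"id": "ONS:1000041", "label": "High fat diet"},
}

print(active_at(diet, "P12W"))  # True: 12 weeks lies within 4 to 26 weeks
print(active_at(diet, "P30W"))  # False: 30 weeks is after the diet ended
```

A real implementation would also need to handle the remaining ISO 8601 duration forms (years, months, combined values) and treatments without a defined end.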

For other types of treatment, such as drug administration, further sub-elements could be introduced alongside Diet, for example Agent, which could document specific information such as the name of the agent and the route of administration.

I would like to point out that the data on treatments in MPD is represented in an unstructured form, which makes it impossible to automatically generate messages such as the one shown above (Issue #16). However, there is a wide range of well-documented treatments in the MPD, which may be helpful in modeling an adequate schema.

pnrobinson commented 1 year ago

I agree with the overall strategy sketched out above. It would probably be good to have more detail for composition. In protobuf, I think float fields are always present with a default value of 0.0, so it is hard to distinguish between zero and not available (both would be shown as 0.0).
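
To illustrate the zero versus not-available distinction, here is a small Python sketch in which a missing value is modelled as `None` rather than as `0.0`. The `Composition` class and the `describe` helper are hypothetical stand-ins, not part of the EMS. In protobuf itself, a comparable effect could presumably be reached with the proto3 `optional` keyword or a wrapper message such as `google.protobuf.DoubleValue`, but whether either fits the EMS is an open question.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Composition:
    """Hypothetical stand-in for the composition block; None means "not available"."""
    fat: Optional[float] = None
    carbohydrate: Optional[float] = None
    protein: Optional[float] = None
    kcal_per_g: Optional[float] = None


def describe(value: Optional[float]) -> str:
    """Render a field so that a measured zero and a missing value look different."""
    return "not available" if value is None else f"{value:g}"


# A fat-free diet whose protein content was never reported.
fat_free = Composition(fat=0.0, carbohydrate=80.0)

print(describe(fat_free.fat))      # 0
print(describe(fat_free.protein))  # not available
```

With plain float fields that default to 0.0, both calls would print the same value, which is the ambiguity described above.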

Maybe it would be good to get some of the data from MPD to set up a test case for populating this element. It should be possible to record this kind of information in a structured way moving forward, although it will be very difficult to parse this reliably from legacy data...

sbello commented 7 months ago

It could be worth looking at the Experimental Condition Ontology (XCO) from RGD. It has many treatment terms, including a large number for different diets (see, for example, http://purl.obolibrary.org/obo/XCO_0000014).

Also, look at the Alliance LinkML model for capturing this data, which is designed to be fairly flexible to accommodate varying degrees of specificity. LinkML YAML file: https://github.com/alliance-genome/agr_curation_schema/blob/main/model/schema/phenotypeAndDiseaseAnnotation.yaml