Open hansenp opened 1 year ago
I agree with the overall strategy as sketched out above. It would probably be good to have more detail for composition. In protobuf, I think float fields are always present with a default value of 0.0, so it is hard to distinguish between zero and not-available (both would be shown as 0.0).
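To illustrate the zero versus not-available concern, here is a minimal Python sketch (plain dataclasses, not generated protobuf code). The class and field names (`CompositionImplicit`, `CompositionExplicit`, `fat_percent`) are hypothetical and not part of EMS. As far as I know, proto3 offers the `optional` keyword and wrapper messages such as `google.protobuf.FloatValue` to track field presence; the second class only mimics that behaviour.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CompositionImplicit:
    """Mimics a float field without presence tracking: unset reads as 0.0."""
    fat_percent: float = 0.0  # hypothetical field name


@dataclass
class CompositionExplicit:
    """Mimics a float field with explicit presence: unset is None."""
    fat_percent: Optional[float] = None  # hypothetical field name


def describe(value: Optional[float]) -> str:
    """Render a value, keeping 'not available' distinct from zero."""
    return "not available" if value is None else f"{value} %"


# Without presence tracking, a measured zero and a missing value are identical.
ambiguous = CompositionImplicit() == CompositionImplicit(fat_percent=0.0)

# With presence tracking, the two cases stay distinguishable.
distinct = CompositionExplicit() != CompositionExplicit(fat_percent=0.0)

print(ambiguous, distinct)
print(describe(CompositionExplicit().fat_percent))
print(describe(CompositionExplicit(fat_percent=0.0).fat_percent))
```

If presence tracking is not an option, another possibility would be to only list components that were actually measured, so that absence from the list means not-available.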
Maybe it would be good to get some of the data from MPD to set up a test case for populating this element. It should be possible to record this kind of information in a structured way moving forward, although it will be very difficult to parse this reliably from legacy data...
It could be worth looking at the XCO from RGD. This has many treatment terms including a large number for different diets (see for example http://purl.obolibrary.org/obo/XCO_0000014)
Also, look at the Alliance LinkML model for capturing this data, which is designed to be fairly flexible to accommodate varying degrees of specificity. LinkML yaml file: https://github.com/alliance-genome/agr_curation_schema/blob/main/model/schema/phenotypeAndDiseaseAnnotation.yaml
In order to interpret a measurement correctly, it is crucial to know if any treatments were applied at the time of measurement, and if so, which treatments and over what period of time. Currently, treatments are insufficiently represented in the MPD schema (Issue #16) and an EMS element for treatments has not yet been defined.
The counterpart of an element for Treatments of mice would be the `MedicalAction` element in the GA4GH Phenopacket schema. Since the `MedicalAction` element is focused on modelling patients in a clinical context, it is only partially suitable for modelling experimental mouse data. I therefore propose to introduce a new element `Treatments` into the EMS. A message for a high-fat, high-sucrose diet applied from 4 to 26 weeks of life might look like this:

For other types of `Treatments`, such as drug administration, other sub-elements could be introduced alongside `Diet`, such as `Agent`, where specific information such as the name of the agent and the route of administration could be documented.

I would like to point out that the data on treatments in MPD is represented in an unstructured form, which makes it impossible to automatically generate messages such as the one shown above (Issue #16). However, there is a wide range of well-documented treatments in the MPD, which may be helpful in modeling an adequate schema.
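As a rough illustration of the proposal, the following Python sketch models a `Treatments` element with `Diet` and `Agent` sub-elements and serializes the diet example (4 to 26 weeks of life) to JSON. This is only a sketch under assumptions: every field name (`term`, `start_age`, `end_age`, `route_of_administration`, and so on) is hypothetical and not taken from EMS, the ontology ID is a placeholder rather than a real XCO term, and expressing ages as ISO 8601 durations is borrowed from the way Phenopackets represent age.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class OntologyClass:
    """An ontology term as an id/label pair (as in Phenopackets)."""
    id: str
    label: str


@dataclass
class Diet:
    """Hypothetical sub-element: a diet applied over an age interval."""
    term: OntologyClass
    start_age: str  # ISO 8601 duration, e.g. "P4W" = 4 weeks of life
    end_age: str    # ISO 8601 duration, e.g. "P26W" = 26 weeks of life


@dataclass
class Agent:
    """Hypothetical sub-element: an administered agent, e.g. a drug."""
    name: str
    route_of_administration: str


@dataclass
class Treatments:
    """Hypothetical top-level element grouping all treatments."""
    diets: List[Diet] = field(default_factory=list)
    agents: List[Agent] = field(default_factory=list)


message = Treatments(
    diets=[
        Diet(
            # Placeholder ID: the actual ontology term would need to be looked up.
            term=OntologyClass(id="XCO:PLACEHOLDER",
                               label="high-fat, high-sucrose diet"),
            start_age="P4W",
            end_age="P26W",
        )
    ]
)

print(json.dumps(asdict(message), indent=2))
```

Keeping `Diet` and `Agent` as separate lists is one possible design; a single list of treatments with a type discriminator would work as well, and the choice probably depends on how the rest of the EMS is structured.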