TheJacksonLaboratory / ExperimentalModelSchema

Experimental Model Schema
https://thejacksonlaboratory.github.io/ExperimentalModelSchema/
MIT License

ExperimentalMeasurements message undefined #14

Open hansenp opened 1 year ago

hansenp commented 1 year ago

So far, no message has been defined for measurements on individual animals. The section mainly deals with the database schema of MPD and suggests how measurement methods could be represented as ontologies. However, it does not define an EMS message for measurements on individual animals.

The stated obstacle to defining a message is that the information on measurements is mixed up with a lot of other information, such as strain or sex. The proposed solution is an ontology for measurements, which in my opinion does not fully solve the problems that arise in this context.

The information about the measurements in MPD is project-specific and partly age-specific, which is why there are more than 18,000 corresponding entries in total. The information about what was measured, and when, where and how, is only partially available in a structured and uniform form. Here is the example from the documentation, reduced to the columns that are relevant for measurements:

"animaldata": [
  {
    "measnum": 89013,
    "value": 2887.0
  }
],

"measures_info": [
  {
    "measnum": 89013,
    "projsym": "JaxCC1",
    "units": "cm",
    "ageweeks": "7-9wks",
    "method": "open field test",
    "descrip": "total distance traveled, 20 min test",
    "varname": "distance_total_OF"
  }
]

The value, units and ageweeks fields can be represented by the existing Phenopacket messages quantity and age. The information about what was measured and how it was measured is mixed together in the descrip field, and partly also in the method and varname fields.
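To make this mapping concrete, here is a minimal Python sketch that joins the two MPD rows from the example on measnum and produces a Phenopacket-style measurement. The function names `parse_ageweeks` and `to_measurement` are hypothetical, the output field names (`quantity`, `age`, `ageRange`, `iso8601duration`) are modeled on the Phenopacket Schema and should be checked against it, and the age parser assumes that only strings shaped like "7-9wks" or "8wks" occur.

```python
import re

# Rows copied from the MPD example above.
animal_row = {"measnum": 89013, "value": 2887.0}
measure_info = {
    "measnum": 89013,
    "projsym": "JaxCC1",
    "units": "cm",
    "ageweeks": "7-9wks",
    "method": "open field test",
    "descrip": "total distance traveled, 20 min test",
    "varname": "distance_total_OF",
}


def parse_ageweeks(text):
    """Convert '7-9wks' or '8wks' into a Phenopacket-style time element.

    Assumption: only these two shapes are handled; anything else raises
    ValueError and would need manual curation.
    """
    match = re.fullmatch(r"(\d+)(?:-(\d+))?\s*wks?", text.strip())
    if match is None:
        raise ValueError(f"unrecognized ageweeks value: {text!r}")
    start, end = match.group(1), match.group(2)
    if end is None:
        return {"age": {"iso8601duration": f"P{int(start)}W"}}
    return {
        "ageRange": {
            "start": {"iso8601duration": f"P{int(start)}W"},
            "end": {"iso8601duration": f"P{int(end)}W"},
        }
    }


def to_measurement(animal_row, measure_info):
    """Join one animaldata row with its measures_info row (hypothetical helper)."""
    if animal_row["measnum"] != measure_info["measnum"]:
        raise ValueError("rows describe different measures")
    return {
        # MPD provides no ontology term here, so only free text can be carried over.
        "assay": {"label": f'{measure_info["method"]}: {measure_info["descrip"]}'},
        "value": {
            "quantity": {
                "unit": {"label": measure_info["units"]},
                "value": animal_row["value"],
            }
        },
        "timeObserved": parse_ageweeks(measure_info["ageweeks"]),
    }


measurement = to_measurement(animal_row, measure_info)
```

The value, unit and age convert mechanically, but the assay can only be filled with concatenated free text, which is exactly the part that is not accessible to automatic processing.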

Furthermore, the open field test involves more than one measurement, which is typically also the case for other tests such as the glucose tolerance test. Not every single measurement made in the course of a test is informative on its own. Here, in addition to the measurement information with the ID 89013 (blue box), is some of the other measurement information for the project JaxCC1 and the method open field test (there are 30 in total):

image

For comparison, here is all the measurement information on the project Wahlsten1 and the method open field test:

image

In my opinion, these two examples demonstrate the following:

  1. Tests such as the open field test or the glucose tolerance test are not simple measurements, but typically involve additional test parameters (e.g. duration of the test, size of the area, etc.) and more than one measurement. In some cases, the result of a test could be summarized in a single metric that is comparable across different tests, here e.g. "percent-at-edge" or "percent-in-center". But even then we would still have the problem of automatically selecting the appropriate measurement from the multiple measurements performed for a particular test.

  2. If we also want to represent non-standard tests without comparable metrics in EMS packets in a meaningful way, we would need an extra message for this. However, we haven't even defined a message to represent something as simple as measuring body weight in EMS packets.

  3. In MPD, the information on the measurements is only partially available in a structured and uniform form, so automatic processing is not readily possible. The data can only be normalized by manual curation, possibly software-assisted. Transforming the data in its current form into EMS packets does not make sense, since such packets would also be inaccessible to automatic processing.

Here is my suggestion for measurement messages for individual animals:

measurements:
- assay:
    id: "LOINC:3141-9"
    label: "Body weight measured"
  value:
    quantity:
      unit:
        id: "UO_0000021"
        label: "gram"
      value: 17.9
      referenceRange:
        unit:
          id: "UO_0000021"
          label: "gram"
        low: 16.7
        high: 18.9
  timeObserved:
    age:
      iso8601duration: "P5W"
- assay:
    id: "LOINC:3695-4"
    label: "Insulin mass in plasma"
  value:
    quantity:
      unit:
        id: "UO_0000301"
        label: "microgram per liter"
      value: 0.45010
  timeObserved:
    age:
      iso8601duration: "P10W"
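
As an illustration of what such a structured message would allow, the following Python sketch checks a quantity against its optional reference range. The class and function names (`ReferenceRange`, `Quantity`, `within_reference_range`) are hypothetical and not part of EMS or Phenopackets; the values are taken from the two measurements above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReferenceRange:
    low: float
    high: float


@dataclass
class Quantity:
    unit_id: str
    unit_label: str
    value: float
    reference_range: Optional[ReferenceRange] = None


def within_reference_range(quantity):
    """Return True or False, or None when the quantity has no reference range."""
    if quantity.reference_range is None:
        return None
    rng = quantity.reference_range
    return rng.low <= quantity.value <= rng.high


# Values copied from the suggested messages above.
body_weight = Quantity("UO_0000021", "gram", 17.9, ReferenceRange(16.7, 18.9))
insulin = Quantity("UO_0000301", "microgram per liter", 0.45010)

body_weight_ok = within_reference_range(body_weight)
insulin_ok = within_reference_range(insulin)
```

Because assay, unit and age are identified by ontology terms and ISO 8601 durations rather than free text, a check like this can run without per-project curation.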