QIICR / dcmqi

dcmqi (DICOM for Quantitative Imaging) is a free, open-source C++ library for conversion between imaging research formats and the standard DICOM representation for image analysis results.
https://qiicr.gitbook.io/dcmqi-guide/
BSD 3-Clause "New" or "Revised" License

Normative stats in DICOM-SR .json #305

Closed. EmilyLindemer closed this issue 7 years ago

EmilyLindemer commented 7 years ago

Hi @fedorov, this issue follows up on our recent meeting about normative statistics for neuroimaging applications.

I was using this template to create my own JSON file for generating DICOM-SRs, but it doesn't cover my specific use case.

Lines 134-186 of the template linked above show how to include structural statistics such as min, max, and stddev for a given structure at the individual level. We would like to include individual-based statistics as well as group-based statistics.

For example, if we are creating a DICOM-SR that reports regional brain cortical thickness values, then for each structure we would want to include the following:

  1. Minimum thickness of the structure at the individual level
  2. Maximum thickness at the individual level
  3. Mean thickness at the individual level
  4. Std dev of thickness across the individual's structure
  5. Minimum thickness of the structure based on a large group of age-matched individuals
  6. Maximum thickness based on group
  7. Mean based on group
  8. Std dev based on group
  9. Z-score of individual's thickness compared to the group mean distribution for thickness

Importantly, we want to ensure that the values are interpreted properly by a clinician. In other words, make sure that they don't think that the value reported for (7) above is the individual's mean thickness.

The paper attached below demonstrates how we plan to derive normative statistics for each age range and brain structure, and it is these values that we are looking to include in our DICOM-SR along with individual-level statistics.

Thank you!

Potvin2016.pdf

fedorov commented 7 years ago

@EmilyLindemer excellent - thank you for this summary! We will discuss with @dclunie and follow up in this issue.

fedorov commented 7 years ago

@EmilyLindemer we did discuss this with @dclunie, and I forwarded him this question. I will use this issue to continue the discussion and provide updates.

One question that came up is what do you mean by "based on a large group of age-matched individuals"?

Do you mean that

  1. you would compare to an existing normative group defined in a reference, such as the one you included from Potvin et al.,

or

  2. you intend to segment a large group of individuals, and then report minimum thickness over the cohort of those individuals?

(1) is suitable for DICOM encoding, while (2) is a cohort-based measurement, and would not fit into the existing DICOM model (if I understood correctly the various discussions we had with David before).

Please let us know.

EmilyLindemer commented 7 years ago

@fedorov @dclunie thank you so much for discussing this! You are correct with (1). We are creating our own database, similar to what was done in Potvin, and will pull values from this database, based on the patient's age.

For example, if a patient is 35 years old, we will reference our database to determine what the mean hippocampal volume is for a 35-year-old, and we would like to put this value in the patient's DICOM-SR along with their own hippocampal volume. We'll also probably include other values such as the patient's hippocampal volume percentile based on this database.
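
One possible way to derive the percentile mentioned above is an empirical percentile rank against the reference values for the patient's age group. The following is only a sketch: `percentileRank` is a hypothetical helper, and it assumes the database can return the raw reference values, which the thread does not confirm.

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

// Percentile rank of a patient's value within a reference sample:
// the percentage of reference values less than or equal to the patient's value.
// The vector is taken by value so that sorting does not modify the caller's data.
double percentileRank(std::vector<double> referenceValues, double patientValue) {
    if (referenceValues.empty()) {
        throw std::invalid_argument("reference sample must not be empty");
    }
    std::sort(referenceValues.begin(), referenceValues.end());
    // upper_bound points just past the last element <= patientValue
    auto it = std::upper_bound(referenceValues.begin(), referenceValues.end(),
                               patientValue);
    const double countAtOrBelow =
        static_cast<double>(it - referenceValues.begin());
    return 100.0 * countAtOrBelow / static_cast<double>(referenceValues.size());
}
```

Other definitions of percentile rank exist (for example, counting only strictly smaller values, or fitting a distribution); whichever one is used should be stated alongside the population description.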

All calculations will be done independently of creating this DICOM-SR; we're mainly concerned with how to populate the DICOM-SR with different types of statistics that are derived from our external database, particularly in such a way that a clinician can understand that these values are not about the individual patient, but rather represent 'normal range' values.

fedorov commented 7 years ago

@EmilyLindemer we talked with @dclunie today about this topic.

Can you take a look at these tables:

We think we can find a place to put these measurements into TID1500, but first we need to make sure we have the terms needed to describe them.

EmilyLindemer commented 7 years ago

@fedorov and @dclunie thank you again for putting thought into this issue!

Those tables appear to have exactly what we need, particularly CID226, as the main results we are looking to report are mean population values as well as 5th and 95th percentiles of the population.

One thing that isn't listed in either of those tables is a code for the individual's percentile ranking based on the normative distribution. I believe that Code Value 121415 listed here may be what we're looking for, if you can confirm. This maps to the same Code Value listed in CID227 which appears to refer to an individual patient rather than a population distribution.

Would I be correct in assuming that the following is the correct way to encode the mean volume (55.987 mm^3) of a structure from a population?

      {
        "value": "55.987",
        "quantity": {
          "CodeValue": "G-D705",
          "CodingSchemeDesignator": "SRT",
          "CodeMeaning": "Volume"
        },
        "units": {
          "CodeValue": "mm3",
          "CodingSchemeDesignator": "UCUM",
          "CodeMeaning": "Cubic millimeter"
        },
        "derivationModifier": {
          "CodeValue": "R-00317",
          "CodingSchemeDesignator": "SRT",
          "CodeMeaning": "Mean Value of Population"
        }
      },

fedorov commented 7 years ago

> encode the mean volume (55.987 mm^3) of a structure from a population

@EmilyLindemer I am a bit confused. I thought you wanted to encode the relationship of the measurement on the individual level to the population stats, but from the example above, it looks like you are trying to encode the mean over a population? The latter is not supported by DICOM, since any item you are saving should refer to a specific study/patient.

I am really sorry for the delays in replying.

EmilyLindemer commented 7 years ago

@fedorov we want to include both patient-specific metrics (e.g. their left hippocampal volume) as well as their metrics in relation to the population (e.g. what is the population mean hippocampal volume, and what is the patient's percentile?)

In the example above, I did not include the subject's own mean hippocampal volume nor the percentile component, because I was most concerned with understanding how to encode the population's mean volume.

dclunie commented 7 years ago

Hi Emily, Andrey

Short answer:

NUM "left hippocampal volume" = 36.5 mm3 (TID 1419)
  > HAS PROPERTIES NUM "Mean Value of population" = 55.987 mm3 (TID 310/311/CID 226)
  > HAS PROPERTIES NUM "Percentile Ranking of measurement" = 17 % (TID 310/311/CID 227)
  > HAS PROPERTIES TEXT "Population description" = "Emily's reference population of normals" (TID 310/311)

Long answer:

There is no rule in DICOM that every number has to be about the patient, rather than about the population against which the patient is being compared, and the normal range lower and upper limits are an example of that.

It is just a question of assuring that both the proper structure and codes are used so that it is clear to the recipient what is going on.

Emily you are correct that (121415, DCM, "Percentile Ranking of measurement") is appropriate, but where it gets used is important too.

So, to the extent that any measurement template like TID 1419 ROI Measurements:

http://dicom.nema.org/medical/dicom/current/output/chtml/part16/chapter_A.html#sect_TID_1419

also includes additional sub-templates like TID 310 Measurement Properties:

http://dicom.nema.org/medical/dicom/current/output/chtml/part16/chapter_A.html#sect_TID_310

which includes TID 311 Measurement Statistical Properties:

http://dicom.nema.org/medical/dicom/current/output/chtml/part16/chapter_A.html#sect_TID_311

then we just need to be sure that the value set that is CID 221 Measurement Range Concepts:

http://dicom.nema.org/medical/dicom/current/output/chtml/part16/sect_CID_221.html

which in turn includes CID 226 Population Statistical Descriptors:

http://dicom.nema.org/medical/dicom/current/output/chtml/part16/sect_CID_226.html

and CID 227 Sample Statistical Descriptors:

http://dicom.nema.org/medical/dicom/current/output/chtml/part16/sect_CID_227.html

includes the population descriptors that you want, as well as the sample relative descriptors that you want.

I.e., we can encode:

  NUM "left hippocampal volume" = 36.5 mm3 (TID 1419)
  > HAS PROPERTIES NUM "Mean Value of population" = 55.987 mm3 (TID 310/311/CID 226)
  > HAS PROPERTIES NUM "Percentile Ranking of measurement" = 17 % (TID 310/311/CID 227)
  > HAS PROPERTIES TEXT "Population description" = "Emily's reference population of normals" (TID 310/311)

Note that the structure is important; these qualifiers are not siblings of the patient measurement, they are CHILDREN of it (with a HAS PROPERTIES relationship).
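
To make the parent/child structure concrete, here is a small self-contained C++ sketch that models the measurement and its HAS PROPERTIES children as a tree and prints it with one ">" per nesting level. `ContentItem`, `dump`, `toString`, and `makeHippocampalVolumeExample` are hypothetical names for illustration only; this is not DCMTK or dcmqi code and it does not produce DICOM.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Simplified stand-in for an SR content item (not a DCMTK type).
struct ContentItem {
    std::string relationship;  // e.g. "HAS PROPERTIES"; empty for the top-level item
    std::string valueType;     // "NUM" or "TEXT"
    std::string conceptName;   // e.g. "Mean Value of population"
    std::string value;         // numeric value with units, or quoted text
    std::vector<ContentItem> children;
};

// Write the item and its descendants, prefixing each line with one '>' per level.
void dump(const ContentItem& item, int depth, std::ostream& out) {
    for (int i = 0; i < depth; ++i) out << '>';
    if (depth > 0) out << ' ';
    if (!item.relationship.empty()) out << item.relationship << ' ';
    out << item.valueType << " \"" << item.conceptName << "\" = "
        << item.value << '\n';
    for (const auto& child : item.children) dump(child, depth + 1, out);
}

std::string toString(const ContentItem& item) {
    std::ostringstream out;
    dump(item, 0, out);
    return out.str();
}

// The patient measurement is the parent; the population statistics are its
// children, not its siblings.
ContentItem makeHippocampalVolumeExample() {
    ContentItem measurement{"", "NUM", "left hippocampal volume", "36.5 mm3", {}};
    measurement.children.push_back(
        {"HAS PROPERTIES", "NUM", "Mean Value of population", "55.987 mm3", {}});
    measurement.children.push_back(
        {"HAS PROPERTIES", "NUM", "Percentile Ranking of measurement", "17 %", {}});
    measurement.children.push_back(
        {"HAS PROPERTIES", "TEXT", "Population description",
         "\"Emily's reference population of normals\"", {}});
    return measurement;
}
```

Calling `toString(makeHippocampalVolumeExample())` yields the four lines shown above, minus the template annotations in parentheses. Because the population values hang off the patient measurement, a reader of the tree cannot mistake them for separate patient measurements.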

I.e., supporting this construct in dcmqi requires a change to the JSON schema and the code that interprets it and translates it to DICOM SR.

David

dclunie commented 7 years ago

Note that the leading ">" in the sibling content items was dropped (presumably due to confusion with the convention for quoting a previous message); i.e., it was meant to display as:

 NUM "left hippocampal volume" = 36.5 mm3 (TID 1419)
 > HAS PROPERTIES NUM "Mean Value of population" = 55.987 mm3 (TID 310/311/CID 226)
 > HAS PROPERTIES NUM "Percentile Ranking of measurement" = 17 % (TID 310/311/CID 227)
 > HAS PROPERTIES TEXT "Population description" = "Emily's reference population of normals" (TID 310/311)

using the dcsrdump convention for display of DICOM SR.

David

fedorov commented 7 years ago

@dclunie thank you for the explanation. Emily, let us know whether this meets your needs, and we can add the feature to the schema/converter.

EmilyLindemer commented 7 years ago

@fedorov @dclunie thank you so much for the thorough responses. The explanation from @dclunie makes sense to me and I believe that this schema is what we are looking for. We are also highly concerned that it is clear that certain metrics are from the patient and are distinct from metrics from the population as this is a product that we intend to be used and interpreted by clinicians. I believe that the schema where the population metrics are encoded as children of the individual's metric is appropriate as described by @dclunie and we would be greatly appreciative of a schema change to reflect this.

fedorov commented 7 years ago

@jriesmeier since TID 310 is not implemented, I was thinking of adding this functionality by iterating over the measurements from TID 1419 and manually adding the modifiers using this kind of approach:

  {
    DSRDocumentTree &st = doc.getTree();
    size_t nnid   = st.gotoAnnotatedNode("TID 1419 - Row 5");
    while(nnid){
      cout << "TID1419 - Row 5 is " << nnid << endl;
      nnid = st.gotoNextAnnotatedNode("TID1419 - Row 5");
    }
  }

However, this only works for the first measurement. Is this expected?

jriesmeier commented 7 years ago

@fedorov I haven't checked your code but looking at it, I would first try to insert the missing space character into the second annotation text (i.e. "TID 1419 - Row 5" instead of "TID1419 - Row 5").
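
The effect of the mismatched annotation text can be reproduced with a self-contained sketch. `AnnotatedNodes` below is a hypothetical stand-in that only mimics exact-match lookup by annotation string; it is not the DCMTK class, and its behaviour is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cursor over annotated tree nodes (not DCMTK).
class AnnotatedNodes {
public:
    explicit AnnotatedNodes(std::vector<std::string> annotations)
        : annotations_(std::move(annotations)) {}

    // Go to the first node whose annotation matches exactly.
    // Returns a 1-based node ID, or 0 if there is no match.
    std::size_t gotoAnnotatedNode(const std::string& text) {
        current_ = 0;
        return gotoNextAnnotatedNode(text);
    }

    // Go to the next matching node after the current one.
    // Returns its 1-based node ID, or 0 if there is no further match.
    std::size_t gotoNextAnnotatedNode(const std::string& text) {
        for (std::size_t i = current_; i < annotations_.size(); ++i) {
            if (annotations_[i] == text) {
                current_ = i + 1;
                return current_;
            }
        }
        return 0;
    }

private:
    std::vector<std::string> annotations_;
    std::size_t current_ = 0;  // 1-based ID of the current node; 0 = before the first
};

// Count the nodes visited by the loop pattern from the question, with separate
// strings for the initial lookup and for the "next" lookup.
std::size_t countVisited(AnnotatedNodes nodes, const std::string& firstText,
                         const std::string& nextText) {
    std::size_t count = 0;
    std::size_t id = nodes.gotoAnnotatedNode(firstText);
    while (id) {
        ++count;
        id = nodes.gotoNextAnnotatedNode(nextText);
    }
    return count;
}
```

With three nodes annotated "TID 1419 - Row 5", passing "TID1419 - Row 5" as the second string visits one node, while passing the same string to both calls visits all three. Keeping the annotation text in a single constant avoids this kind of mismatch.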

fedorov commented 7 years ago

Indeed, sorry for my sloppiness. Thank you Jörg!

fedorov commented 7 years ago

@EmilyLindemer I am sorry this is taking so long. I started working on this, but this is the busiest time of year, and there are some extra deadlines in this particular year....

fedorov commented 7 years ago

@EmilyLindemer the feature discussed above is added.

You can see an example of populating numeric properties and population description here: https://github.com/QIICR/dcmqi/blob/master/doc/examples/sr-tid1500-ct-liver-example.json#L132-L160.

The relevant section of the resulting SR document for that sample JSON looks like the following:

  <contains CONTAINER:(,,"Imaging Measurements")=SEPARATE>
    <contains CONTAINER:(,,"Measurement Group")=SEPARATE>
      <has obs context TEXT:(,,"Activity Session")="1">
      <has obs context TEXT:(,,"Tracking Identifier")="Measurements group 1">
      <has obs context UIDREF:(,,"Tracking Unique Identifier")="1.3.6.1.4.1.43046.3.1.4.0.45168.1510344902.854367">
      <contains CODE:(,,"Finding")=(T-D0060,SRT,"Organ")>
      <has obs context TEXT:(,,"Time Point")="1">
      <contains IMAGE:(,,"Referenced Segment")=(SG image,,1)>
      <contains UIDREF:(,,"Source series for segmentation")="1.2.392.200103.20080913.113635.1.2009.6.22.21.43.10.23430.1">
      <has concept mod CODE:(,,"Finding Site")=(T-62000,SRT,"Liver")>
      <contains NUM:(,,"Attenuation Coefficient")="37.3289" ([hnsf'U],UCUM,"Hounsfield unit")>
        <has concept mod CODE:(,,"Preprocessing operation")=(A.14,99TEST,"Top secret filter")>
        <has concept mod CODE:(,,"Preprocessing operation b")=(A.14b,99TEST,"Top secret filter b")>
        <has concept mod CODE:(,,"Derivation")=(R-00317,SRT,"Mean")>
        <inferred from NUM:(,,"Parameter A")="9.91" (A.16,99TEST,"Cats")>
        <inferred from NUM:(,,"Parameter B")="2" (A.18,99TEST,"Elephants")>
        <inferred from NUM:(,,"Parameter Z")="1" (A.20,99TEST,"Apples")>
        <has properties TEXT:(,,"Population description")="Elves of the Black Forest">
        <has properties NUM:(,,"Elf average weight")="10" (kg,UCUM,"kilogram")>
        <has properties NUM:(,,"Elf average height")="11" (cm,UCUM,"centimeter")>
      <contains NUM:(,,"Attenuation Coefficient")="-778" ([hnsf'U],UCUM,"Hounsfield unit")>
        <has concept mod CODE:(,,"Derivation")=(R-404FB,SRT,"Minimum")>
        <has properties TEXT:(,,"Population description")="Elves of the Black Forest 2">
        <has properties NUM:(,,"Elf average weight")="20" (kg,UCUM,"kilogram")>
        <has properties NUM:(,,"Elf average height")="21" (cm,UCUM,"centimeter")>
      <contains NUM:(,,"Attenuation Coefficient")="221" ([hnsf'U],UCUM,"Hounsfield unit")>
        <has concept mod CODE:(,,"Derivation")=(G-A437,SRT,"Maximum")>
      <contains NUM:(,,"Attenuation Coefficient")="59.1691" ([hnsf'U],UCUM,"Hounsfield unit")>
        <has concept mod CODE:(,,"Derivation")=(R-10047,SRT,"Standard Deviation")>
      <contains NUM:(,,"Volume")="70361.9" (mm3,UCUM,"cubic millimeter")>
      <contains NUM:(,,"Volume")="70.3619" (cm3,UCUM,"cubic centimeter")>

Please reopen the issue if something is not right!