NIH-NCPI / ncpi-model-forge

🔥 The Project Forge FHIR model
Apache License 2.0

Profile ResearchStudy to include DocumentReference #29

Open bwalsh opened 3 years ago

bwalsh commented 3 years ago

Requester information

Please provide the following information:

Request Details

Please provide the following information about what you want to accomplish with your model change request:

{
  "AnVIL_CMG_Broad_Orphan_Estonia-Ounap_WGS":{
    "referenceData_hg38_ref_fasta":"redacted bucket path",
    "referenceData_hg38_ref_alt":"redacted bucket path",
    "referenceData_hg38_ref_amb":"redacted bucket path",
    "referenceData_hg38_axiomPoly_resource_vcf_index":"redacted bucket path",
    "referenceData_hg38_ref_fasta_index":"redacted bucket path",
    "referenceData_hg38_mills_resource_vcf":"redacted bucket path",
    "referenceData_hg38_one_thousand_genomes_resource_vcf":"redacted bucket path",
    "referenceData_hg38_dbsnp_vcf_index":"redacted bucket path",
    "referenceData_hg38_one_thousand_genomes_resource_vcf_index":"redacted bucket path",
    "referenceData_hg38_call_interval_list":"redacted bucket path",
    "referenceData_hg38_ref_dict":"redacted bucket path",
    "referenceData_hg38_dbsnp_vcf":"redacted bucket path",
    "referenceData_hg38_mills_resource_vcf_index":"redacted bucket path",
    "referenceData_hg38_ref_sa":"redacted bucket path",
    "referenceData_hg38_hapmap_resource_vcf_index":"redacted bucket path",
    "referenceData_hg38_hapmap_resource_vcf":"redacted bucket path",
    "referenceData_hg38_omni_resource_vcf_index":"redacted bucket path",
    "referenceData_hg38_axiomPoly_resource_vcf":"redacted bucket path",
    "referenceData_hg38_eval_interval_list":"redacted bucket path",
    "referenceData_hg38_omni_resource_vcf":"redacted bucket path",
    "referenceData_hg38_scattered_calling_intervals_list":"redacted bucket path",
    "referenceData_hg38_unpadded_intervals_file":"redacted bucket path",
    "referenceData_hg38_ref_pac":"redacted bucket path",
    "referenceData_hg38_ref_bwt":"redacted bucket path",
    "referenceData_hg38_ref_ann":"redacted bucket path"
  },
  "AnVIL_GTEx_V8_hg38":{
    "genome_fasta_index":"redacted bucket path",
    "sample_attributes":"redacted bucket path",
    "subject_phenotypes":"redacted bucket path",
    "transcripts_gtf":"redacted bucket path",
    "genome_fasta_dict":"redacted bucket path",
    "genes_gtf":"redacted bucket path",
    "vcf_index":"redacted bucket path",
    "genome_fasta":"redacted bucket path",
    "rnaseq_genome_fasta_index":"redacted bucket path",
    "rnaseq_genome_fasta":"redacted bucket path",
    "rnaseq_genome_fasta_dict":"redacted bucket path",
    "vcf_file":"redacted bucket path"
  }
}

Example IG

torstees commented 3 years ago

We have been thinking of data like this going into a mix of Tasks, Document References and Observations. Have a look at the diagram here: https://github.com/anvilproject/cmg-data-ingest/issues/1

bwalsh commented 3 years ago

Thank you @torstees

At a high level, I agree that we should simplify and use 'stock' FHIR definitions wherever possible. Below are findings from a quick test of linking stock DocumentReference -> ResearchStudy and DocumentReference -> Specimen.

Overview

I am concerned that, while DocumentReference is described as "a document of any kind for any purpose.", the subject field, which maintains a link to a Resource ("Who/what is the subject of the document"), seems to be overly restrictive.

Even with the introduction of Task, will we be able to efficiently and elegantly note that the document refers to [ResearchStudy, Specimen]?

Stock FHIR

"description" : "A reference to a document of any kind for any purpose. Provides metadata about the document so that the document can be discovered and managed.

    {
      "id" : "DocumentReference.subject",
      "path" : "DocumentReference.subject",
      "short" : "Who/what is the subject of the document",
      "definition" : "Who or what the document is about. The document can be about a person, (patient or healthcare practitioner), a device (e.g. a machine) or even a group of subjects (such as a document about a herd of farm animals, or a set of patients that share a common exposure).",

Attempt to set DocumentReference.subject to a ResearchSubject, ResearchStudy, or Specimen

error The type "Reference(ResearchSubject)" does not match any of the allowed types: Reference(http://hl7.org/fhir/StructureDefinition/Patient | http://hl7.org/fhir/StructureDefinition/Practitioner | http://hl7.org/fhir/StructureDefinition/Group | http://hl7.org/fhir/StructureDefinition/Device)
error The type "Reference(ResearchStudy)" does not match any of the allowed types: Reference(http://hl7.org/fhir/StructureDefinition/Patient | http://hl7.org/fhir/StructureDefinition/Practitioner | http://hl7.org/fhir/StructureDefinition/Group | http://hl7.org/fhir/StructureDefinition/Device)
error The type "Reference(Specimen)" does not match any of the allowed types: Reference(http://hl7.org/fhir/StructureDefinition/Patient | http://hl7.org/fhir/StructureDefinition/Practitioner | http://hl7.org/fhir/StructureDefinition/Group | http://hl7.org/fhir/StructureDefinition/Device)
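The constraint behind these errors can be sketched as a small pre-check. This is only an illustration, not the validator's logic: the allowed types are copied from the error messages above, and the function name `subject_type_allowed` is hypothetical.

```python
# Allowed target types for DocumentReference.subject, copied from the
# validator error messages above.
ALLOWED_SUBJECT_TYPES = {"Patient", "Practitioner", "Group", "Device"}


def subject_type_allowed(reference: str) -> bool:
    """Return True if a relative reference such as 'Patient/123'
    targets a resource type allowed for DocumentReference.subject."""
    # Assumes a relative reference of the form '<ResourceType>/<id>'.
    resource_type = reference.split("/", 1)[0]
    return resource_type in ALLOWED_SUBJECT_TYPES


if __name__ == "__main__":
    for ref in ("Patient/p1", "ResearchSubject/rs1", "ResearchStudy/s1", "Specimen/sp1"):
        print(ref, subject_type_allowed(ref))
```

Only the Patient reference passes; the three types tried in the test are rejected, which matches the errors listed.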

RobertJCarroll commented 3 years ago

Thanks for pulling this together. Regarding the use case, there are some lookup / reverse lookup capabilities that will help here, so we don't have to link studies directly in.
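One way such a lookup might work, sketched under assumptions: a two-step FHIR search that first finds the ResearchSubjects enrolled in a study, then finds the documents about each enrolled patient. The base URL and the helper names are hypothetical, and whether a given server supports these search parameters would need to be checked against its capability statement.

```python
from urllib.parse import urlencode


def study_subjects_query(base_url: str, study_id: str) -> str:
    """Build a search for ResearchSubjects in a study, including the linked patients."""
    params = {
        "study": f"ResearchStudy/{study_id}",
        "_include": "ResearchSubject:individual",
    }
    # Keep '/' and ':' unescaped so the query stays readable.
    return f"{base_url}/ResearchSubject?{urlencode(params, safe='/:')}"


def documents_for_patient_query(base_url: str, patient_id: str) -> str:
    """Build a search for DocumentReferences whose subject is the given patient."""
    params = {"subject": f"Patient/{patient_id}"}
    return f"{base_url}/DocumentReference?{urlencode(params, safe='/:')}"


if __name__ == "__main__":
    base = "https://example.org/fhir"  # hypothetical server
    print(study_subjects_query(base, "s1"))
    print(documents_for_patient_query(base, "p1"))
```

With this pattern the study never has to appear in DocumentReference.subject; the link is recovered through ResearchSubject.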

For cases with multiple individuals' data in a single file, we have two options that come to mind: 1) Create a series of DocumentReferences, one for each individual in the file, each pointing to the same file. 2) Create a Group that holds all individuals in the same file.
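The two options can be sketched as plain FHIR-shaped dictionaries. This is a minimal illustration, not a proposed profile: the helper names, ids, and file URL are hypothetical, and only a handful of fields are filled in.

```python
def per_individual_references(file_url: str, patient_ids: list[str]) -> list[dict]:
    """Option 1: one DocumentReference per individual, all pointing at the same file."""
    return [
        {
            "resourceType": "DocumentReference",
            "status": "current",
            "subject": {"reference": f"Patient/{pid}"},
            "content": [{"attachment": {"url": file_url}}],
        }
        for pid in patient_ids
    ]


def group_reference(file_url: str, group_id: str, patient_ids: list[str]) -> tuple[dict, dict]:
    """Option 2: one Group holding all individuals, and one DocumentReference about it."""
    group = {
        "resourceType": "Group",
        "id": group_id,
        "type": "person",
        "actual": True,
        "member": [{"entity": {"reference": f"Patient/{pid}"}} for pid in patient_ids],
    }
    document = {
        "resourceType": "DocumentReference",
        "status": "current",
        "subject": {"reference": f"Group/{group_id}"},
        "content": [{"attachment": {"url": file_url}}],
    }
    return group, document


if __name__ == "__main__":
    docs = per_individual_references("gs://example-bucket/cohort.vcf", ["p1", "p2"])
    group, doc = group_reference("gs://example-bucket/cohort.vcf", "g1", ["p1", "p2"])
    print(len(docs), doc["subject"]["reference"], len(group["member"]))
```

Option 1 keeps each document directly searchable by patient at the cost of N resources per file; option 2 needs one extra hop through Group membership but produces a single DocumentReference.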

It does make me wonder whether it would be wise to develop some "test cases", i.e. questions that we need API calls or similar to answer, and check some of these changes against them as we get going.