microbiomedata / nmdc-schema

National Microbiome Data Collaborative (NMDC) unified data model
https://microbiomedata.github.io/nmdc-schema/
Creative Commons Zero v1.0 Universal

Map JGI template to NMDC slots, add alias #1767

Closed mslarae13 closed 8 months ago

mslarae13 commented 9 months ago

Due 02/09

All NMDC slots for JGI metadata should have the exact JGI metadata field mapped using alias or 'exact mapping'
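One way to check this requirement is sketched below, assuming the slot definitions have already been loaded into a plain dict shaped like the `slots:` section of a LinkML schema file. `aliases` and `exact_mappings` are the LinkML keys for the two mapping options; the alias text, the `JGI:` prefix, and the helper name `unmapped_slots` are hypothetical placeholders, not values taken from the JGI template or from nmdc-schema.

```python
# Slot definitions shaped like the `slots:` section of a LinkML schema file,
# written as a plain dict so the sketch needs no YAML parser.
# The alias and mapping values are hypothetical placeholders,
# not real JGI template field names.
slots = {
    "dna_isolate_meth": {"aliases": ["hypothetical JGI field label"]},
    "dna_seq_project": {"exact_mappings": ["JGI:hypothetical_field"]},
    "dnase_rna": {"description": "no mapping recorded yet"},
}


def unmapped_slots(slot_defs):
    """Return the names of slots that have neither an alias nor an exact mapping."""
    return sorted(
        name
        for name, definition in slot_defs.items()
        if not definition.get("aliases") and not definition.get("exact_mappings")
    )


# Only dnase_rna lacks both kinds of mapping in this sample data.
print(unmapped_slots(slots))
```

Run against the real slot definitions, a check like this would list the slots that still need a mapping.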

mslarae13 commented 9 months ago

https://github.com/microbiomedata/nmdc-schema/issues/1107#issuecomment-1870567071

mslarae13 commented 8 months ago

Work on this did start. However, we discussed https://github.com/microbiomedata/nmdc-schema/issues/1454, where maybe some slot mappings aren't needed because we won't need to capture them in MongoDB.

mslarae13 commented 8 months ago

The following slots are user facility (UF) specific & should be tracked. Add mappings for these slots:

| Slot | Source file | Description | Facility | Notes |
| --- | --- | --- | --- | --- |
| project_id | https://github.com/microbiomedata/nmdc-schema/blob/main/src/schema/portal/emsl.yaml | project ID | EMSL | remove from submission portal & complete it with the project # provided at 'multiomics data' tab |
| dna_isolate_meth | https://github.com/microbiomedata/nmdc-schema/blob/main/src/schema/portal/jgi_metagenomics.yaml | Describe the method/protocol/kit used to extract DNA/RNA. | JGI | Berkeley schema modeling of protocol_link.name of class Extraction |
| dna_seq_project | https://github.com/microbiomedata/nmdc-schema/blob/main/src/schema/portal/jgi_metagenomics.yaml |   | JGI | Will pull from JGI API, maps to the JGI project ID, not biosample, remove from submission portal & complete it with the project # provided at 'multiomics data' tab, useful to have associated on sample prep classes and OmicsProcessing |
| dnase_rna | https://github.com/microbiomedata/nmdc-schema/blob/main/src/schema/portal/jgi_metatranscriptomics.yaml |   | JGI | On either ProcessedSample or Prep metadata, not on biosample |
| rna_isolate_meth | https://github.com/microbiomedata/nmdc-schema/blob/main/src/schema/portal/jgi_metatranscriptomics.yaml | Describe the method/protocol/kit used to extract DNA/RNA. | JGI | Berkeley schema modeling of protocol_link.name of class Extraction |
| rna_seq_project | https://github.com/microbiomedata/nmdc-schema/blob/main/src/schema/portal/jgi_metatranscriptomics.yaml |   | JGI | Will pull from JGI API, maps to the JGI project ID, not biosample, remove from submission portal & complete it with the project # provided at 'multiomics data' tab, useful to have associated on sample prep classes and OmicsProcessing |

mslarae13 commented 8 months ago

@turbomam in light of the few slots NMDC schema does need to track, compared to the few it doesn't (#1454), should these remain in emsl.yaml and the 2 jgi_*.yaml files? Or should we move them into nmdc.yaml or basic_slots or core? (Cuz I can't keep track of what's what.)

mslarae13 commented 8 months ago

From slack

@turbomam "If some UF slots are going to be moved to submission-schema in the near future, then I would prefer not to move them around within the nmdc-schema modules. I think Alicia and I moved some slots because she was using them outside of their original UF use-case, and that kept the schema from building. We should have included you in that decision. I'm working on an issue to make all modules self-sufficient (i.e. build on their own), and this will address the question you have asked along the lines of 'how do I know which module to put new content in?' Can we leave the UF slots where they are until then?"

In summary: for the user facility slots we want to keep in NMDC schema, I'll add aliases (this issue).

In a later task we'll complete issue https://github.com/microbiomedata/nmdc-schema/issues/1454, removing from NMDC schema the user facility slots that NMDC does NOT need to track, so that they exist only in the NMDC submission portal. We will also decide later whether the slots that we DO capture should remain in separate .yaml files or be moved to basic_slots.