Open pkalita-lbl opened 2 months ago
Checking with Alicia if NMDC needs to store these slots. If so, which ones?
https://github.com/microbiomedata/issues/issues/413#issuecomment-2075464756
Removing this from Sprint 35. Not adding to a future sprint right now because it sounds like we need further input before proceeding.
Decision was made on 06/12 during the metadata meeting
From @aclum in https://github.com/microbiomedata/issues/issues/413#issuecomment-2096644352
would like to keep dna_isolate_meth and map it to a slot on NMDC's Extraction class.
We want to track dna_isolate_meth in NMDC, but this is the only slot. We need to:
POST BERK
Montana correctly pointed out in this comment that our implementation of long-read metagenomics was somewhat incomplete.
The changes implemented for https://github.com/microbiomedata/submission-schema/issues/168 added a new `JgiMgLrInterface` class. It reuses slots that are also used by the `JgiMgInterface` class. That makes sense from a pure LinkML perspective, but unfortunately it misses an important point about how submission data is brought into MongoDB, where it adheres to `nmdc-schema`.

In the submission data, one sample's metadata might be spread across multiple `submission-schema` class instances (e.g. a `SoilInterface` instance and a `JgiMgInterface` instance), linked together by the unique sample name. When going into Mongo, those instances get collapsed into one instance of the `nmdc-schema` `Biosample` class. The issue is that if, in the `submission-schema` data, one sample has both a `JgiMgInterface` instance and a `JgiMgLrInterface` instance, the slot values for one will overwrite the other when squashing into a `Biosample` instance.

This is the reason why we currently need to have pairs of slots like `dna_absorb1` and `rna_absorb1` instead of just `absorb1`. With the introduction of long-read MG metadata, these need to become triples of slots (e.g. `rna_absorb1`, `dna_absorb1`, and -- new -- `dna_lr_absorb1`).
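
To make the overwrite problem concrete, here is a minimal sketch of the squashing step. It is not the actual ingest code: `squash_by_sample` is a hypothetical helper, the `samp_name` key stands in for the unique sample name, and the absorbance values are made up for illustration.

```python
# Hypothetical illustration only; this is NOT the real NMDC ingest code.
# It mimics collapsing several submission-schema instances that share a
# sample name into one Biosample-like dict.

def squash_by_sample(rows_by_interface):
    """Merge rows from every interface into one dict per sample name."""
    biosamples = {}
    for rows in rows_by_interface.values():
        for row in rows:
            merged = biosamples.setdefault(row["samp_name"], {})
            # Same-named slots from a later interface replace earlier values.
            merged.update(row)
    return biosamples


# Case 1: both interfaces reuse one slot name (assumed name: absorb1).
shared_slots = {
    "JgiMgInterface": [{"samp_name": "s1", "absorb1": 1.8}],
    "JgiMgLrInterface": [{"samp_name": "s1", "absorb1": 2.0}],
}
collided = squash_by_sample(shared_slots)

# Case 2: each interface has its own prefixed slot, as proposed above.
distinct_slots = {
    "JgiMgInterface": [{"samp_name": "s1", "dna_absorb1": 1.8}],
    "JgiMgLrInterface": [{"samp_name": "s1", "dna_lr_absorb1": 2.0}],
}
preserved = squash_by_sample(distinct_slots)

print(collided["s1"])   # only one absorb1 value survives
print(preserved["s1"])  # both values survive under separate slot names
```

In the first case the short-read value is silently lost because both instances write to the same key. In the second case the prefixed slot names keep the two measurements apart, which is the reason for the `dna_lr_` variants.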