microbiomedata / issues

public repo for issues related to NMDC work

NEON soil GOLD records check #422

Closed. aclum closed this issue 1 year ago

aclum commented 1 year ago

Some of the legacy NEON data for metagenomes (DP1.10107.001) appears to have records in the mms_metagenomeDnaExtraction and mms_metagenomeSequencing tables for both the individual samples and the pooled samples. Example:

cut -d ',' -f 8,9,10 NEON.D01.HARV.DP*mms_metagenomeDnaExtraction.2013-11.expanded.20230113T225132Z.csv | grep HARV_001 | grep 20131122

"HARV_001-M-35-29-20131122-GEN","HARV.001.05.02.M.35.29.20131122","HARV_001-M-35-29-20131122-GEN-DNA1"
"HARV_001-O-35-29-20131122-GEN","HARV.001.05.01.O.35.29.20131122","HARV_001-O-35-29-20131122-GEN-DNA1"
"HARV_001-M-20131122-COMP","HARV.001.xx.xx.Mineral.x.x.20131122","HARV_001-M-20131122-COMP-DNA1"
"HARV_001-O-20131122-COMP","HARV.001.xx.xx.Organic.x.x.20131122","HARV_001-O-20131122-COMP-DNA1"
"HARV_001-O-30-7-20131122-GEN","HARV.001.05.05.O.30.7.20131122","HARV_001-O-30-7-20131122-GEN-DNA1"
"HARV_001-M-30-7-20131122-GEN","HARV.001.05.06.M.30.7.20131122","HARV_001-M-30-7-20131122-GEN-DNA1"
"HARV_001-M-13-7-20131122-GEN","HARV.001.05.04.M.13.7.20131122","HARV_001-M-13-7-20131122-GEN-DNA1"
"HARV_001-O-13-7-20131122-GEN","HARV.001.05.03.O.13.7.20131122","HARV_001-O-13-7-20131122-GEN-DNA1"
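To get a quick count of how many extraction records are individual versus pooled, something like the sketch below might help. It assumes that the third field of the cut output is the DNA sample ID and that the -GEN- / -COMP- part of that ID marks individual versus composite samples, which is only inferred from the rows above. The function name classify_dna_samples is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: tally individual (-GEN-) vs. pooled (-COMP-) DNA extraction records.
# Assumption: field 3 of the input is the DNA sample ID, and the -GEN-/-COMP-
# infix distinguishes individual from composite samples (inferred from the
# example rows in this issue, not from NEON documentation).

classify_dna_samples() {
  # Hypothetical helper: reads the 3-column output of the cut command on stdin.
  awk -F',' '
    { id = $3; gsub(/"/, "", id) }          # strip the CSV quotes
    id ~ /-COMP-DNA[0-9]+$/ { comp++; next }
    id ~ /-GEN-DNA[0-9]+$/  { gen++; next }
    { other++ }                              # anything matching neither pattern
    END { printf "individual=%d composite=%d other=%d\n", gen, comp, other }
  '
}

# The eight rows copied from the example in this issue.
sample_rows='"HARV_001-M-35-29-20131122-GEN","HARV.001.05.02.M.35.29.20131122","HARV_001-M-35-29-20131122-GEN-DNA1"
"HARV_001-O-35-29-20131122-GEN","HARV.001.05.01.O.35.29.20131122","HARV_001-O-35-29-20131122-GEN-DNA1"
"HARV_001-M-20131122-COMP","HARV.001.xx.xx.Mineral.x.x.20131122","HARV_001-M-20131122-COMP-DNA1"
"HARV_001-O-20131122-COMP","HARV.001.xx.xx.Organic.x.x.20131122","HARV_001-O-20131122-COMP-DNA1"
"HARV_001-O-30-7-20131122-GEN","HARV.001.05.05.O.30.7.20131122","HARV_001-O-30-7-20131122-GEN-DNA1"
"HARV_001-M-30-7-20131122-GEN","HARV.001.05.06.M.30.7.20131122","HARV_001-M-30-7-20131122-GEN-DNA1"
"HARV_001-M-13-7-20131122-GEN","HARV.001.05.04.M.13.7.20131122","HARV_001-M-13-7-20131122-GEN-DNA1"
"HARV_001-O-13-7-20131122-GEN","HARV.001.05.03.O.13.7.20131122","HARV_001-O-13-7-20131122-GEN-DNA1"'

printf '%s\n' "$sample_rows" | classify_dna_samples
# prints: individual=6 composite=2 other=0
```

Against the real file, the same function could be fed directly from the cut/grep pipeline instead of the inline rows.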

Check with Hugh to see whether NMDC is missing omics_processing_set records or whether GOLD needs to remove or merge some biosample records.

aclum commented 1 year ago

The feedback from Hugh is that both the individual samples and the composite samples were sequenced, so we need to go back to the soil samples and make extraction_set -> omics_processing_set records for these.

sujaypatil96 commented 7 months ago

@aclum I couldn't find any of the above dnaSampleID values in the mms_metagenomeSequencing tables, so I think they are only in mms_metagenomeDnaExtraction.
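For anyone who wants to repeat that check, here is a rough sketch. It assumes the dnaSampleID values from each table have already been written to plain text files, one ID per line with the quotes stripped. The helper name ids_missing_from and the file names are made up, and the sequencing ID is a placeholder, not real NEON data.

```shell
#!/usr/bin/env bash
# Sketch: list extraction dnaSampleID values that never appear among the
# sequencing dnaSampleID values.

# Hypothetical helper: print IDs listed in file $1 that are absent from file $2.
# -F fixed strings, -x whole-line match, -v invert, -f read patterns from file.
ids_missing_from() {
  grep -F -x -v -f "$2" "$1"
}

workdir="$(mktemp -d)"

# Two dnaSampleID values copied from the extraction rows in this issue.
printf '%s\n' \
  'HARV_001-M-35-29-20131122-GEN-DNA1' \
  'HARV_001-M-20131122-COMP-DNA1' > "$workdir/extraction_ids.txt"

# Placeholder standing in for the sequencing table's IDs; NOT real NEON data.
printf '%s\n' 'PLACEHOLDER-SEQUENCED-DNA1' > "$workdir/sequencing_ids.txt"

missing="$(ids_missing_from "$workdir/extraction_ids.txt" "$workdir/sequencing_ids.txt")"
printf '%s\n' "$missing"

rm -r "$workdir"
```

With this toy input both extraction IDs are printed, since neither matches the placeholder. An empty result on real data would mean every extraction ID also has a sequencing record.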