microbiomedata / nmdc_automation

Prototype automation

Napa reid data and logs #187

Open mbthornton-lbl opened 1 month ago

mbthornton-lbl commented 1 month ago

Data and log files from the Napa re-id process

aclum commented 1 week ago

Also, it appears the _updated_record_identifiers.tsv files are not the final version for all studies. nmdc:sty-11-547rwq94 should have metagenome_assembly_set, read_based_taxonomy_analysis_activity_set, and read_qc_analysis_activity_set mappings. The same goes for nmdc:sty-11-076c9980. If workflow activity results were retained for a study, please make sure its _updated_record_identifiers.tsv contains the mapping. This is particularly important for the workflow activities because we didn't keep the alternative identifiers in the Mongo records.
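A check like the one requested here could be scripted. The following is only a minimal sketch, not part of the re-id tooling: the function names `missing_sets` and `check_directory` are hypothetical, and it assumes each mapping row carries its collection name as the exact value of some tab-separated field, since the column layout of the TSV files is not described in this thread.

```python
import csv
from pathlib import Path

# Collection names taken from the comment above.
REQUIRED_SETS = {
    "metagenome_assembly_set",
    "read_based_taxonomy_analysis_activity_set",
    "read_qc_analysis_activity_set",
}


def missing_sets(tsv_path, required=REQUIRED_SETS):
    """Return the required collection names that never appear in the TSV.

    Assumption: the collection name is the exact value of some field in
    each mapping row; no particular column name or position is assumed.
    """
    seen = set()
    with open(tsv_path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            seen.update(cell.strip() for cell in row)
    return sorted(set(required) - seen)


def check_directory(directory, required=REQUIRED_SETS):
    """Map each *_updated_record_identifiers*.tsv file name to its gaps."""
    report = {}
    for path in sorted(Path(directory).glob("*_updated_record_identifiers*.tsv")):
        report[path.name] = missing_sets(path, required)
    return report
```

Calling `check_directory(".")` in the folder holding the TSV files would return a dictionary where an empty list means all three collections were found. A study whose workflow activity results were not retained would legitimately show gaps, so the output still needs to be read against what was kept for each study.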

aclum commented 1 week ago

The *_updated_record_identifiers.tsv file is missing workflow activity mappings for nmdc:sty-11-1t150432.

aclum commented 1 week ago

Double check, but I think nmdc:sty-11-33fbta56_updated_record_identifiers_l.tsv should be kept over nmdc:sty-11-33fbta56_updated_record_identifiers.tsv, because the latter is missing the metagenome_assembly_set, read_based_taxonomy_analysis_activity_set, and read_qc_analysis_activity_set mappings.

The same goes for the nmdc:sty-11-aygzgv51_updated_record_identifiers*tsv files.

aclum commented 1 week ago

nmdc:sty-11-dcqce727_updated_record_identifiers.tsv is missing sequencing workflow activities.