microbiomedata / nmdc-runtime

Runtime system for NMDC data management and orchestration
https://microbiomedata.github.io/nmdc-runtime/

review ETL scripts to exclude slot name if the value is empty or null #373

Open aclum opened 10 months ago

aclum commented 10 months ago

Please review the ingest pipelines so that a slot is excluded when its value is an empty list or null. A review of the data in MongoDB shows that this has happened in the past with the following slots. These could have come from other ingest code (e.g. EMSL ingest; there will be a separate ticket for workflows).

study_set,gold_study_identifiers
omics_processing_set,gold_sequencing_project_identifiers
biosample_set,gold_biosample_identifiers
data_object_set,alternative_identifiers
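
To re-check the collection/slot pairs listed above, something like the following could be used. It is a rough sketch, not code from nmdc-runtime: `AFFECTED_SLOTS` and `empty_or_null_filter` are hypothetical names, and the filter only relies on the standard MongoDB `$type` and `$size` query operators. Note that `$type: "null"` matches an explicit null only, not a missing field.

```python
# Hypothetical audit helper; names are illustrative, not from nmdc-runtime.
# Collection/slot pairs are the ones listed in this issue.
AFFECTED_SLOTS = {
    "study_set": "gold_study_identifiers",
    "omics_processing_set": "gold_sequencing_project_identifiers",
    "biosample_set": "gold_biosample_identifiers",
    "data_object_set": "alternative_identifiers",
}


def empty_or_null_filter(slot: str) -> dict:
    """Build a MongoDB filter for documents where `slot` is null or an empty array."""
    return {
        "$or": [
            {slot: {"$type": "null"}},  # slot present with an explicit null value
            {slot: {"$size": 0}},       # slot present as an empty array
        ]
    }


# One filter per affected collection, ready to pass to a query.
FILTERS = {coll: empty_or_null_filter(slot) for coll, slot in AFFECTED_SLOTS.items()}
```

With a pymongo database handle `db` (assumed here), `db[coll].count_documents(FILTERS[coll])` would give the number of affected documents in each collection.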

ingest pipelines to review
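
Whichever pipelines need the change, the fix probably has the same shape in each: prune the document just before it is written. A minimal sketch, assuming documents are plain Python dicts at that point; `drop_empty_slots` is a hypothetical helper name, and it only looks at top-level slots:

```python
from typing import Any


def drop_empty_slots(doc: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `doc` without slots whose value is None or an empty list.

    Other falsy values (0, False, "") are kept, since the issue only
    asks for null and empty-list values to be excluded.
    """
    return {
        slot: value
        for slot, value in doc.items()
        if value is not None and value != []
    }


# Example with made-up values:
record = {"id": "example:1", "gold_study_identifiers": [], "name": None, "depth": 0}
cleaned = drop_empty_slots(record)
```

Here `cleaned` keeps `id` and `depth` and omits the two empty slots. Nested objects are not pruned; whether that is needed depends on the pipeline.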

related to https://github.com/microbiomedata/nmdc-schema/issues/1306

aclum commented 1 day ago

A specific example, involving ORCID in the submission portal code, is in #711.