microbiomedata / nmdc_automation

Prototype automation
Issues with configs/import.yaml #231

Closed (aclum closed 1 day ago)

aclum commented 1 month ago

Running run-import with the existing YAML throws an error because it sets has_input for metagenome_sequencing_activity_set to None. This is no longer allowed by nmdc-schema (see this commit).

Moreover, there are plans to deprecate metagenome_sequencing_activity_set because it is redundant with omics_processing. When I remove the generation of this activity and update the input of Metagenome Raw Reads to be reads QC, has_input for ReadQcAnalysisActivity is not set properly.

Error:

(nersc-python) nmdcda@perlmutter:login35:/global/cfs/cdirs/m3408/users/nmdcda/repos/nmdc_automation/nmdc_automation/run_process> poetry run python run_import.py project-import /global/cfs/cdirs/m3408/users/aclum/software/nmdc/nmdc_automation/missing_neon.tsv ../../configs/import.yaml /global/cfs/cdirs/m3408/users/aclum/software/nmdc/nmdc_automation/site_configuration_delete_after_use.toml

{'detail': '{\'result\': \'errors\', \'detail\': {\'data_object_set\': [], \'metagenome_assembly_set\': [], \'metagenome_sequencing_activity_set\': ["\'None\' does not match \'^(nmdc):(bsm|procsm)-([0-9][a-z]{0,6}[0-9])-([A-Za-z0-9]{1,})$\'"], \'read_qc_analysis_activity_set\': []}}'}
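For readers who want to reproduce the check locally, here is a minimal sketch of why the value fails. The regular expression is copied from the validation error above. The helper name `is_valid_has_input` and the sample ID are made up for illustration and are not part of nmdc_automation or nmdc-schema.

```python
import re

# Pattern copied verbatim from the validation error message above.
HAS_INPUT_PATTERN = re.compile(
    r"^(nmdc):(bsm|procsm)-([0-9][a-z]{0,6}[0-9])-([A-Za-z0-9]{1,})$"
)


def is_valid_has_input(value):
    """Hypothetical helper: True if str(value) matches the ID pattern."""
    return HAS_INPUT_PATTERN.match(str(value)) is not None


# An unset input rendered as the string 'None' does not match the pattern.
none_ok = is_valid_has_input(None)

# Made-up biosample-style ID, used only to show a value that does match.
sample_ok = is_valid_has_input("nmdc:bsm-11-abc123")
```

Under this pattern only IDs with the `bsm` or `procsm` prefix are accepted, which is consistent with the alternative of populating has_input with a biosample or processed sample ID.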

Alternatives considered: populate has_input for metagenome_sequencing_activity_set with the biosample or processed sample ID.

This came up when trying to address https://github.com/microbiomedata/issues/issues/819

cc @scanon @mbthornton-lbl

aclum commented 1 month ago

If I remove this, then the inputs to ReadQC don't import properly.

aclum commented 4 weeks ago

This ticket is just to remove MetagenomeSequencingActivity. A new ticket will cover updating the logic to populate the OmicsProcessing record with has_output and to use the OmicsProcessing record to fetch has_input for the ReadQC record.
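A rough sketch of what that follow-up logic might look like, using plain dictionaries in place of real records. The function name, the record shapes, and the placeholder IDs are all assumptions for illustration. Only the field names has_input and has_output come from this thread, and the actual import code may be structured quite differently.

```python
def link_read_qc_to_omics_processing(omics_processing, read_qc, raw_reads_id):
    """Hypothetical helper: record the raw reads as an OmicsProcessing output,
    then reuse that output list as the ReadQC input."""
    outputs = omics_processing.setdefault("has_output", [])
    if raw_reads_id not in outputs:
        outputs.append(raw_reads_id)
    # ReadQC takes its input from the OmicsProcessing output instead of
    # from a MetagenomeSequencingActivity record.
    read_qc["has_input"] = list(outputs)
    return read_qc


# Placeholder records and IDs, not real NMDC identifiers.
omics = {"id": "omics-processing-placeholder"}
qc = {"id": "read-qc-placeholder"}
link_read_qc_to_omics_processing(omics, qc, "raw-reads-placeholder")
```

With this arrangement the ReadQC record never needs a MetagenomeSequencingActivity record to find its input.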

aclum commented 3 weeks ago

Blocked on #239; removing from sprint.