microbiomedata / nmdc_automation

Prototype automation

import automation logic when import files don't have upstream records. #260

Open aclum opened 1 month ago

aclum commented 1 month ago

Discovered while testing the Berkeley schema, specifically in the assert statements for test_workflow_execution_mapper in test_imports.py.

The test has 4 files that pertain to two workflow executions, but it creates 4 workflow_execution_set records based on Import: true in the yaml file. These records have an empty has_output. Technically the API still allows this, but we shouldn't be doing it, and if https://github.com/microbiomedata/nmdc_automation/issues/259 is fixed, the key won't exist at all and the API will return an error about a missing required key. Low priority for now because importing annotations without upstream records has not been a use case we need, but it could happen and we'd need to account for it. The code would need the has_output data object identifiers for the upstream workflow record, and there is no logic to handle this right now.
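
A rough sketch of the behavior the test seems to want: one record per workflow execution rather than one per file, plus a check that rejects a record with an empty or missing has_output before it reaches the API. This is not code from nmdc_automation. The dictionary keys (`import`, `workflow_execution`, `data_object_id`), the function names, and the identifiers are all hypothetical and only illustrate the grouping.

```python
from collections import defaultdict


def group_outputs_by_execution(import_files):
    """Collect data object ids per workflow execution (hypothetical input shape)."""
    grouped = defaultdict(list)
    for entry in import_files:
        # Only files flagged for import contribute to a record.
        if entry.get("import"):
            grouped[entry["workflow_execution"]].append(entry["data_object_id"])
    return dict(grouped)


def build_records(import_files):
    """Build one record per workflow execution instead of one per file."""
    grouped = group_outputs_by_execution(import_files)
    return [
        {"id": execution_id, "has_output": output_ids}
        for execution_id, output_ids in grouped.items()
    ]


def check_has_output(record):
    """Raise if has_output is missing or empty, rather than submitting the record."""
    if not record.get("has_output"):
        raise ValueError(f"record {record.get('id')!r} has no has_output")


# Four files that belong to two workflow executions (made-up identifiers).
files = [
    {"import": True, "workflow_execution": "wfx-a", "data_object_id": "dobj-1"},
    {"import": True, "workflow_execution": "wfx-a", "data_object_id": "dobj-2"},
    {"import": True, "workflow_execution": "wfx-b", "data_object_id": "dobj-3"},
    {"import": True, "workflow_execution": "wfx-b", "data_object_id": "dobj-4"},
]

records = build_records(files)
for record in records:
    check_has_output(record)
```

With this input, `build_records` returns two records, each with two data object ids. The case described above, where annotations are imported without an upstream record, is not covered: the sketch assumes every imported file already names the workflow execution it belongs to.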