ssarrafan closed this issue 7 months ago
@dwinston and @dehays can one of you own this issue (maybe do the first draft?) so we can try having GH issues assigned to one person at a time?
For Sprint 3, filter rules to be applied post-ETL (against both GOLD-sourced and EMSL-sourced metadata):
biosamples
  must be part_of one of the included studies
omics_processing
  must be part_of one of the included studies
  must have a non-empty, non-null has_input value (must be associated with input biosamples)
  (The above should already be true, as the ETL will drop non-conformant omics_processing records and has_input is now required.)
  must be referenced by at least one analysis activity via was_informed_by (must have an associated analysis)
Thanks for documenting this @dehays. Can this issue be closed? Or does it need to go to July or the backlog?
Since this is the last open issue in sprint 3 and I haven't heard back on what to do with it, I'm going to move it to the July sprint. If this is not a priority for July, please let me know so I can remove it from the sprint. @dehays @emileyfadrosh @kfagnan
I'm moving this to the 'backlog' and removing it from active sprints.
Closing this old issue from 2021. FYI @kfagnan @emileyfadrosh @lamccue
Backlog cleanup 12-2023
This issue is related to issues 312 and 309: https://github.com/microbiomedata/nmdc-metadata/issues/309 https://github.com/microbiomedata/nmdc-metadata/issues/312
@dehays and @dwinston to implement business logic for JGI and EMSL to filter appropriate data into NMDC