microbiomedata / nmdc-metadata

Managing metadata and policy around metadata in NMDC
https://microbiomedata.github.io/nmdc-schema/

Create generalized rule(s) from business logic for JGI and EMSL to improve appropriate filtering of data #348

Closed ssarrafan closed 7 months ago

ssarrafan commented 3 years ago

This issue is related to issues 312 and 309:
https://github.com/microbiomedata/nmdc-metadata/issues/309
https://github.com/microbiomedata/nmdc-metadata/issues/312

@dehays and @dwinston to implement business logic for JGI and EMSL to filter appropriate data into NMDC

ssarrafan commented 3 years ago

@dwinston and @dehays can one of you own this issue (maybe do the first draft?) so we can try having GH issues assigned to one person at a time?

dehays commented 3 years ago

For Sprint 3 - filter rules to be applied post ETL (against both GOLD and EMSL sourced metadata):

biosamples
- must be part_of one of the included studies

omics_processing
- must be part_of one of the included studies
- must have a non-empty / non-null has_input value (must be associated with input biosamples).
  (The above should be true already, as the ETL will drop non-conformant omics_processing and has_input is now required.)
- must be referenced by at least one was_informed_by analysis activity (must have associated analysis)
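
The rules above could be sketched roughly as follows. This is only an illustration, not the actual NMDC ETL code: it assumes records are plain dicts with an `id` key, that `part_of` and `has_input` may be a single value or a list, and that each analysis activity carries a `was_informed_by` reference to an omics_processing id. The function names (`as_list`, `in_included_study`, `filter_post_etl`) are hypothetical.

```python
def as_list(value):
    """Normalize a missing, scalar, or list-valued slot to a list."""
    if value is None:
        return []
    if isinstance(value, (list, tuple, set)):
        return list(value)
    return [value]


def in_included_study(record, included_study_ids):
    """True if the record is part_of at least one included study."""
    return any(s in included_study_ids for s in as_list(record.get("part_of")))


def filter_post_etl(biosamples, omics_processing, analyses, included_study_ids):
    """Apply the post-ETL filter rules (assumed record shape: plain dicts)."""
    included = set(included_study_ids)

    # IDs of omics_processing records referenced by at least one analysis activity.
    informed = {
        op_id
        for activity in analyses
        for op_id in as_list(activity.get("was_informed_by"))
    }

    # biosamples: must be part_of one of the included studies.
    kept_biosamples = [b for b in biosamples if in_included_study(b, included)]

    # omics_processing: included study, non-empty has_input, associated analysis.
    kept_omics = [
        op
        for op in omics_processing
        if in_included_study(op, included)
        and any(as_list(op.get("has_input")))  # rejects None, [], and empty strings
        and op.get("id") in informed
    ]
    return kept_biosamples, kept_omics
```

Running the filter after the ETL, rather than inside it, keeps the rules independent of whether a record came from GOLD or EMSL, which matches the intent stated above.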

ssarrafan commented 3 years ago

Thanks for documenting this @dehays. Can this issue be closed? Or does it need to go to July or the backlog?

ssarrafan commented 3 years ago

Since this is the last open issue in Sprint 3 and I haven't heard back on what to do with it, I'm going to move it to the July sprint. If this is not a priority for July, please let me know so I can remove it from the sprint. @dehays @emileyfadrosh @kfagnan

ssarrafan commented 2 years ago

I'm moving this to the backlog and removing it from active sprints.

ssarrafan commented 7 months ago

Closing this old issue from 2021. FYI @kfagnan @emileyfadrosh @lamccue

Backlog cleanup 12-2023