Closed jdhayhurst closed 2 years ago
Rather than a black/white list, we can use the metadata for the files to store the validation status - if a file is valid, harmonise it; otherwise don't.
Note: if files are "flipped" into a published state, we need to a) update this in the YAML + b) touch the sumstats file, so that they will be picked up by the above.
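A minimal sketch of step b), assuming the sync process detects work by comparing file modification times against the time of its last run (the issue does not say how files are picked up, so that is an assumption). The names `touch_sumstats` and `modified_since` are hypothetical and are not part of gwas-utils.

```python
import os
from pathlib import Path


def touch_sumstats(sumstats_path: Path) -> float:
    """Set the file's access/modification time to now and return the new mtime.

    os.utime raises FileNotFoundError if the file is missing, so a typo in
    the path cannot silently create an empty sumstats file.
    """
    os.utime(sumstats_path)
    return sumstats_path.stat().st_mtime


def modified_since(path: Path, last_run: float) -> bool:
    """Return True if the file was modified after the last sync run.

    last_run is a POSIX timestamp (assumed to be recorded by the sync job).
    """
    return path.stat().st_mtime > last_run
```

Updating the YAML (step a) would happen before the touch, so that the next sync run sees both the new status and the new modification time.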
To reduce the amount of "glue" we can add this into the ftp sync process - https://github.com/EBISPOT/gwas-utils/ftpSummaryStatsScript/ftp_sync.py
The harmonisation pipeline will process anything in a "ready to harmonise" directory. We need a script that deposits files into this directory, provided they are eligible.
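The deposit step could look roughly like the sketch below. It assumes each file's metadata has already been parsed (e.g. from its YAML) into a dict, and that validation status lives under a key called `is_valid`; that key name, and the function names `is_eligible` and `deposit_eligible`, are hypothetical.

```python
import shutil
from pathlib import Path


def is_eligible(metadata: dict) -> bool:
    """A file is eligible only if its metadata records a passing validation.

    "is_valid" is a hypothetical key name; a missing key counts as not valid.
    """
    return metadata.get("is_valid") is True


def deposit_eligible(candidates: dict, ready_dir: Path) -> list:
    """Copy every eligible sumstats file into ready_dir.

    candidates maps a sumstats file path to its parsed metadata dict.
    Returns the list of paths created inside ready_dir.
    """
    ready_dir.mkdir(parents=True, exist_ok=True)
    deposited = []
    for sumstats_path, metadata in candidates.items():
        if is_eligible(metadata):
            target = ready_dir / Path(sumstats_path).name
            # copy2 keeps the source file's modification time where possible
            shutil.copy2(sumstats_path, target)
            deposited.append(target)
    return deposited
```

Files whose metadata is missing the status, or marks them invalid, are simply skipped, which is what replaces the black/white list.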
Reduce “glue” by adding this to the existing Sumstats (ftp) release script, which already needs to identify and release newly submitted files.
By adding this to the existing “nightly sumstats ftp sync” script, we increase the complexity of what that script does, but we remove another gluey script. We can remove a further glue script by using the publishing directives of Nextflow.
more info: https://docs.google.com/document/d/1b1g9PIUH6B688_aqBIaZulOtgCEnEXNcirJ1vUUmJq8/edit#