combine_scorefiles already reads headers as a dictionary. The information extracted from the PGS Catalog-style header should be combined into a log/JSON file, indexed by scoring file accession, that includes:
[ ] Original genome build per scoring file
[ ] Original number of variants
[ ] Traits
[ ] PGS Name
[ ] Citation
[ ] Which columns were used, variant sources, harmonisation status
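A minimal sketch of how such a log could be assembled, assuming each header has already been parsed into a dictionary. The header key names (`genome_build`, `variants_number`, `trait_reported`, `pgs_name`, `citation`, `HmPOS_build`) are assumed to follow the PGS Catalog scoring file header; the functions `summarise_header` and `build_log`, the output field names, and the example values are hypothetical placeholders, not existing pgsc_calc code. Variant sources are left out for brevity.

```python
import json

# Assumed header keys to copy into the log (not confirmed against combine_scorefiles)
LOG_FIELDS = ("genome_build", "variants_number", "trait_reported", "pgs_name", "citation")


def summarise_header(header, columns_used):
    """Reduce one parsed header dictionary to the fields wanted in the log."""
    entry = {field: header.get(field) for field in LOG_FIELDS}
    entry["columns_used"] = list(columns_used)
    # Assumption: harmonised files carry an HmPOS_build key in their header
    entry["harmonised"] = "HmPOS_build" in header
    return entry


def build_log(headers, columns_used):
    """Build a log indexed by scoring file accession.

    headers: {accession: header dict}
    columns_used: {accession: list of column names read from that file}
    """
    return {
        accession: summarise_header(header, columns_used.get(accession, []))
        for accession, header in headers.items()
    }


# Placeholder input, for illustration only
example_headers = {
    "PGS000001": {
        "pgs_name": "example_score",
        "genome_build": "GRCh37",
        "variants_number": "77",
        "trait_reported": "example trait",
        "citation": "example citation",
    }
}
example_columns = {"PGS000001": ["rsID", "effect_allele", "effect_weight"]}

log = build_log(example_headers, example_columns)
log_json = json.dumps(log, indent=2)
```

Indexing by accession keeps lookups in later steps to a single dictionary access per scoring file.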
Changes will need to be made to pgsc_calc to use metadata from the json rather than an API call. This will have the added benefit of not requiring internet access for any step of the pipeline after download_scorefiles. Changes needed:
modules/local/score_report.nf
bin/report.Rmd
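As a rough illustration of the offline lookup that would replace the API call, the sketch below reads one accession's metadata back from a JSON log. It is written in Python for consistency with the log-writing step; the report itself is an R Markdown document, so the real change would look different. The function `load_score_metadata`, the file name `log.json`, and the example values are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def load_score_metadata(log_path, accession):
    """Return the metadata for one accession from a local JSON log (no network access)."""
    with open(log_path) as handle:
        log = json.load(handle)
    if accession not in log:
        raise KeyError(f"{accession} not found in {log_path}")
    return log[accession]


# Placeholder log written to a temporary directory, for illustration only
with tempfile.TemporaryDirectory() as tmpdir:
    log_path = Path(tmpdir) / "log.json"
    log_path.write_text(json.dumps({"PGS000001": {"pgs_name": "example_score"}}))

    metadata = load_score_metadata(log_path, "PGS000001")

    try:
        load_score_metadata(log_path, "PGS999999")
        missing_raised = False
    except KeyError:
        missing_raised = True
```

Failing loudly on a missing accession makes a mismatch between the log and the requested scores visible at report time rather than producing an empty table.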