PGScatalog / pgsc_calc

The Polygenic Score Catalog Calculator is a nextflow pipeline for polygenic score calculation
https://pgsc-calc.readthedocs.io/en/latest/
Apache License 2.0

Extra QA for scoring #141

Open smlmbrt opened 1 year ago

smlmbrt commented 1 year ago

Description of feature

Add additional checks to make sure that all variants in the scoring file have been included in the calculation on the samples. Currently this check only runs on test data, but it should also run on real data to ensure the SUMs are always directly comparable across datasets. Related to #139.

smlmbrt commented 4 months ago

In #244 we make sure that all scoring files have yielded results.

DarioS commented 4 months ago

Does "score correlation tests" refer to something like a heatmap? It would be nice to see one as standard in the HTML report. [image] This plot tells us whether newer PGS are genuinely novel or are more of the same and of dubious value.

smlmbrt commented 3 months ago

Specifically, should this check that the .vars file is identical to the list of variants in the scoring file, e.g. using a diff command?
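
A minimal sketch of that comparison in Python, not the pipeline's actual implementation. It assumes both the .vars file and the scoring file have already been reduced to plain lists of variant IDs in the same ID format; the function name `compare_variants` and the example IDs are hypothetical.

```python
def compare_variants(expected_ids, scored_ids):
    """Compare variant IDs expected from the scoring file with those actually scored.

    Returns (missing, unexpected) as sorted lists:
    missing    = in the scoring file but absent from the .vars list
    unexpected = in the .vars list but absent from the scoring file
    """
    expected, scored = set(expected_ids), set(scored_ids)
    missing = sorted(expected - scored)
    unexpected = sorted(scored - expected)
    return missing, unexpected


# Toy example with made-up variant IDs (assumed format, for illustration only).
expected_ids = ["1:100:A:G", "1:200:C:T", "2:300:G:A"]
scored_ids = ["1:100:A:G", "2:300:G:A"]

missing, unexpected = compare_variants(expected_ids, scored_ids)
if missing or unexpected:
    print(f"Variant mismatch: {len(missing)} missing, {len(unexpected)} unexpected")
```

Unlike a line-by-line diff, a set comparison ignores ordering, so the two lists do not need to be sorted the same way first.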

smlmbrt commented 3 months ago

> Does "score correlation tests" refer to something like a heatmap? It would be nice to see one as standard in the HTML report. [image] This plot tells us whether newer PGS are genuinely novel or are more of the same and of dubious value.

No, this seems too related to custom analyses and not general use of the pipeline. It's also trivial to do by reading in the pgs file, pivoting wide, and running cor in R.
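
For readers who want the correlation themselves, the steps described (read the scores, pivot wide, correlate) can be sketched as follows. This is an illustration in Python rather than R, and it assumes the scores are available as long-format rows of (sample, PGS label, SUM); the row layout, function names, and example values are hypothetical and not taken from the pipeline's output specification.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def pivot_wide(rows):
    """Turn long-format (sample, pgs, value) rows into {pgs: {sample: value}}."""
    wide = defaultdict(dict)
    for sample, pgs, value in rows:
        wide[pgs][sample] = value
    return wide


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def score_correlations(rows):
    """Pairwise correlation between scores, using samples shared by both."""
    wide = pivot_wide(rows)
    result = {}
    for a, b in combinations(sorted(wide), 2):
        shared = sorted(set(wide[a]) & set(wide[b]))
        result[(a, b)] = pearson(
            [wide[a][s] for s in shared], [wide[b][s] for s in shared]
        )
    return result


# Made-up values for three samples and two placeholder scores.
rows = [
    ("s1", "PGS_A", 1.0), ("s2", "PGS_A", 2.0), ("s3", "PGS_A", 3.0),
    ("s1", "PGS_B", 2.0), ("s2", "PGS_B", 4.0), ("s3", "PGS_B", 6.0),
]
correlations = score_correlations(rows)
```

The resulting dictionary of pairwise correlations is what a heatmap would be drawn from; plotting is left to whichever library the reader prefers.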