xihaoli / STAAR

An R package for performing STAAR procedure in whole-genome sequencing studies
GNU General Public License v3.0

Ranks Used For Generating Phred Scores For Functional Annotations #13

Closed: HasanA75 closed this issue 2 years ago

HasanA75 commented 2 years ago

Hello Xihao,

If we only want to analyze selected candidate regions of the genome (e.g., 100 genes or one chromosome), should we use each variant's rank relative to the whole genome, its rank among only the variants included in the analysis, or a mixture of the two? We ask because we have pre-calculated genome-wide rank scores for some annotations, but not for every annotation that we may want to include. Our options are listed below. Could you let us know which of them would work?
1) For all the annotation categories that we are interested in, get the rank relative to all the variants included in the analysis (e.g., if we only analyzed 10k variants across 100 genes, then the rank should be from 1-10k in each category) and disregard the genome-wide rank score.
2) Always get the genome-wide ranks (could range from 1-100 million) for all the variants, even if we only analyzed 10k variants in about 100 genes.
3) For the annotations with pre-calculated genome-wide ranks, use the genome-wide ranks; for the annotations without pre-calculated genome-wide ranks, rank them within the variants included in the analysis. The rank would calibrate itself within its own category.
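
To make the difference between the options concrete, here is a minimal Python sketch. It assumes the PHRED-scaled annotation score is computed as -10 * log10(rank / M), where rank 1 is the most functional variant and M is the number of variants in the reference set. The function names, the example scores, and the genome-wide ranks are hypothetical, and ties are not handled.

```python
import math

def within_set_ranks(scores):
    """Rank variants among themselves; rank 1 = highest raw score.

    Assumption: higher raw score means more functional, and ties are
    broken by input order rather than averaged.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks

def phred_from_rank(rank, total):
    """PHRED-scaled score for a variant ranked `rank` out of `total`."""
    return -10 * math.log10(rank / total)

# Hypothetical raw annotation scores for four analyzed variants.
scores = [0.9, 0.1, 0.5, 0.7]

# Option 1: rank only among the analyzed variants (M = 4 here).
local_ranks = within_set_ranks(scores)
option1 = [phred_from_rank(r, len(scores)) for r in local_ranks]

# Option 2: pre-calculated genome-wide ranks (hypothetical values),
# with M set to the genome-wide number of variants.
genome_ranks = [1_000, 90_000_000, 40_000_000, 5_000_000]
genome_total = 100_000_000
option2 = [phred_from_rank(r, genome_total) for r in genome_ranks]

# Option 3 mixes the two: each annotation category uses whichever
# ranking is available, so the scale can differ between categories.
for name, values in (("option 1", option1), ("option 2", option2)):
    print(name, [round(v, 2) for v in values])
```

Under this assumed formula, the same variant can get a very different PHRED score depending on the reference set: the top-scoring variant here gets about 6 when ranked among four variants but 50 when ranked genome-wide.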

xihaoli commented 2 years ago

Hi Hasan,

Thanks for your question. Based on your description, all three options should be statistically valid (meaning that the type I error rate would not be inflated), though the power and the resulting interpretations could differ slightly. In our original STAAR paper, we used genome-wide ranks for all annotations because we performed genome-wide scans in both the gene-centric and the genetic region (2-kb sliding window) analyses of the WGS data.

Best, Xihao

HasanA75 commented 2 years ago

Hello Xihao,

Thank you for this.

Sincerely, Hasan