Hi Sam,

Thank you for the great application. I am using PRSice 2.3.3 (module prsice/2.3.3) with a case/control phenotype. The content of the .summary file is below. The Num_SNP value is the same as the number of SNPs in the .snp file. Is this correct? When I filter the .snp file for p-values <= 0.0160843, I get about 13K SNPs.
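For reference, the filtering I describe can be sketched roughly as follows. This is a minimal Python sketch, not part of PRSice: it assumes the .snp file is a whitespace-delimited table with a header row containing a p-value column named `P` (an assumption about the layout; adjust `p_column` if the header differs). The function name and the toy rows are made up purely for illustration and are not my real data.

```python
def count_snps_at_threshold(lines, threshold, p_column="P"):
    """Count rows whose p-value is <= threshold.

    `lines` is any iterable of text lines (an open file works).
    The first non-empty line is treated as the header.
    """
    p_idx = None
    count = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if p_idx is None:
            # Header row: locate the p-value column (assumed to be named "P").
            p_idx = fields.index(p_column)
            continue
        try:
            p_value = float(fields[p_idx])
        except ValueError:
            continue  # skip non-numeric entries such as "NA"
        if p_value <= threshold:
            count += 1
    return count


# Hypothetical toy rows, for illustration only.
TOY_SNP_LINES = [
    "CHR SNP BP P Base",
    "1 rs1 1000 0.0005 1",
    "1 rs2 2000 0.0160843 1",
    "2 rs3 3000 0.2 1",
    "2 rs4 4000 NA 1",
]

n_at_threshold = count_snps_at_threshold(TOY_SNP_LINES, 0.0160843)
n_total = count_snps_at_threshold(TOY_SNP_LINES, 1.0)
print(n_at_threshold, n_total)
```

On the real output, the same function can be given an open file handle for the .snp file. Comparing the count at the reported threshold with the count at 1.0 shows how many SNPs each cutoff would retain.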

Phenotype Set Threshold PRS.R2 Full.R2 Null.R2 Prevalence Coefficient Standard.Error P Num_SNP

Base 1 0.0160843 0.0277735 0.0116893 - 29150.8 2266.75 7.53813e-38 283748

PRSice 2.3.3 (2020-08-05)
https://github.com/choishingwan/PRSice
(C) 2016-2020 Shing Wan (Sam) Choi and Paul F. O'Reilly
GNU General Public License v3
If you use PRSice in any published work, please cite:
Choi SW, O'Reilly PF.
PRSice-2: Polygenic Risk Score Software for Biobank-Scale Data.
GigaScience 8, no. 7 (July 1, 2019)
2023-04-08 11:42:55
/usr/local/apps/prsice/2.3.3/bin/PRSice \
    --a1 effect_allele \
    --a2 other_allele \
    --all-score \
    --bar-levels 0.001,0.05,0.1,0.2,0.3,0.4,0.5,1 \
    --base base.QC_fin.gz \
    --binary-target T \
    --bp base_pair_location \
    --chr chromosome \
    --clump-kb 250kb \
    --clump-p 1.000000 \
    --clump-r2 0.100000 \
    --cov UK_BB_pheno.tsv \
    --cov-col @PC[1-10],sex \
    --cov-factor sex \
    --interval 5e-05 \
    --lower 5e-08 \
    --num-auto 22 \
    --or \
    --out results/pheno \
    --pheno UK_BB_pheno.tsv \
    --pheno-col pheno \
    --print-snp \
    --pvalue p_value \
    --seed 258927981 \
    --snp variant:id \
    --stat OR \
    --target-list tune.list \
    --thread 4 \
    --upper 0.5

Initializing Genotype info from file: tune.list (bed)

Start processing base.QC_fin

Base file: base.QC_fin.gz
GZ file detected. Header of file is:

chromosome base_pair_location effect_allele other_allele OR standard_error effect_allele_frequency p_value het_i2 het_p_value n_samples n_cases n_studies rsid variant:id

8864696 variant(s) observed in base file, with:
278670 variant(s) located on haploid chromosome
8586026 total variant(s) included from base file

Loading Genotype info from target

162402 people (75730 male(s), 86672 female(s)) observed
162402 founder(s) included

998009 variant(s) not found in previous data
5133529 variant(s) included

Phenotype file: UK_BB_pheno.tsv
Column Name of Sample ID: FID+IID
Note: If the phenotype file does not contain a header, the column name will be displayed as the Sample ID which is expected.

There are a total of 1 phenotype to process

Start performing clumping

Number of variant(s) after clumping : 283748

Processing the 1 th phenotype

pheno is a binary phenotype
161554 control(s)
848 case(s)

Processing the covariate file: UK_BB_pheno.tsv

Include Covariates:
Name    Missing    Number of levels
sex     0          2
PC1     0          -
PC2     0          -
PC3     0          -
PC4     0          -
PC5     0          -
PC6     0          -
PC7     0          -
PC8     0          -
PC9     0          -
PC10    0          -

After reading the covariate file, 162402 sample(s) included in the analysis

There are 1 region(s) with p-value less than 1e-5. Please note that these results are inflated due to the overfitting inherent in finding the best-fit PRS (but it's still best to find the best-fit PRS!). You can use the --perm option (see manual) to calculate an empirical P-value.

Num_SNP matches snps in .snp file #316