choishingwan / PRSice

A software package for calculating, applying, evaluating and plotting the results of polygenic risk scores
http://prsice.info
GNU General Public License v3.0

Sumscore option yields float PRS #366

Closed: wiegertj closed this issue 1 month ago

wiegertj commented 1 month ago

Hello,

I am running PRSice2 (v2.3.5) with a base file whose effect size column, BETA, is set to 1 for every variant, together with the option --score sum. My objective is to calculate just the risk allele count, regardless of effect sizes. I expected this to give integer PRS values at every threshold. However, for lower P_t I mostly obtain integers (with some floats among them), while for larger P_t I get only float PRS values.

Why is this the case? I already tried the --hard option, running without LD, and running without --fastscore, but the results are the same. It would be great to know what causes the non-integer PRS in this setup. Many thanks in advance!

The command for reference:

$PRSICE_EXEC \
    --a1 Allele1 \
    --a2 Allele2 \
    --bar-levels 1e-08,1e-07,1e-06,1e-05,1e-04,1e-03 \
    --base $BASE_FILE \
    --fastscore \
    --beta \
    --binary-target T \
    --bp POS \
    --chr CHR \
    --clump-kb 500kb \
    --clump-p 1.000000 \
    --clump-r2 0.200000 \
    --ld $LD_REFERENCE \
    --memory 4gb \
    --no-regress \
    --num-auto 22 \
    --out $OUTPUT_DIR \
    --pvalue p_value \
    --seed 1640568366 \
    --snp VARID \
    --stat BETA \
    --target $TARGET_FILE \
    --thread 1 \
    --score sum \
    --print-snp \
    --interval 5e-05

Example output for one sample:

FID  IID  Pt_1e-08  Pt_1e-07  Pt_1e-06  Pt_1e-05    Pt_0.0001   Pt_0.001
...  ...  94        138       210       369.445698  725.977797  1605.38761

choishingwan commented 1 month ago

Missing genotypes are imputed using the MAF, which is a double, so the summed score is no longer an integer. To disable that, you will need to use --missing SET_ZERO

Sam
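
To make the arithmetic behind this answer concrete, here is a minimal sketch with made-up numbers. It does not call PRSice. The toy genotypes, the frequencies, and the `score` helper are all hypothetical, and it assumes a missing genotype is replaced by its expected allele count (2 × allele frequency); the exact imputation formula PRSice uses is not stated in this thread.

```shell
#!/bin/sh
# Hypothetical toy data: one sample, four SNPs, every effect size equal to 1.
# Columns: allele count (NA = missing genotype), allele frequency.
genotypes='2 0.30
1 0.12
NA 0.2228
0 0.45'

# Sum the allele counts; "mode" selects how a missing genotype is handled.
score() {
    printf '%s\n' "$genotypes" | awk -v mode="$1" '
        $1 == "NA" { if (mode == "impute") sum += 2 * $2; next }
        { sum += $1 }
        END { print sum }'
}

imputed=$(score impute)   # NA replaced by 2 * frequency (assumed imputation rule)
zeroed=$(score zero)      # NA contributes nothing, analogous to SET_ZERO

echo "mean-imputed sum: $imputed"
echo "set-to-zero sum:  $zeroed"
```

With these toy values the imputed sum is 3.4456 and the zeroed sum is 3. A single missing genotype is enough to make the score fractional, which would also explain why larger P_t thresholds, which include more SNPs, almost never yield integers. In the original command, the fix suggested above amounts to adding `--missing SET_ZERO`.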

wiegertj commented 1 month ago

That was my problem 👍 Many thanks for the help and your great software!

Julius