chrchang / plink-ng

A comprehensive update to the PLINK association analysis toolset. Beta testing of the first new version (1.90), focused on speed and memory efficiency improvements, is finishing up. Development is now focused on building out support for multiallelic, phased, and dosage data in PLINK 2.0.
https://www.cog-genomics.org/plink/2.0/

--score precision error #248

Closed smlmbrt closed 1 year ago

smlmbrt commented 1 year ago

Hi there, I've noticed that the SUM and AVG columns are missing one digit of precision when you use the plink2 --score command. You can see it clearly with a scoring file containing a single variant:

ID      effect_allele   pgs_af
22:20441888:T:C T       -4.997371e-05

And the results scoring it on HGDP:

#IID    population      latitude        longitude       region  ALLELE_CT       DENOM   NAMED_ALLELE_DOSAGE_SUM pgs_af_AVG      pgs_af_SUM
HGDP00001       Brahui  30.5    66.5    CENTRAL_SOUTH_ASIA      2       2       1       -2.49869e-05    -4.99737e-05
HGDP00003       Brahui  30.5    66.5    CENTRAL_SOUTH_ASIA      2       2       0       0       0
HGDP00005       Brahui  30.5    66.5    CENTRAL_SOUTH_ASIA      2       2       1       -2.49869e-05    -4.99737e-05
HGDP00007       Brahui  30.5    66.5    CENTRAL_SOUTH_ASIA      2       2       2       -4.99737e-05    -9.99474e-05
[... the rest are omitted]

The possible values of SUM should be [0, -4.997371e-05, -9.994742e-05], but they are [0, -4.99737e-05, -9.99474e-05]. I get the same results in v1.90b7, v2.00a3.3, and v2.00a4. This didn't change when I provided the weight as -0.00004997371 either.

chrchang commented 1 year ago

Not a bug. Practically all plink2 commands print floating-point values to 6 significant digits; printing to "full precision" would inflate file sizes while adding negligible scientific value. The source code is available if you actually need more digits of precision to be printed here.

smlmbrt commented 1 year ago

@chrchang, thanks for the explanation! So if I understand correctly, full precision is used for the calculation internally, and only the printed output is limited to 6 significant digits?

chrchang commented 1 year ago

That is correct.
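
For readers who want to see the effect in isolation, here is a minimal Python sketch of the behaviour described above: the score is computed in double precision and only rounded when it is printed. It assumes that 6-significant-digit output behaves like C-style `%g` formatting; plink2 has its own number-printing code, so this is an illustration rather than a reproduction of it. The helper name `fmt6` is hypothetical.

```python
# Effect weight from the one-variant scoring file in the report above.
weight = -4.997371e-05

def fmt6(x: float) -> str:
    """Format x with 6 significant digits, C "%g" style (hypothetical helper).

    Adding 0.0 turns a negative zero into a plain zero so it prints as "0".
    """
    return "%g" % (x + 0.0)

# Dosage of the effect allele can be 0, 1 or 2; ALLELE_CT is 2 in the report.
for dosage in (0, 1, 2):
    score_sum = weight * dosage   # kept at full double precision
    score_avg = score_sum / 2     # SUM divided by the allele count
    print(dosage, fmt6(score_avg), fmt6(score_sum), repr(score_sum))
```

Under that assumption, the formatted columns match the SUM and AVG values in the table, while `repr(score_sum)` shows that the extra digit is still present in the stored value and is dropped only at print time.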