PGScatalog / pgscatalog_utils

(superseded by pygscatalog) Utilities for working with PGS Catalog API and scoring files
Apache License 2.0

Read IIDs as text #65

Closed: smlmbrt closed this issue 6 months ago

smlmbrt commented 9 months ago
I was finally able to reproduce this bug: it happens when all the IDs in the psam are numeric!
$ cat numeric_OCE.psam | head
#IID    SEX     population      latitude        longitude       region
655     1       Bougainville    -6      155     OCEANIA
$ cat target_pcs/001.pcs | head
IID     PC1     PC2     PC3     PC4     PC5     PC6     PC7     PC8     PC9     PC10
655     -20.6509        28.8798 -18.9365        -0.6973 -1.0790 0.0627  -0.8486 2.1669  -14.1303        -8.9729
$ gzcat aggregated_scores.txt.gz | head
sampleset       IID     DENOM   PGS000004_hmPOS_GRCh38_SUM      PGS000018_hmPOS_GRCh38_SUM      PGS000027_hmPOS_GRCh38_SUM      PGS000036_hmPOS_GRCh38_SUM      PGS000065_hmPOS_GRCh38_SUM   PGS000889_hmPOS_GRCh38_SUM      PGS003436_hmPOS_GRCh38_SUM      PGS000004_hmPOS_GRCh38_AVG      PGS000018_hmPOS_GRCh38_AVG      PGS000027_hmPOS_GRCh38_AVG      PGS000036_hmPOS_GRCh38_AVG   PGS000065_hmPOS_GRCh38_AVG      PGS000889_hmPOS_GRCh38_AVG      PGS003436_hmPOS_GRCh38_AVG
HGDP    655     7300910.0       -0.93377        0.41891999999999996     38.78698        -2359.2295599999998     -0.11405929999999999    41.225443       4.27355 -1.2789775521133666e-07      5.7379148626678035e-08  5.312622673064043e-06   -0.0003231418494406861  -1.5622614167275037e-08 5.646617065543884e-06   5.853448405746681e-07
reference       HG00096 7300910.0       -0.47219999999999995    -0.3971499999999999     38.28005        -2272.516       -0.3333011      39.4819 4.93062 -6.467686904783102e-08  -5.4397328552194165e-08      5.243188862758204e-06   -0.0003112647601463379  -4.565199406649308e-08  5.407805328376874e-06   6.753432106408653e-07

Fix handling of numeric-only IIDs in:

_Originally posted by @smlmbrt in https://github.com/PGScatalog/pgsc_calc/issues/177#issuecomment-1805938524_
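A minimal sketch of how numeric-only IIDs can go wrong, assuming the files are read with a pandas-style reader that infers column types (the thread does not say which reader was involved, so this is an illustration, not the actual pgscatalog_utils code). The toy data and the `read_tsv` helper are made up for the example. A column containing only `655` is inferred as an integer, while a column that also contains `HG00096` stays text, so the same sample no longer matches across tables until the IID column is read as text explicitly.

```python
import io

import pandas as pd

# Toy stand-ins for a psam file and a score table (not the real files)
PSAM = "#IID\tSEX\tpopulation\n655\t1\tBougainville\n"
SCORES = (
    "sampleset\tIID\tDENOM\n"
    "HGDP\t655\t7300910.0\n"
    "reference\tHG00096\t7300910.0\n"
)


def read_tsv(text, **kwargs):
    """Hypothetical helper: parse tab-separated text with pandas."""
    return pd.read_csv(io.StringIO(text), sep="\t", **kwargs)


# Default type inference: the all-numeric IID column becomes an integer
# column, while the mixed IID column in the score table is kept as text.
psam_inferred = read_tsv(PSAM)
scores = read_tsv(SCORES)
iid_is_integer = pd.api.types.is_integer_dtype(psam_inferred["#IID"])

# The integer 655 and the string "655" are different keys, so no IDs overlap.
overlap_inferred = set(psam_inferred["#IID"]) & set(scores["IID"])

# Reading the IID column as text keeps the identifiers comparable.
psam_text = read_tsv(PSAM, dtype={"#IID": str})
overlap_text = set(psam_text["#IID"]) & set(scores["IID"])

print(iid_is_integer, overlap_inferred, overlap_text)
```

Forcing the ID column to text at read time avoids depending on what the other IDs in the file happen to look like.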

nebfield commented 8 months ago

https://github.com/PGScatalog/pgscatalog_utils/releases/tag/v0.4.3

smlmbrt commented 8 months ago

Reopening because it was only fixed in ancestry

smlmbrt commented 8 months ago

Checked that this won't affect fraposa_pgsc, because the pyplink reader reads IIDs as strings (https://github.com/lemieuxl/pyplink/blob/710b4270c2ef8b90ba82c960c3087e046fe0a654/pyplink/pyplink.py#L333-L339)

smlmbrt commented 6 months ago

Closed by #78