PGScatalog / pgscatalog_utils

(superseded by pygscatalog) Utilities for working with PGS Catalog API and scoring files
Apache License 2.0

PGS002807 contains variant beyond chromosome size #86

Open mfasold opened 4 months ago

mfasold commented 4 months ago

I am not sure if this is a database issue, or due to pgscatalog_utils.

If you download the scoring file of PGS002807

pgscatalog-download -i PGS002807 -o . -b GRCh38

the result contains the line

19 101658108 G A 0.0001219005 Author-reported 19 101658108 True True

However, chromosome 19 in hg38 (GRCh38) is only 58617616 bp long, so position 101658108 lies beyond the end of the chromosome. This leads to problems in downstream analyses.
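
For reference, here is a minimal sketch of the kind of range check that flags this row. It is not part of pgscatalog_utils: the function name `out_of_range` and the `GRCH38_LENGTHS` table are hypothetical, and it assumes chromosome and position are the first two whitespace-separated columns, as in the line quoted above. Only the chromosome 19 length from this report is filled in; a real check would load all lengths, for example from a FASTA index.

```python
# Partial table of GRCh38 chromosome lengths (hypothetical helper data).
# Only chr19 is taken from the report above; other chromosomes are omitted.
GRCH38_LENGTHS = {"19": 58617616}


def out_of_range(rows, lengths=GRCH38_LENGTHS):
    """Return (chrom, pos) pairs whose position falls outside the chromosome.

    Assumes each row is whitespace-separated, with chromosome and position
    in the first two columns. Rows on chromosomes missing from `lengths`
    are not checked.
    """
    flagged = []
    for row in rows:
        fields = row.split()
        if len(fields) < 2 or not fields[1].isdigit():
            continue  # skip headers, comments, and malformed rows
        chrom, pos = fields[0], int(fields[1])
        limit = lengths.get(chrom)
        if limit is not None and (pos < 1 or pos > limit):
            flagged.append((chrom, pos))
    return flagged


rows = ["19 101658108 G A 0.0001219005 Author-reported 19 101658108 True True"]
print(out_of_range(rows))
```

Running the same check on the original author-submitted file would show whether the position was already out of range before any processing.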

nebfield commented 3 months ago

Thanks for the report! This looks like a validation issue with author-submitted data:

$ pgscatalog-download -i PGS002807 -o . # grab original data submitted by author
$ zgrep 101658108 PGS002807.txt.gz
19  101658108   G   A   0.0001219005

We'll have a look at the score validation process.