Cloufield / gwaslab

A Python package for handling and visualizing GWAS summary statistics. https://cloufield.github.io/gwaslab/
GNU General Public License v3.0

rsid_to_chrpos failed #63

Closed chenyangjjj closed 8 months ago

chenyangjjj commented 8 months ago

Hi Yunye,

I used rsid_to_chrpos to assign the CHR column of my data, but it failed for 99% of the variants. Do you know the possible reason for this?

test2 = test.data.head(10)
test2_gl = gl.Sumstats(test2, rsid="rsID", sep=",", build="38")
test2_gl.rsid_to_chrpos(path=gl.get_path("1kg_dbsnp151_hg38_auto"))

The first several rows of the input look like this:

rsID | POS | STATUS
rs1268887972 | 10550 | 3899999
rs1834149098 | 10552 | 3899999
rs1834149149 | 10554 | 3899999
rs1271555979 | 10570 | 3899999
rs1207895240 | 10572 | 3899999
rs1232471101 | 10574 | 3899999
rs1275034139 | 10576 | 3899999
rs1352235814 | 10579 | 3899999
rs1191964414 | 10587 | 3899999
rs1409842294 | 10598 | 3899999

The first several rows of the output look like this:

image
Cloufield commented 8 months ago

Hi,

The conversion table I prepared only covers the variants in the 1KG datasets (~80M variants). It seems that most of the rsIDs in your list do not exist in the 1KG datasets. You can check the reference file path using gl.get_path("1kg_dbsnp151_hg38_auto"). To convert rsIDs that are not in 1KG, you may need to prepare a similar reference file covering the scope of your list.
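One way to confirm that coverage is the cause is to count how many of the query rsIDs appear in the reference table at all. The following is a minimal sketch, not part of gwaslab: the helper `rsid_coverage`, the tab-separated layout, and the `rsID` column name are assumptions for illustration, and the toy reference rows are invented. For the real file, inspect its header first and open it with the appropriate reader if it is compressed.

```python
import csv
import io


def rsid_coverage(query_rsids, ref_handle, rsid_column="rsID"):
    """Return (found, missing) sets of query rsIDs.

    ref_handle is any open text handle over a tab-separated table with a
    header row; rsid_column is the (assumed) name of its rsID column.
    """
    wanted = set(query_rsids)
    found = set()
    for row in csv.DictReader(ref_handle, delimiter="\t"):
        if row[rsid_column] in wanted:
            found.add(row[rsid_column])
    return found, wanted - found


# Toy reference table standing in for the real file; these rows are made up.
toy_reference = io.StringIO(
    "rsID\tCHR\tPOS\n"
    "rs_toy_1\t1\t100\n"
    "rs_toy_2\t1\t200\n"
)

# Two rsIDs from the report above plus one that exists in the toy table.
query = ["rs1268887972", "rs1834149098", "rs_toy_1"]

found, missing = rsid_coverage(query, toy_reference)
print(f"{len(found)} of {len(query)} rsIDs are present in the reference")
print("missing:", sorted(missing))
```

If nearly all query rsIDs end up in `missing`, the failed conversion reflects the scope of the reference table rather than a problem with the input.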

chenyangjjj commented 8 months ago

I agree it is a reference file issue. I will try to create one for my own task. Thanks!
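For anyone building such a reference, a rough sketch of one possible approach is to pull rsID, chromosome, and position out of a VCF that covers the variants of interest. The function name `vcf_to_lookup_rows`, the output column names, and their order are assumptions here, not the layout gwaslab requires; compare against the bundled file located with gl.get_path("1kg_dbsnp151_hg38_auto") before using the result. The toy VCF records are invented.

```python
import csv
import io


def vcf_to_lookup_rows(vcf_handle):
    """Yield (rsID, CHR, POS) for each VCF record that carries an rsID."""
    for line in vcf_handle:
        if line.startswith("#"):
            continue  # skip meta-information and the column header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # not a complete record
        chrom, pos, ids = fields[0], fields[1], fields[2]
        # The VCF ID column may hold several semicolon-separated identifiers.
        for variant_id in ids.split(";"):
            if variant_id.startswith("rs"):
                yield variant_id, chrom, int(pos)


# Toy VCF content; a real source would be a dbSNP or cohort VCF file.
toy_vcf = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t100\trs_toy_1\tA\tG\t.\t.\t.\n"
    "1\t200\t.\tC\tT\t.\t.\t.\n"
    "2\t300\trs_toy_2;rs_toy_3\tG\tA\t.\t.\t.\n"
)

rows = list(vcf_to_lookup_rows(toy_vcf))

# Write a tab-separated lookup table (column names are an assumption).
out = io.StringIO()
writer = csv.writer(out, delimiter="\t", lineterminator="\n")
writer.writerow(["rsID", "CHR", "POS"])
writer.writerows(rows)
lookup_tsv = out.getvalue()
print(lookup_tsv)
```

Records without an rsID in the ID column are skipped, so the resulting table contains only variants that can actually be looked up by rsID.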