Cloufield / gwaslab

A Python package for handling and visualizing GWAS summary statistics. https://cloufield.github.io/gwaslab/
GNU General Public License v3.0

rsID Annotation for need to be fixed #75

Closed by chenyangjjj 5 months ago

chenyangjjj commented 5 months ago

Hi Yunye,

I am using assign_rsid to infer rsIDs for build 38, but I received the error "-rsID Annotation for need to be fixed!". My code is below; I only input a test dataset of 1000 SNPs (header below). Do you know the reason? It works with 1kg_dbsnp151_hg19_auto on its own, but not with GCF_000001405.40.gz. Does the assign_rsid function use the information in SNPID, or the combination of the separate CHR, BP, REF, and ALT columns?

I removed the "chr" prefix from SNP and re-ran it; same issue.

import gwaslab as gl

sumstats = gl.Sumstats("/home/chenyang.jiang/temp/Amsterdam_combined_topmed_imputation_qc03_QCed.test2", snpid="SNP", chrom="CHR", pos="BP", ref="ref", alt="alt", other=['keep'], sep="\t", build="38")

sumstats.basic_check(n_cores=20)
sumstats.assign_rsid(ref_rsid_tsv=gl.get_path("1kg_dbsnp151_hg38_auto"),
                     ref_rsid_vcf="GCF_000001405.40.gz",
                     chr_dict=gl.get_number_to_NC(build="38"),
                     n_cores=20)
Cloufield commented 5 months ago

Hi, instead of ref/alt, you need to use ea/nea, like this:

sumstats = gl.Sumstats("/home/chenyang.jiang/temp/Amsterdam_combined_topmed_imputation_qc03_QCed.test2", snpid="SNP", chrom="CHR", pos="BP", nea="ref", ea="alt", other=['keep'], sep="\t", build="38")

gwaslab uses CHR, POS, EA, and NEA to match rsIDs in VCF files. REF and ALT are for other purposes (not implemented yet).
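
To illustrate why the EA and NEA columns have to be present, here is a toy sketch of allele-aware rsID lookup in plain Python. It is not gwaslab's actual code: the `lookup_rsid` helper, the `TOY_VCF` records, and the rsIDs are all made up for illustration, and the matching rule (NEA equals the VCF REF allele, EA is one of the ALT alleles) is an assumption about how such a match could work.

```python
from typing import Optional

# Hypothetical stand-in for VCF records: (CHROM, POS, ID, REF, ALT alleles).
# The rsIDs are invented placeholders, not real dbSNP entries.
TOY_VCF = [
    ("1", 1000, "rs_example_1", "A", ("G",)),
    ("1", 2000, "rs_example_2", "C", ("T", "G")),
]


def lookup_rsid(chrom, pos, ea, nea, records=TOY_VCF) -> Optional[str]:
    """Return the ID of the first record matching position and alleles, else None."""
    for r_chrom, r_pos, r_id, r_ref, r_alts in records:
        # The position must match first.
        if (r_chrom, r_pos) != (str(chrom), pos):
            continue
        # Assumed rule: NEA must equal REF and EA must be one of the ALT alleles.
        if nea == r_ref and ea in r_alts:
            return r_id
    return None


variants = [
    {"CHR": 1, "POS": 1000, "EA": "G", "NEA": "A"},  # position and alleles match
    {"CHR": 1, "POS": 2000, "EA": "G", "NEA": "C"},  # matches the second ALT allele
    {"CHR": 1, "POS": 2000, "EA": "A", "NEA": "C"},  # position matches, alleles do not
]
rsids = [lookup_rsid(v["CHR"], v["POS"], v["EA"], v["NEA"]) for v in variants]
print(rsids)  # ['rs_example_1', 'rs_example_2', None]
```

In this sketch, a variant with no EA/NEA values can never satisfy the allele check, which is consistent with the fix above: the allele columns need to be loaded as ea/nea rather than ref/alt.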

chenyangjjj commented 5 months ago

Thanks, it works now!