MRCIEU / gwasglue

Linking GWAS data to analytical tools in R

gwasvcf_to_ldsc #23

Closed NathanSkene closed 3 years ago

NathanSkene commented 3 years ago

Hey,

I'm interested in using GWASVCF files to run LDSC. Seems like the best way to do it would be to create a gwasvcf_to_ldsc function here, right?

Have you had a go at this yet, and do you have any tips on setting it up? LDSC expects the gwasvcf to be converted to a standard sumstats format (SNP CHR BP A1 A2 P Z). Do you have any example code for doing this?

Thanks

Tagging in: @Al-Murphy @roxyisat-rex

bschilder commented 3 years ago

There's a package called VCF-kit that can easily convert VCF to a tabular format:

https://vcf-kit.readthedocs.io/en/latest/vcf2tsv/

Example

vk vcf2tsv wide --print-header spliceai_scores.raw.snv.hg19.vcf.gz > spliceai_scores.raw.snv.hg19.tsv
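For the specific columns mentioned above (SNP CHR BP A1 A2 P Z), a minimal Python sketch of the conversion might look like the following. It is not code from gwasglue or LDSC. It assumes the GWAS-VCF FORMAT keys `ES` (effect size), `SE` (standard error) and `LP` (-log10 p-value), a single sample column, and that ALT is the effect allele; the function name `gwasvcf_to_sumstats_rows` is hypothetical.

```python
def gwasvcf_to_sumstats_rows(lines):
    """Turn GWAS-VCF text lines into dicts with SNP CHR BP A1 A2 P Z.

    Assumptions (not confirmed by the thread): one sample column, FORMAT
    keys ES, SE and LP are present, and ALT is the effect allele.
    """
    rows = []
    for line in lines:
        # Skip meta-information and header lines, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, rsid, ref, alt = fields[:5]
        fmt_keys, sample = fields[8], fields[9]
        stats = dict(zip(fmt_keys.split(":"), sample.split(":")))
        # Skip variants with missing statistics ("." in VCF).
        needed = [stats.get(key, ".") for key in ("ES", "SE", "LP")]
        if "." in needed:
            continue
        es, se, lp = (float(value) for value in needed)
        if se == 0:
            continue  # Z is undefined without a standard error
        rows.append({
            "SNP": rsid,
            "CHR": chrom,
            "BP": int(pos),
            "A1": alt,        # assumed effect allele
            "A2": ref,
            "P": 10 ** -lp,   # LP is stored as -log10(p)
            "Z": es / se,
        })
    return rows
```

To run this on a compressed file, the lines could be read with `gzip.open(path, "rt")` from the standard library and the resulting rows written out with `csv.DictWriter` using a tab delimiter.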

explodecomputer commented 3 years ago

Thanks @NathanSkene @bschilder for the suggestion! We actually forked ldsc, and the fork can read gwasvcf files directly: https://github.com/explodecomputer/ldsc/

e.g.

./ldsc.py --h2 ieu-a-2.vcf.gz --w-ld-chr eur_w_ld_chr/ --ref-ld-chr eur_w_ld_chr/ --out results

I'll make a pull request to the original LDSC repository to see if they will incorporate these new changes.