input format for summary statistics for munging?

omerwe / polyfun

PolyFun (POLYgenic FUNctionally-informed fine-mapping)

MIT License

Hello there,

Thanks for this wonderful software and wiki. I was able to install it seamlessly using conda. I have GWAS summary statistics for an admixed minority population. The data were imputed in-house, so the SNP names follow the CHR:POS:Allele1:Allele2 format rather than rsids.

I would like to create a parquet-format file from the summary statistics file.

May I know which column headers munge_polyfun_sumstats.py requires? Also, does the script require the columns to be in any specific order?

The documentation says the input should be a file "with SNP rsids, chromosome and base pair info, and either a p-value, an effect size estimate and its standard error, a Z-score or a p-value."

The sample data from BOLT-LMM has the columns SNP, CHR, BP, INFO, BETA, SE.
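
To make the question concrete, here is a minimal sketch of how IDs in CHR:POS:Allele1:Allele2 form could be split into separate columns named like the BOLT-LMM sample header. The helper name `split_variant_id` and the `A1`/`A2` column names are assumptions made for illustration; whether munge_polyfun_sumstats.py expects these exact names is the open question here.

```python
def split_variant_id(variant_id):
    """Split an ID of the form CHR:POS:Allele1:Allele2 into named fields.

    SNP, CHR and BP mirror the BOLT-LMM sample header; A1 and A2 are
    assumed names for the two alleles (hypothetical, not confirmed).
    """
    parts = variant_id.split(":")
    if len(parts) != 4:
        raise ValueError(
            f"expected CHR:POS:Allele1:Allele2, got {variant_id!r}"
        )
    chrom, pos, allele1, allele2 = parts
    return {
        "SNP": variant_id,   # keep the original ID as the SNP name
        "CHR": chrom,        # left as a string so labels like 'X' survive
        "BP": int(pos),      # base-pair position as an integer
        "A1": allele1,
        "A2": allele2,
    }


if __name__ == "__main__":
    # Made-up example ID, only to show the shape of the result.
    print(split_variant_id("1:12345:A:G"))
```

The chromosome is kept as a string on purpose, since in-house imputation pipelines sometimes use non-numeric labels.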

Thanks.

input format for summary statistics for munging? #37