biona001 / GhostKnockoffGWAS

Knockoff-based analysis of GWAS summary statistics data
MIT License

Custom Reference Panel in PLINK Format #28

Closed by AFSHINJAM 4 months ago

AFSHINJAM commented 5 months ago

Hello Benjamin,

I am attempting to use the GhostKnockoffGWAS tool with my own reference panel for chr17 in PLINK format (bed, bim, fam files). I have converted my PLINK files to VCF format and attempted to use the solve_block function (https://biona001.github.io/GhostKnockoffGWAS/dev/man/solveblocks/), but I am unable to locate the solve_block executable in the bin directory and receive an error.

Could you provide guidance on:

  1. How to properly integrate a custom reference panel in PLINK format?
  2. The correct way to locate or generate the solve_block executable?

Thank you for your time and for developing this valuable tool.

biona001 commented 5 months ago

Hi @AFSHINJAM

Thanks for reaching out. To answer your questions,

  1. We currently do not allow users to customize LD files based on PLINK (.bed/.bim/.fam) files. This is mainly because PLINK 1 binary files have no notion of ref/alt alleles: by default the alleles are stored as A1 and A2 based on MAF (the rarer allele becomes A1). Thus, if you directly converted your PLINK data into VCF format, there is a high chance that the ref/alt alleles in the VCFs were assigned arbitrarily. If so, you will need to re-align ref/alt against a reference panel using, e.g., the conform-gt software. I will add this description to the documentation in the near future, after I come back from vacation.
  2. If you download the latest app_linux_x86.tar.gz (currently v0.2.1), and extract its contents, the executable is located inside app_linux_x86/bin/solveblock.
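
As a rough illustration of the ref/alt concern in point 1, the sketch below compares the REF/ALT columns of a PLINK-converted VCF against those of a reference panel and labels each shared site as matching, swapped, or mismatched. It is not part of GhostKnockoffGWAS or conform-gt: the function names `parse_vcf_alleles` and `classify_sites` and the toy records are made up for illustration, and it assumes biallelic sites keyed by chromosome and position. Strand flips are not handled and simply land in the "mismatch" bucket.

```python
# Hypothetical helpers for illustration only; these names do not come from
# GhostKnockoffGWAS or conform-gt. Assumes biallelic sites.
from typing import Dict, Iterable, List, Tuple

Site = Tuple[str, int]          # (chromosome, position)
Alleles = Tuple[str, str]       # (REF, ALT)


def parse_vcf_alleles(lines: Iterable[str]) -> Dict[Site, Alleles]:
    """Map (chrom, pos) -> (REF, ALT) from VCF text, skipping header lines."""
    alleles: Dict[Site, Alleles] = {}
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        alleles[(chrom, int(pos))] = (ref, alt)
    return alleles


def classify_sites(
    target: Dict[Site, Alleles], reference: Dict[Site, Alleles]
) -> Dict[str, List[Site]]:
    """Label each site shared with the reference as match, swapped, or mismatch."""
    report: Dict[str, List[Site]] = {"match": [], "swapped": [], "mismatch": []}
    for site, (ref, alt) in target.items():
        if site not in reference:
            continue  # site absent from the reference panel; not assessed here
        panel_ref, panel_alt = reference[site]
        if (ref, alt) == (panel_ref, panel_alt):
            report["match"].append(site)
        elif (ref, alt) == (panel_alt, panel_ref):
            report["swapped"].append(site)   # REF and ALT are exchanged
        else:
            report["mismatch"].append(site)  # different alleles or strand flip
    return report


# Toy records, invented for this example.
converted = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "17\t100\trs1\tA\tG\t.\t.\t.",
    "17\t200\trs2\tT\tC\t.\t.\t.",
    "17\t300\trs3\tG\tA\t.\t.\t.",
]
panel = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "17\t100\trs1\tA\tG\t.\t.\t.",
    "17\t200\trs2\tC\tT\t.\t.\t.",
    "17\t300\trs3\tG\tT\t.\t.\t.",
]

report = classify_sites(parse_vcf_alleles(converted), parse_vcf_alleles(panel))
for label, sites in report.items():
    print(label, sites)
```

A non-empty "swapped" list would suggest the converted VCF needs re-alignment before LD files are built from it; a check like this only detects the problem and does not fix the genotypes.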

Out of curiosity, is there a reason you need to generate your own LD files? Why not download the ones we've prepared?

AFSHINJAM commented 4 months ago

Hi @biona001,

Thank you for your quick response! I was able to successfully generate LD files using the solveblock executable with individual-level data stored in VCF format instead of PLINK files. To answer your question about why we are generating our own LD files: we want to use our own LD files to compare with other approaches that use different LD files, and also to evaluate how sensitive the approach is to the choice of LD source.

Thank you again.

biona001 commented 4 months ago

@AFSHINJAM Good to know that the new feature worked for you. Although I'm a little concerned about the ref/alt issue, I guess it could be avoided if you are careful enough. Let me know if you have any further questions.