PowerBacGWAS is a computational pipeline for conducting power calculations for bacterial GWAS. It uses existing collections of bacterial genomes to establish the sample sizes required to detect statistically significant associations for a given genotype frequency and effect size (or phenotype heritability). It supports a range of genomic variation, including SNPs, indels, and variation in gene content (pan-genome). Here, we make the code available and provide installation and usage instructions. PowerBacGWAS can be applied to any bacterial population. Here we applied it to three different bacterial species: Enterococcus faecium, Klebsiella pneumoniae, and Mycobacterium tuberculosis.
The easiest and recommended way to install and run PowerBacGWAS is via its Docker/Nextflow implementation.
You will need to:
git clone https://github.com/francesccoll/powerbacgwas/
cd powerbacgwas/nextflow
nextflow run main.nf --help
See the PowerBacGWAS wiki page for examples of Nextflow commands.
PowerBacGWAS consists of a set of Python and R scripts that will work provided that all required dependencies below (both Python modules and software) are installed on your local machine.
Download the latest release from this GitHub repository, or clone it:
git clone https://github.com/francesccoll/powerbacgwas/
cd powerbacgwas/
As the pipeline uses scripts from PastML and PySeer, clone their GitHub repositories into the downloaded powerbacgwas folder:
git clone https://github.com/evolbioinfo/pastml
git clone https://github.com/mgalardini/pyseer
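After cloning, it can help to confirm that both repositories ended up inside the powerbacgwas folder. The following is a minimal sketch, not part of PowerBacGWAS: the function name `check_powerbacgwas_deps` is hypothetical, and it only assumes that the clones keep their default directory names (`pastml` and `pyseer`).

```shell
# Hypothetical helper (not shipped with PowerBacGWAS): checks that the
# PastML and PySeer clones exist inside the given folder.
# Assumes the default clone directory names "pastml" and "pyseer".
check_powerbacgwas_deps() {
    root="${1:-.}"   # folder to inspect; defaults to the current directory
    missing=0
    for dep in pastml pyseer; do
        if [ -d "$root/$dep" ]; then
            echo "found: $root/$dep"
        else
            echo "missing: $root/$dep"
            missing=1
        fi
    done
    # Non-zero exit status if at least one directory is absent
    return "$missing"
}
```

Run from within powerbacgwas/, `check_powerbacgwas_deps` prints one line per dependency and returns a non-zero status if either directory is absent. It does not check the Python modules or other software that the scripts require.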
Please read the PowerBacGWAS wiki page for full usage instructions and tutorials.
PowerBacGWAS is free software, licensed under the GNU General Public License v3.0.
Use the issues page to report installation and usage problems.
Not available yet