MIT License

SUPERGNOVA

SUPERGNOVA (SUPER GeNetic cOVariance Analyzer) is a statistical framework for local genetic covariance analysis. It requires only GWAS summary data and a reference panel as input. The preprint is available on bioRxiv.

Requirements

The software is developed and tested in Linux and macOS environments and supports multi-threaded computing. The following software and packages are required:

  1. Python 3
  2. numpy
  3. scipy
  4. pandas
  5. sklearn
  6. bitarray

To use multi-threaded computing, request multiple cores from your server.
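
Before running the tool, it can help to confirm that the listed Python packages are importable. The following is a minimal sketch and not part of SUPERGNOVA: the helper name `missing_packages` is hypothetical, and it assumes each package is imported under the name given in the list above.

```python
import importlib.util

# Import names taken from the requirements list above.
REQUIRED = ["numpy", "scipy", "pandas", "sklearn", "bitarray"]


def missing_packages(names=REQUIRED):
    """Return the names in `names` that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks up each top-level package without importing it, so it does not verify versions.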

Tutorial

You can download SUPERGNOVA by:

$ git clone https://github.com/qlu-lab/SUPERGNOVA
$ cd ./SUPERGNOVA

Suppose you would like to calculate local genetic covariance between autism spectrum disorder and cognitive performance. We'll need a few types of files:

$ mkdir ./data
$ mkdir ./data/sumstats
$ wget ftp://ftp.biostat.wisc.edu/pub/lu_group/Projects/SUPERGNOVA/sumstats/*.txt.sumstats.gz -P ./data/sumstats/

More details about these supplied files can be found here.

You may run the following command:

python3 supergnova.py ./data/sumstats/ASD.txt.sumstats.gz ./data/sumstats/CP.txt.sumstats.gz \
--N1 46351 \
--N2 257828 \
--bfile data/bfiles/eur_chr@_SNPmaf5 \
--partition data/partition/eur_chr@.bed \
--out results.txt
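
The `--bfile` and `--partition` values in this command contain an `@` character. The sketch below assumes, as in the LDSC convention, that `@` stands for the chromosome number (1 to 22) and that each `--bfile` prefix refers to a PLINK `.bed`/`.bim`/`.fam` trio; neither assumption is confirmed by this README. The helper names `expand_chromosomes` and `missing_inputs` are hypothetical and only illustrate how one might check that the expected input files exist before starting a run.

```python
from pathlib import Path


def expand_chromosomes(pattern, chromosomes=range(1, 23)):
    """Replace '@' in `pattern` with each chromosome number (assumed convention)."""
    return [pattern.replace("@", str(c)) for c in chromosomes]


def missing_inputs(bfile_pattern, partition_pattern, chromosomes=range(1, 23)):
    """List the expected reference-panel and partition files that do not exist."""
    expected = []
    for prefix in expand_chromosomes(bfile_pattern, chromosomes):
        # A PLINK binary fileset consists of .bed, .bim and .fam files.
        expected.extend(prefix + ext for ext in (".bed", ".bim", ".fam"))
    expected.extend(expand_chromosomes(partition_pattern, chromosomes))
    return [path for path in expected if not Path(path).exists()]


if __name__ == "__main__":
    for path in missing_inputs("data/bfiles/eur_chr@_SNPmaf5",
                               "data/partition/eur_chr@.bed"):
        print("Missing:", path)
```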

Explanation of Command-Line Arguments

Additional Command-Line Arguments

Explanation of Output

The output will be a whitespace-delimited text file, with the rows corresponding to different annotations and the columns as such:

NOTE: The true heritability of some genomic regions for some traits may be very small. Although methods for estimating local heritability exist, they may produce unstable and, in many cases, negative heritability estimates. SUPERGNOVA ignores negative heritability estimates, leaving the corresponding correlation estimates as 'NA'. We therefore recommend that users focus on genetic covariance rather than genetic correlation when performing local genetic covariance analysis.
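
To see why a negative heritability estimate leaves the correlation undefined, recall the standard definition of genetic correlation: the genetic covariance divided by the square root of the product of the two heritabilities. The sketch below applies that definition to a single region. It is an illustration, not SUPERGNOVA's implementation; the function name `local_genetic_correlation` is hypothetical, the input values are made up, and treating a zero estimate like a negative one is an assumption made here to avoid division by zero.

```python
import math


def local_genetic_correlation(covariance, h2_trait1, h2_trait2):
    """Correlation = covariance / sqrt(h2_1 * h2_2); NaN when undefined."""
    # Assumption: non-positive heritability estimates make the ratio
    # undefined, so return NaN (the analogue of 'NA' in the output).
    if h2_trait1 <= 0 or h2_trait2 <= 0:
        return math.nan
    return covariance / math.sqrt(h2_trait1 * h2_trait2)


if __name__ == "__main__":
    # Illustrative values only, not real estimates.
    print(local_genetic_correlation(0.001, 0.01, 0.04))
    print(local_genetic_correlation(0.001, -0.002, 0.04))
```

The covariance estimate itself does not depend on the heritability estimates, which is why it remains usable in regions where the correlation is reported as 'NA'.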

Credits

Those using the SUPERGNOVA software should cite: Zhang, Y.L. et al. SUPERGNOVA: local genetic correlation analysis reveals heterogeneous etiologic sharing of complex traits. 2020.

The LD score calculation and the estimation of phenotypic covariance are adapted from ldsc.py in ldsc and ldsc_thin.py in GNOVA. See Bulik-Sullivan, B. et al. An Atlas of Genetic Correlations across Human Diseases and Traits. Nature Genetics, 2015; and Lu, Q.S. et al. A powerful approach to estimating annotation-stratified genetic covariance using GWAS summary statistics. The American Journal of Human Genetics, 2017.

Cite the code: DOI