mkirchler / transferGWAS

Repository for transferGWAS, a deep learning method for performing genome-wide association studies on full medical imaging data.
MIT License

transferGWAS

transferGWAS is a method for performing genome-wide association studies on whole images. This repository provides code to run your own transferGWAS on UK Biobank or your own data. transferGWAS has three steps: (1) pretraining, (2) feature condensation, and (3) LMM association analysis. The steps are kept separate because they require different compute infrastructure (GPU vs. CPU server) and differ considerably in runtime (e.g. pretraining can take a few days on a GPU).

Getting started

This repository requires bash and was written and tested on Ubuntu 18.04.4 LTS.

Start by cloning this repo:

git clone https://github.com/mkirchler/transferGWAS.git

You can download pretrained models and BOLT-LMM via

./download_models.sh

This includes the CNN pretrained on the EyePACS dataset to predict diabetic retinopathy and the StyleGAN2 trained on retinal fundus images for the simulation study (the ImageNet-pretrained network is included in the PyTorch library), as well as BOLT-LMM version 2.3.4.

Python

All parts require Python 3.6+, and all deep learning parts are built in PyTorch. We recommend using an up-to-date version of Anaconda and then creating a new environment from environment.yml:

conda env create --file environment.yml
conda activate transfer_gwas
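
After activating the environment, you may want to confirm that the interpreter meets the Python 3.6+ requirement. The following is a minimal sketch and not part of the repository: the function name `python_version_ok` is hypothetical, and it assumes the interpreter is reachable as `python3` and that the version string has the usual `major.minor[.patch]` form.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): succeed if a
# "major.minor[.patch]" version string is at least 3.6.
python_version_ok() {
    local major rest minor
    major="${1%%.*}"    # text before the first dot
    rest="${1#*.}"      # text after the first dot
    minor="${rest%%.*}" # second component of the version
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 6 ]; }
}

# Assumption: the active interpreter is available as `python3`.
if command -v python3 >/dev/null 2>&1; then
    found="$(python3 -c 'import platform; print(platform.python_version())')"
    if python_version_ok "$found"; then
        echo "Python $found satisfies the 3.6+ requirement"
    else
        echo "Python $found is too old; 3.6+ is required" >&2
    fi
fi
```

The comparison is done on the numeric components rather than on the raw string, so that a version such as 3.10 is correctly treated as newer than 3.6.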

If you want to run the non-deep-learning parts of the code (especially BOLT-LMM) on a CPU-only machine, use the environment_cpu.yml file instead:

conda env create --file environment_cpu.yml
conda activate transfer_gwas_cpu

Note that this won't install any of the PyTorch libraries, so you can only use it for run_bolt and for stages 1 and 4 of the simulation.
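
If you set up several machines, the choice between the two environment files can be scripted. The following is a minimal sketch and not part of the repository: the function name `pick_env_file` is hypothetical, and it assumes that finding `nvidia-smi` on the `PATH` is a good enough indicator of a GPU machine.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): choose which conda
# environment file to use on this machine.
# Assumption: a GPU machine has the `nvidia-smi` tool on its PATH.
pick_env_file() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        echo "environment.yml"      # full environment for the deep learning parts
    else
        echo "environment_cpu.yml"  # CPU-only environment
    fi
}

# Print the command instead of running it, so the choice can be reviewed first.
echo "conda env create --file $(pick_env_file)"
```

Printing the command rather than executing it keeps the sketch free of side effects; once the output looks right, you can run the printed line yourself.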

Installation of requirements should not take longer than a few minutes (depending on internet connection).

Reproducing paper results

To reproduce results from our paper, see the reproducibility directory.

Running a transferGWAS

If you don't want to train your own network, just:

If you do want to train your own network, check out the pretraining part first.