Faster_lmm_d

A faster linear mixed model (LMM) solver for GWAS, with multi-core and GPU support.

NOTICE: this software is under active development. YMMV.

Introduction

Faster_lmm_d is a lightweight linear mixed-model (LMM) solver for use in genome-wide association studies (GWAS). The original is similar to FaST-LMM (an algorithm by Lippert et al.) and that code base can be found here. Prof. Karl Broman wrote a comparison with his R/lmmlite. Faster_lmm_d and pylmm are part of the Genenetwork2 project. Faster_lmm_d accepts input data in R/qtl2 format.
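
For orientation, the standard LMM formulation that FaST-LMM-style solvers work with can be written as below. The symbols are assumptions introduced here, not names taken from this code base: y is the phenotype vector for n individuals, X holds the covariates and the tested marker, K is the kinship matrix, and δ is the ratio of residual to genetic variance. Faster_lmm_d may parameterize the model differently.

```latex
\begin{align}
y &= X\beta + u + \varepsilon, \qquad
  u \sim \mathcal{N}(0, \sigma_g^2 K), \qquad
  \varepsilon \sim \mathcal{N}(0, \sigma_e^2 I_n) \\
K &= U S U^{\top} \\
U^{\top} y &\sim \mathcal{N}\!\left(U^{\top} X \beta,\; \sigma_g^2 (S + \delta I_n)\right),
  \qquad \delta = \sigma_e^2 / \sigma_g^2
\end{align}
```

Because S is diagonal, the rotated observations are independent, so for a given δ the likelihood can be evaluated without refactorizing an n-by-n matrix for each marker.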

GPU Support

Faster_lmm_d has two GPU backends:

A CUDA backend, which calls the cuBLAS libraries directly and runs only on Nvidia hardware. For this backend, Faster_lmm_d uses cuda_d (the D bindings I wrote for the CUDA libraries).

An ArrayFire backend, which runs on all major GPU vendors (Nvidia, Intel, AMD) by calling the CUDA, cuBLAS, OpenCL and clBLAS libraries through the ArrayFire library. For this backend, Faster_lmm_d uses arrayfire-d (the D bindings I wrote for the ArrayFire library).

Install

Requirements

faster_lmm_d is written in the fast D language and requires a D compiler. At the moment we also use openblas (>0.2.19), lapacke, gsl and a bunch of D libraries that are installed with the dub tool.

On Debian/Ubuntu

Install the required libraries

sudo apt-get install libopenblas liblapacke libgsl2 gfortran

Install LDC

sudo apt-get install ldc2

On GNU Guix

guix package -i ldc dub openblas gsl lapack ld-wrapper gcc glibc
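
Before building, it can help to confirm that the required tools are actually on PATH. The following is a minimal sketch; missing_tools is a hypothetical helper and not part of this repository.

```shell
#!/bin/sh
# Print the name of every listed tool that cannot be found on PATH.
# missing_tools is a hypothetical helper, not part of faster_lmm_d.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the tool.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, missing_tools ldc2 dub make git prints nothing when all four are installed, and otherwise prints one missing tool per line.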

Get the source

Clone the repository

git clone https://github.com/prasunanand/faster_lmm_d
cd faster_lmm_d

Fetch dependencies using the dub tool (on a non-CUDA system you may get errors, which can be ignored). The dependency versions are currently pinned; see the Makefile.

dub

and compile

CPU Backend:

make

CUDA Backend:

make CUDA=1

ARRAYFIRE Backend:

make ARRAYFIRE=1

or in the case of GNU Guix (because dub does not honour the LIBRARY_PATH):

export LIBRARY_PATH=~/.guix-profile/lib
env LD_LIBRARY_PATH=$LIBRARY_PATH dub --compiler=ldc2

Usage example

./faster_lmm_d --control=data/genenetwork/BXD.json --pheno=data/genenetwork/104617_at.json --geno=data/genenetwork/BXD.csv --cmd=rqtl
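
A small wrapper can catch a mistyped input path before the solver starts. This is a sketch under assumptions: run_rqtl and the FASTER_LMM_D override variable are hypothetical names, and only the flags shown in the example above are used.

```shell
#!/bin/sh
# run_rqtl CONTROL PHENO GENO
# Hypothetical wrapper: checks that the three input files exist, then runs
# the binary with the same flags as the usage example.
# FASTER_LMM_D is an assumed override for the binary path.
run_rqtl() {
    if [ "$#" -ne 3 ]; then
        echo "usage: run_rqtl CONTROL PHENO GENO" >&2
        return 2
    fi
    for f in "$@"; do
        if [ ! -f "$f" ]; then
            echo "missing input file: $f" >&2
            return 1
        fi
    done
    "${FASTER_LMM_D:-./faster_lmm_d}" --control="$1" --pheno="$2" --geno="$3" --cmd=rqtl
}
```

Calling run_rqtl data/genenetwork/BXD.json data/genenetwork/104617_at.json data/genenetwork/BXD.csv then behaves like the command above, but returns early with a message if one of the files is missing.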

Testing

To run tests

time ./run_tests.sh

If you get an error (on GNU Guix)

./build/faster_lmm_d: error while loading shared libraries: libgsl.so.19: cannot open shared object file: No such file or directory

try

time env LD_LIBRARY_PATH=$LIBRARY_PATH ./run_tests.sh
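
To see which shared libraries the loader cannot resolve, the output of ldd can be filtered. This is a sketch; filter_unresolved and unresolved_libs are hypothetical helper names, and it assumes a Linux system where ldd is available.

```shell
#!/bin/sh
# Read ldd output on stdin and print the name of each library that is
# reported as "not found". Hypothetical helper, not part of faster_lmm_d.
filter_unresolved() {
    awk '/=> not found/ { print $1 }'
}

# Print the unresolved shared libraries of the binary given as $1.
unresolved_libs() {
    ldd "$1" 2>/dev/null | filter_unresolved
}
```

Running unresolved_libs ./build/faster_lmm_d with and without LD_LIBRARY_PATH set shows whether the environment variable resolves the missing library.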

Performance Profiling

Install google-perftools and graphviz

sudo apt-get install google-perftools libgoogle-perftools-dev graphviz

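Install Go and then install pprof. With Go 1.18 or later, go get no longer installs binaries; use go install github.com/google/pprof@latest instead of the command below.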

go get github.com/google/pprof

To profile, uncomment the import gperftools_d.profiler; line and the ProfilerStart() and ProfilerStop() calls in the main function in source/faster_lmm_d/app.d, then run

make run-gperf

Useful links:

Poster on Faster Linear Mixed Models (LMM) for online GWAS omics analysis at the Complex Trait Community Conference 2017, Memphis, Tennessee, USA.

LICENSE

This software is distributed under the GNU General Public License v3.0 (GPLv3).

Copyright © 2016 - 2018, Prasun Anand and Pjotr Prins