cumc / xqtl-protocol

Molecular QTL analysis protocol developed by ADSP Functional Genomics Consortium
https://cumc.github.io/xqtl-protocol/
MIT License

FunGen-xQTL Computational Protocol

Developed for reproducible & reusable molecular QTL analyses for the NIH/NIA Alzheimer's Disease Sequencing Project (ADSP) Functional Genomics xQTL (FunGen-xQTL) Project.

QTL Diagram

Overview of the protocol

Standardized reference data

Reference data are standardized and curated by the ADSP FGC Standardization Workgroup in coordination with NIAGCADS. Reference data specifications are available on the ADSP Dashboard.

Software environment

We have prepared containerized software environments through both the Docker and Singularity virtualization systems to facilitate software environment setup and to aid reproducibility. For those not familiar with this concept, please check out this wiki page on virtualization and the explanation on the Docker website.

Pipeline execution

Pipelines in this repository are written in the Script of Scripts (SoS) workflow language. Like most other workflow languages, SoS can distribute and execute computing jobs directly on a high-performance computing (HPC) cluster. It can also use containers (Docker or Singularity) to set up the computational environment and improve reproducibility. Unlike most other workflow languages, SoS workflows are created using SoS Notebooks (based on IPython Notebook and developed in Jupyter), which allow scientific narrative and pipeline scripts to live in the same document. Unlike typical Jupyter Notebooks intended for interactive data analysis, SoS workflows written in Jupyter Notebooks can be executed directly as command-line scripts, either on a local computer or in an HPC environment.

We provide this toy example for running an SoS pipeline in a typical HPC cluster environment. First-time users are encouraged to try it out to help set up the computational environment needed to run the analyses in this protocol.
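
Because SoS notebooks run as ordinary command-line scripts, the invocation can be assembled programmatically, for example when preparing job submissions. The following is a minimal sketch, not part of the protocol itself. The helper `build_sos_command`, the notebook name `example.ipynb`, and the workflow name `default` are hypothetical placeholders. The only SoS usage assumed is the general form `sos run <notebook> [workflow]`.

```python
import shlex
from pathlib import Path


def build_sos_command(notebook, workflow=None, extra_args=()):
    """Return the argument list for `sos run <notebook> [workflow] [args...]`.

    `notebook`, `workflow`, and `extra_args` are illustrative parameters;
    nothing is executed here, the command is only assembled.
    """
    notebook = Path(notebook)
    if notebook.suffix != ".ipynb":
        raise ValueError(f"expected a .ipynb notebook, got: {notebook}")
    cmd = ["sos", "run", notebook.as_posix()]
    if workflow:
        cmd.append(workflow)  # name of a workflow defined in the notebook
    cmd.extend(extra_args)    # any further options are passed through as-is
    return cmd


# Hypothetical notebook and workflow names, for illustration only.
cmd = build_sos_command("pipeline/example.ipynb", workflow="default")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where `sos` is installed, or joined into a line of a job script for the cluster scheduler.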

Source code

How to use the resource

Organization of the resource

The website https://cumc.github.io/xqtl-protocol is generated from files under the code folder of the source code repository. The pipeline folder contains symbolic links automatically generated for pipeline files under code. The logic of the entire xQTL analysis workflow is roughly reflected on the left sidebar:

Computing environment setup

sos run pipeline/<pipeline_file>.ipynb

that is, executing the symbolic links directly to perform the analysis.
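
The relationship between the `code` and `pipeline` folders described above can be sketched as follows. This is an illustration under stated assumptions, not the repository's actual link-generation script, which is not shown here. The function `link_pipelines` is hypothetical, and the sketch assumes that every `.ipynb` file under `code` is a pipeline and that links are named after the notebook file.

```python
import os
import tempfile
from pathlib import Path


def link_pipelines(code_dir, pipeline_dir):
    """Create one relative symlink in pipeline_dir per notebook under code_dir.

    Hypothetical helper: assumes each .ipynb file under code_dir is a pipeline.
    If two notebooks share a file name, the later one replaces the earlier link.
    """
    code_dir, pipeline_dir = Path(code_dir), Path(pipeline_dir)
    pipeline_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for nb in sorted(code_dir.rglob("*.ipynb")):
        if ".ipynb_checkpoints" in nb.parts:
            continue  # skip Jupyter autosave copies
        link = pipeline_dir / nb.name
        if link.is_symlink() or link.exists():
            link.unlink()  # refresh a stale link
        # A relative target keeps the link valid if the repository is moved.
        link.symlink_to(os.path.relpath(nb, start=pipeline_dir))
        created.append(link)
    return created


# Demonstration in a throwaway directory with a hypothetical notebook.
with tempfile.TemporaryDirectory() as root:
    notebook = Path(root, "code", "demo", "toy.ipynb")
    notebook.parent.mkdir(parents=True)
    notebook.write_text("{}")
    links = link_pipelines(Path(root, "code"), Path(root, "pipeline"))
    print([link.name for link in links])
```

With links laid out this way, `sos run pipeline/<pipeline_file>.ipynb` resolves to the notebook under `code`, so the rendered documentation and the executed pipeline come from the same file.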

See Also

Our team

This repository is developed by the Analysis Working Group of the NIA FunGen-xQTL consortium.

Developers

Lead developers

Main contributors (largely based on GitHub Pull Requests)

Name Affiliation
Xuanhe Chen Department of Biostatistics, Columbia University
Wenhao Gou Department of Biostatistics, Columbia University
Liucheng Shi Department of Biostatistics, Columbia University
Haochen Sun Department of Biostatistics, Columbia University
Zining Qi Department of Biostatistics, Columbia University
Ru Feng Department of Neurology, Columbia University
Alexandre Pelletier Department of Medicine, Boston University
Travyse Edwards Mount Sinai & University of Pennsylvania
Daniel Nachun Department of Pathology, Stanford University
Jiacheng Li Department of Neurology, Columbia University
Mintao Lin Department of Medicine, Boston University

Leadership

FunGen-AD

Name Affiliation
Philip De Jager Department of Neurology, Columbia University
Carlos Cruchaga Department of Psychiatry, Neurology and Genetics, Washington University in St. Louis

FunGen-xQTL Analysis Working Group

Name Affiliation
Gao Wang Department of Neurology, Columbia University
Xiaoling Zhang Departments of Medicine and Biostatistics, Boston University
Edoardo Marcora Departments of Neuroscience, Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai
Fanny Leung Department of Pathology and Laboratory Medicine, University of Pennsylvania
Julia TCW Department of Pharmacology and Bioinformatics, Boston University
Kushal K. Dey Memorial Sloan Kettering
Alan Renton Departments of Neuroscience, Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai
Stephen Montgomery Department of Pathology, Stanford University
Xiaoquan Wen Department of Biostatistics, University of Michigan