Developed for reproducible & reusable molecular QTL analyses for the NIH/NIA Alzheimer's Disease Sequencing Project (ADSP) Functional Genomics xQTL (FunGen-xQTL) Project.
Reference data are standardized and curated by the ADSP FGC Standardization Workgroup in coordination with NIAGCADS. Reference data specifications are available on the ADSP Dashboard.
We have prepared containerized software environments through both the Docker and Singularity virtualization systems to facilitate software environment setup and to aid software reproducibility. For those not familiar with this concept, please check out this wiki page on virtualization and the explanation on the Docker website.
Pipelines in this repository are written in the Script of Scripts (SoS) workflow language. Like most other workflow languages, SoS workflows can distribute and execute computing jobs directly on High Performance Computing (HPC) clusters. They can also use containers (Docker or Singularity) to help set up the computational environment and improve reproducibility. Unlike most other workflow languages, SoS workflows are created using SoS Notebooks (based on IPython Notebook and developed in Jupyter), which allow scientific narrative and pipeline scripts to live in the same document. Unlike typical Jupyter Notebooks intended for interactive data analysis, SoS workflows written in Jupyter Notebooks can be executed directly as command-line scripts, either on a local computer or in an HPC environment.
We provide this toy example for running SoS pipelines in a typical HPC cluster environment. First-time users are encouraged to try it out, as it helps with setting up the computational environment necessary to run the analyses in this protocol.
The website https://cumc.github.io/xqtl-protocol is generated from files under the `code` folder of the source code repository. The `pipeline` folder contains symbolic links automatically generated for pipeline files under `code`.
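To see which source file each pipeline link resolves to, the links can be listed from the repository root. This is a minimal sketch rather than part of the repository: the function name `list_pipeline_links` is hypothetical, and it assumes only that the links are `.ipynb` symbolic links sitting directly inside the `pipeline` folder.

```shell
# Hypothetical helper: print each symbolic link in a folder (default:
# pipeline) together with the path it points to.
list_pipeline_links() {
    local dir="${1:-pipeline}" f
    for f in "$dir"/*.ipynb; do
        # Skip regular files, and the unexpanded glob when nothing matches.
        [ -L "$f" ] || continue
        printf '%s -> %s\n' "$f" "$(readlink "$f")"
    done
}
```

Calling `list_pipeline_links` with no argument inspects `pipeline` relative to the current directory, so it is meant to be run from the repository root.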
The logic of the entire xQTL analysis workflow is roughly reflected on the left sidebar:
`micromamba`).
- Install `Singularity` or `Docker`. In the xQTL project we use `Singularity`. Here are some tips to set up Singularity on macOS.
- HPC users should already have `Singularity` installed. If not, please contact the IT support for the HPC. Typically Docker is not allowed on HPC.
- Windows users can install `Singularity` within WSL as instructed in this post.
- `Singularity` container images are available in this Synapse folder. For guidance on downloading the data programmatically, refer to this documentation. If you need to set up a Synapse client, consult this guide.
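The programmatic download mentioned above might look like the following when using the `synapse` command-line client that ships with the Python `synapseclient` package. This is a sketch under assumptions: the function name `fetch_synapse_folder` and the `DRY_RUN` variable are hypothetical, the Synapse ID must be taken from the linked Synapse folder (it is not stated here), and credentials are assumed to be configured already, for example through `synapse login`.

```shell
# Hypothetical helper: recursively download a Synapse folder.
# Usage: fetch_synapse_folder <synapse_id> [destination_dir]
# Set DRY_RUN=1 to print the command instead of running it.
fetch_synapse_folder() {
    local syn_id="$1" dest="${2:-.}"
    case "$syn_id" in
        syn[0-9]*) ;;  # loose check: IDs start with "syn" followed by a digit
        *) echo "error: expected a Synapse ID (syn followed by digits)" >&2
           return 1 ;;
    esac
    # Reuse the positional parameters to hold the command and its arguments.
    set -- synapse get -r "$syn_id" --downloadLocation "$dest"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Running it first with `DRY_RUN=1` shows the exact command before any data is transferred, which is useful because the folders involved can be large.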
- In the `test_data` folder, datasets prefixed with MWE (Minimal Working Example) are provided. These are used for unit testing each module, ensuring the integrity of the code.
- The `protocol_data` folder houses a comprehensive set of data, illustrating the full extent of our protocol. This is showcased in this notebook, with the source code available for reference.
- The `container/singularity` folder contains the released Singularity images for the software environment. For Docker users (e.g., on Linux or Mac desktops), downloading this folder is not necessary.

Pipeline files are found in the `pipeline` folder. Users are encouraged to run them from the root of the repository by typing `sos run pipeline/<pipeline_file>.ipynb`, that is, by executing the symbolic links directly to perform the analysis.
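The `sos run pipeline/<pipeline_file>.ipynb` invocation above can be wrapped in a small helper that confirms the notebook is reachable before launching it. This is a minimal sketch: the function name `run_pipeline` and the `DRY_RUN` variable are hypothetical, `<pipeline_file>` remains a placeholder for a real notebook name, and any extra arguments are passed to `sos run` unchanged.

```shell
# Hypothetical helper: run a pipeline notebook from the repository root.
# Usage: run_pipeline <pipeline_file> [arguments passed on to sos run]
# Set DRY_RUN=1 to print the command instead of running it.
run_pipeline() {
    local name="$1"
    shift
    local notebook="pipeline/${name}.ipynb"
    # -e follows symbolic links, so a dangling link is reported as missing.
    if [ ! -e "$notebook" ]; then
        echo "error: $notebook not found; run from the repository root" >&2
        return 1
    fi
    # Reuse the positional parameters to hold the command and its arguments.
    set -- sos run "$notebook" "$@"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

With `DRY_RUN=1`, the helper only prints the command it would run, which makes it easy to check the path before submitting a long analysis.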
This repository is developed by the Analysis Working Group of the NIA FunGen-xQTL consortium.
Lead developers
Main contributors (largely based on GitHub Pull Requests)
Name | Affiliation |
---|---|
Xuanhe Chen | Department of Biostatistics, Columbia University |
Wenhao Gou | Department of Biostatistics, Columbia University |
Liucheng Shi | Department of Biostatistics, Columbia University |
Haochen Sun | Department of Biostatistics, Columbia University |
Zining Qi | Department of Biostatistics, Columbia University |
Ru Feng | Department of Neurology, Columbia University |
Alexandre Pelletier | Department of Medicine, Boston University |
Travyse Edwards | Mount Sinai & University of Pennsylvania |
Daniel Nachun | Department of Pathology, Stanford University |
Jiacheng Li | Department of Neurology, Columbia University |
Mintao Lin | Department of Medicine, Boston University |
FunGen-AD
Name | Affiliation |
---|---|
Philip De Jager | Department of Neurology, Columbia University |
Carlos Cruchaga | Department of Psychiatry, Neurology and Genetics, Washington University in St. Louis |
FunGen-xQTL Analysis Working Group
Name | Affiliation |
---|---|
Gao Wang | Department of Neurology, Columbia University |
Xiaoling Zhang | Departments of Medicine and Biostatistics, Boston University |
Edoardo Marcora | Departments of Neuroscience, Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai |
Fanny Leung | Department of Pathology and Laboratory Medicine, University of Pennsylvania |
Julia TCW | Department of Pharmacology and Bioinformatics, Boston University |
Kushal K. Dey | Memorial Sloan Kettering |
Alan Renton | Departments of Neuroscience, Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai |
Stephen Montgomery | Department of Pathology, Stanford University |
Xiaoquan Wen | Department of Biostatistics, University of Michigan |