gagneurlab / FRASER-analysis

Accompanying analysis code for the FRASER manuscript
MIT License


This is the accompanying analysis repository of the paper:

Detection of aberrant splicing events in RNA-seq data with FRASER.

The paper can be found on bioRxiv.

This repository contains the full pipeline and code to reproduce the results published in the paper using snakemake and wBuild.

Project structure

This project is set up as a wBuild workflow. wBuild is an automatic build tool for R reports based on snakemake.

Data and prerequisites

This project depends on the Python package wBuild and the R package FRASER. In addition, we use the Leafcutter adaptation from the Kremer et al. paper, which can be found here.

The pipeline starts from the raw aligned GTEx V7P samples and their genotype calls, which can be downloaded from dbGaP. Since the data are not publicly shareable, access has to be requested through dbGaP.

Repository setup

First download the repo and its dependencies:

# R package used throughout the workflow
git clone
git clone

# download needed SRA annotation db
wget -O - '' | gunzip -c > 'Data/filemapping/SRAmetadb.sqlite'

# analysis code
git clone
cd FRASER-analysis

Then install wBuild using pip and initialize it:

pip install wBuild
wBuild init

Since wBuild init resets the current Snakefile,, and wbuild.yaml, we have to restore them with git:

git checkout Snakefile
git checkout wbuild.yaml
git checkout
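To confirm that the restore worked, one option is to ask git whether the files still differ from their committed versions. The following is a minimal sketch and not part of the repository; the helper name check_restored is hypothetical. It relies on git status --porcelain, which prints one line per changed path and nothing for a clean one.

```shell
# Hypothetical helper: succeed only if none of the given paths differ
# from the committed version (empty porcelain output means clean).
check_restored() {
    changed=$(git status --porcelain -- "$@")
    if [ -n "$changed" ]; then
        printf 'still differs from the committed version:\n%s\n' "$changed" >&2
        return 1
    fi
}

# Example (uncomment to run inside the FRASER-analysis checkout):
# check_restored Snakefile wbuild.yaml
```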

To make sure all packages needed for the analysis are installed, run the following script with Rscript:

Rscript ./src/r/install_dependencies.R
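Before starting the pipeline, it can help to check that the command-line tools used in this README are available on the PATH. The following is a minimal sketch under that assumption; the helper name missing_tools is hypothetical and not part of the repository.

```shell
# Hypothetical helper: print each named command that cannot be found
# on PATH, one per line. No output means everything was found.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example (uncomment to run): the tools invoked in this README.
# missing_tools git wget gunzip pip Rscript snakemake
```

An empty result only shows that the executables exist; it does not check versions or installed R packages.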

Run the full pipeline

To run the full pipeline, first define the datasets and then build the final supplement. The second command below runs the analysis with 10 jobs and at most 40 cores in parallel:

# init datasets to be used
snakemake -j 25 --cores 25 defineDatasets

# run full analysis on datasets
snakemake -j 10 --cores 40 Output/paper_figures/supplement_final.pdf

Alternatively, to run it on a cluster with SLURM installed:

snakemake -k --restart-times 2 --cluster "sbatch -N 1 -n 10 --mem 80G" --jobs 20
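Because the full analysis is long-running, it may be worth previewing the scheduled jobs first. Snakemake's -n (--dry-run) flag lists what would be executed without running anything. The following is a minimal sketch, not part of the repository; the helper name dry_then_run is hypothetical.

```shell
# Hypothetical helper: preview the jobs with a dry run, and start the
# real run only if the dry run succeeds. All arguments are passed
# through to snakemake unchanged.
dry_then_run() {
    snakemake -n "$@" && snakemake "$@"
}

# Example (uncomment to run), using the local invocation from above:
# dry_then_run -j 10 --cores 40 Output/paper_figures/supplement_final.pdf
```

The same wrapper can be used with the cluster options, since the dry run only resolves the job graph and submits nothing.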