YeoLab / rbp-maps

splicing and feature maps for RBPs

RBP Maps

RBP splice and feature maps

This has been tested on (requirements):

Module Version
pandas 0.20.1
pybedtools 0.7.8
bedtools 2.26.0
pysam 0.8.4
samtools 1.3.1
pyBigWig 0.3.5
matplotlib 2.0.2
seaborn 0.8
jupyter 4.2.0 (if you want to import)
cwltool 1.0.20170828135420 (if you want to use as a CWL tool)
tqdm 4.19.5
numpy 1.12.1
scipy 0.19.1

Installation:

Create the environment:

git clone https://github.com/yeolab/rbp-maps
cd rbp-maps;
conda env create -f conda_env.txt -n rbp-maps
source activate rbp-maps

Then, install:

cd rbp-maps;
python setup.py build
python setup.py install

Docker:

docker pull brianyee/rbp-maps

Usage:

Plotting density (*.bw files from the eCLIP bioinformatics pipeline)

plot_map --ip ip.bam \ # BAM file containing reads of your CLIP (make sure the .pos.bw and .neg.bw files are in this directory)
 --ip_pos_bw \ # positive bigwig file for CLIP
 --ip_neg_bw \ # negative bigwig file for CLIP
 --input input.bam \ # BAM file containing reads for size matched input (make sure the .pos.bw and .neg.bw files are in this directory)
 --input_pos_bw \ # positive bigwig file for INPUT
 --input_neg_bw \ # negative bigwig file for INPUT
 --annotations rmats_annotation1.JunctionCountOnly.txt rmats_annotation2.JunctionCountOnly.txt rmats_annotation3.JunctionCountOnly.txt \ # annotation files
 --annotation_type rmats rmats rmats \ # specifies the type of file for each of the above annotations (either 'rmats' or 'miso' options are supported)
 --output rbfox2.svg \ # either an 'svg' or 'png' file works
 --event se \ # can be either: 'se' (skipped exons), 'a3ss' (alternative 3' splice site), or 'a5ss' (alternative 5' splice site)
 --normalization_level 1 \ # numeric "code" used to determine the kind of normalization to output (see below)
 --testnums 0 1 \
 --bgnum 2 \
 --sigtest permutation

Plotting peaks (*.compressed.bed files from the eCLIP bioinformatics pipeline)

plot_map --peak peak.bb \  # peaks file as a bigbed
 --annotations rmats_annotation1.JunctionCountOnly.txt rmats_annotation2.JunctionCountOnly.txt rmats_annotation3.JunctionCountOnly.txt \ # annotation files
 --annotation_type rmats rmats rmats \ # specifies the type of file for each of the above annotations (either 'rmats' or 'miso' options are supported)
 --output rbfox2.svg \ # either an 'svg' or 'png' file works
 --event se \ # can be either: 'se' (skipped exons), 'a3ss' (alternative 3' splice site), or 'a5ss' (alternative 5' splice site)
 --normalization_level 0 \ # numeric "code" used to determine the kind of normalization to output (see below)
 --testnums 0 1 \
 --bgnum 2 \
 --sigtest fisher
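
The inline "# ..." annotations in the commands above are for reading only: pasted into a shell verbatim, a comment after a trailing backslash breaks the line continuation. As a minimal sketch, the peak-plotting call can instead be assembled in Python as an argument list. The helper name build_peak_command and its defaults are hypothetical and not part of rbp-maps; only the flags shown in the example above are used.

```python
import shlex


def build_peak_command(peak, annotations, annotation_types, output, event,
                       normalization_level=0, testnums=(0, 1), bgnum=2,
                       sigtest="fisher"):
    """Assemble the plot_map peak call as an argv list (hypothetical helper)."""
    if len(annotations) != len(annotation_types):
        raise ValueError("need exactly one annotation type per annotation file")
    cmd = ["plot_map", "--peak", peak]
    cmd += ["--annotations", *annotations]
    cmd += ["--annotation_type", *annotation_types]
    cmd += ["--output", output, "--event", event]
    cmd += ["--normalization_level", str(normalization_level)]
    cmd += ["--testnums", *[str(n) for n in testnums]]
    cmd += ["--bgnum", str(bgnum), "--sigtest", sigtest]
    return cmd


# File names are the placeholders used in the example above.
cmd = build_peak_command(
    peak="peak.bb",
    annotations=[
        "rmats_annotation1.JunctionCountOnly.txt",
        "rmats_annotation2.JunctionCountOnly.txt",
        "rmats_annotation3.JunctionCountOnly.txt",
    ],
    annotation_types=["rmats", "rmats", "rmats"],
    output="rbfox2.svg",
    event="se",
)

# Print a single-line, shell-safe version of the command.
print(shlex.join(cmd))
```

To actually run the command, pass the list to subprocess.run(cmd, check=True) from inside the activated rbp-maps environment.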

Using a background & calculating significance.

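In the examples above, we set three optional parameters (--testnums, --bgnum, and --sigtest) that determine significance against an optional background dataset. In these examples, --testnums 0 1 and --bgnum 2 appear to index the three --annotations files (0-based), so the third file serves as the background. The --sigtest values shown are 'permutation' (density example) and 'fisher' (peak example).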
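
The 'fisher' value of --sigtest presumably refers to Fisher's exact test. As a rough illustration only (this is not the code plot_map runs, and the way it counts events is an assumption), a one-sided test at a single position could compare how many events in a test set overlap a peak against how many in the background set do:

```python
from math import comb


def fisher_one_sided(test_hit, test_miss, bg_hit, bg_miss):
    """One-sided Fisher's exact p-value for enrichment in the test set.

    Returns the probability of seeing at least `test_hit` peak-overlapping
    events in the test set if hits were distributed at random between the
    test and background sets (hypergeometric upper tail).
    """
    n_test = test_hit + test_miss          # events in the test set
    n_hit = test_hit + bg_hit              # events overlapping a peak, overall
    total = n_test + bg_hit + bg_miss      # all events
    denom = comb(total, n_test)
    upper = min(n_test, n_hit)             # largest possible hit count in the test set
    tail = sum(
        comb(n_hit, k) * comb(total - n_hit, n_test - k)
        for k in range(test_hit, upper + 1)
    )
    return tail / denom


# Made-up counts: 8 of 10 test events overlap a peak versus 1 of 6 background events.
p_value = fisher_one_sided(8, 2, 1, 5)
print(p_value)
```

A small p-value here would suggest that peaks are enriched at that position in the test events relative to the background.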

Links to files

You can refer to the 'examples/' directory for usage. These examples refer to BAM and BigWig files that can be downloaded from encodeproject.org

We also provide the script used to filter raw rMATS (hg19) outputs (based on inclusion junction count, as described in the paper). Here is an example command line for filtering SE events from a file "SE.MATS.JunctionCountOnly.txt":

subset_jxc -i SE.MATS.JunctionCountOnly.txt \
-o SE.MATS.JunctionCountOnly.nr.txt \
-e se

Other Options

--exon_offset: (untested) controls how many bases into an exon you would like to plot (default 50 bases)

--intron_offset: (untested) controls how many bases into an intron you would like to plot (default 300 bases)

--confidence: For each position, keep only this fraction of events to reduce noise caused by outliers (default 0.95)
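
To make the --confidence option more concrete, here is a minimal sketch of per-position outlier trimming. It assumes symmetric trimming (equal numbers of events dropped from the low and high tails at each position) followed by a mean; the actual procedure in rbp-maps may differ, and the function name trimmed_position_means is hypothetical.

```python
def trimmed_position_means(density, confidence=0.95):
    """Average each position after dropping outlier events from both tails.

    density: list of rows, one per event; each row holds one value per position.
    confidence: fraction of events to keep at each position.
    """
    if not density:
        return []
    if not 0 < confidence <= 1:
        raise ValueError("confidence must be in (0, 1]")
    n_events = len(density)
    n_keep = round(n_events * confidence)
    if n_keep < 1:
        raise ValueError("confidence too small for this number of events")
    # Drop the same number of events from each tail; if the number to drop
    # is odd, one extra event is kept.
    n_drop = (n_events - n_keep) // 2
    means = []
    for column in zip(*density):           # iterate over positions
        kept = sorted(column)[n_drop:n_events - n_drop]
        means.append(sum(kept) / len(kept))
    return means


# Ten events, two positions; each position has one outlier event.
example = [[1, 2]] * 8 + [[100, 2], [1, 0]]
print(trimmed_position_means(example, confidence=0.8))
```

With confidence=0.8 and ten events, one event is dropped from each tail at every position, so the single extreme value at each position no longer affects the mean.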

Example Outputs

Skipped Exon

[Figure: skippedexon]

Alternative 3' Splice Sites

[Figure: alt3prime]

Alternative 5' Splice Sites

[Figure: alt5prime]

Retained Intron

[Figure: retained]

Intermediate files produced

The program writes intermediate files wherever it can, so you can run further downstream analysis or plot your own maps.

Other Notes

Publication
