Vivianstats / MAAPER

Model-based analysis of APA using 3' end-linked reads
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-021-02429-5
alternative-polyadenylation bioinformatics-tool rna-seq

MAAPER: Model-based analysis of alternative polyadenylation using 3’ end-linked reads

Wei Vivian Li, Bin Tian 2021-08-13

Latest News

2021/08/13:

2021/06/15:

Introduction

MAAPER is a computational method for model-based analysis of alternative polyadenylation using 3’ end-linked reads. It uses a probabilistic model to predict polyadenylation sites (PASs) for nearSite reads with high accuracy and sensitivity, and it examines different types of alternative polyadenylation (APA) events, including those in 3’UTRs and introns, using carefully designed statistics.

Any suggestions on the package are welcome! For technical problems, please report to Issues. For suggestions and comments on the method, please contact Vivian (vivian.li@rutgers.edu).

Installation

You can install MAAPER from CRAN with:

install.packages("MAAPER")
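If the installation step is part of a script that runs repeatedly, it can help to install only when the package is missing. The following is a minimal sketch; `ensure_installed` is a hypothetical helper written for this example and is not part of MAAPER.

```r
# Hypothetical helper (not part of MAAPER): install a package only if it is
# not already available. `installer` defaults to install.packages and can be
# swapped out, e.g. for testing.
ensure_installed <- function(pkg, installer = utils::install.packages) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    installer(pkg)
    return(invisible(FALSE))  # package was not installed before this call
  }
  invisible(TRUE)             # package was already installed
}

# Usage (commented out so that sourcing this file never triggers a download):
# ensure_installed("MAAPER")
```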

Quick start

maaper requires three input files:

The final output of maaper is two text files named “gene.txt” and “pas.txt”, which contain the predicted PASs and APA results.
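Once a run has finished, the two output files can be loaded back into R for inspection. The sketch below assumes that both files are tab-delimited with a header row; this is an assumption, so please check the package manual for the actual layout. `read_maaper_output` is a hypothetical helper, not a function exported by MAAPER.

```r
# Hypothetical helper (not part of MAAPER): load "gene.txt" and "pas.txt"
# from an output directory. Assumes tab-delimited text with a header row;
# switch to read.table() with other settings if the files differ.
read_maaper_output <- function(output_dir = ".") {
  files <- file.path(output_dir, c("gene.txt", "pas.txt"))
  missing <- files[!file.exists(files)]
  if (length(missing) > 0) {
    stop("Output file(s) not found: ", paste(missing, collapse = ", "))
  }
  list(
    gene = read.delim(files[1], stringsAsFactors = FALSE),
    pas  = read.delim(files[2], stringsAsFactors = FALSE)
  )
}

# Usage, after maaper() has written its results to "./":
# res <- read_maaper_output("./")
# str(res$gene)
# str(res$pas)
```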

Below is a basic example which shows how to use the maaper function. The bam and gtf files used in this example can be downloaded here. To save computation time, we provide a toy example dataset restricted to chr19. In real data applications, we do not recommend dividing the files into subsets by chromosome.

library(MAAPER)

pas_annotation = readRDS("./mouse.PAS.mm9.rds")
gtf = "./gencode.mm9.chr19.gtf"
# bam file of condition 1 (could be a vector if there are multiple samples)
bam_c1 = "./NT_chr19_example.bam"
# bam file of condition 2 (could be a vector if there are multiple samples)
bam_c2 = "./AS_4h_chr19_example.bam"

maaper(gtf, # full path of the GTF file
       pas_annotation, # PAS annotation
       output_dir = "./", # output directory
       bam_c1, bam_c2, # full path of the BAM files
       read_len = 76, # read length
       ncores = 12  # number of cores used for parallel computation 
      )
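As the comments above note, each condition can be given as a vector of BAM files when there are multiple samples. The sketch below shows that form together with a simple check that all input files exist before a long run is started. The replicate file names and the `check_input_files` helper are hypothetical and only serve as an illustration.

```r
# Hypothetical helper (not part of MAAPER): stop early if any input file
# is missing, instead of failing partway through a run.
check_input_files <- function(gtf, bam_c1, bam_c2) {
  paths <- c(gtf, bam_c1, bam_c2)
  missing <- paths[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("Missing input file(s): ", paste(missing, collapse = ", "))
  }
  invisible(TRUE)
}

# Hypothetical replicate file names, one character vector per condition.
bam_c1 <- c("./NT_rep1.bam", "./NT_rep2.bam")
bam_c2 <- c("./AS_4h_rep1.bam", "./AS_4h_rep2.bam")

# Usage (commented out because the files above are placeholders):
# check_input_files(gtf, bam_c1, bam_c2)
# maaper(gtf, pas_annotation, output_dir = "./",
#        bam_c1, bam_c2, read_len = 76, ncores = 12)
```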

Please note the following options in the maaper function:

Please refer to the package manual for a full list of arguments and detailed usage.

Citation

Li, W. V., Zheng, D., Wang, R., & Tian, B. (2021). MAAPER: model-based analysis of alternative polyadenylation using 3’ end-linked reads. Genome Biology, in press. https://genomebiology.biomedcentral.com/articles/10.1186/s13059-021-02429-5