thelovelab / fishpond

Differential expression and allelic analysis, nonparametric statistics
https://thelovelab.github.io/fishpond
27 stars 9 forks source link
bioconductor gene-expression genomics rnaseq salmon scrnaseq statistics transcriptomics

fishpond

R build status

Fishpond: downstream methods and tools for expression data

Fishpond contains a method, swish(), for differential transcript and gene expression analysis of RNA-seq data using inferential replicates. Also the package contains utilities for working with Salmon, alevin, and alevin-fry quantification data, including loadFry().

Quick start

The following paradigm is used for running a Swish analysis:

y <- tximeta(coldata) # reads in counts and inf reps
y <- scaleInfReps(y) # scales counts
y <- labelKeep(y) # labels features to keep
set.seed(1) # for reproducibility
y <- swish(y, x="condition") # simplest Swish case

How does Swish work

Swish accounts for inferential uncertainty in expression estimates by averaging test statistics over a number of inferential replicate datasets, either posterior samples or bootstrap samples. This is inspired by a method called SAMseq, hence we named our method Swish, for "SAMseq With Inferential Samples Helps". Averaging over inferential replicates produces a different test statistic than what one would obtain using only point estimates for expression level.

For example, one of the tests possible with swish() is a correlation test of expression level over a condition variable. We can visualize the distribution of inferential replicates with plotInfReps():

The test statistic is formed by averaging over these sets of data:

p-values and q-values are computed through permutation of samples (see vignette for details on permutation schemes).

The Swish method is described in the following publication:

Zhu, A., Srivastava, A., Ibrahim, J.G., Patro, R., Love, M.I. "Nonparametric expression analysis using inferential replicate counts" Nucleic Acids Research (2019) 47(18):e105 PMC6765120

The SEESAW method for allelic expression analysis is described in the following preprint:

Euphy Wu, Noor P. Singh, Kwangbom Choi, Mohsen Zakeri, Matthew Vincent, Gary A. Churchill, Cheryl L. Ackert-Bicknell, Rob Patro, Michael I. Love. "Detecting isoform-level allelic imbalance accounting for inferential uncertainty" bioRxiv (2022) doi: 10.1101/2022.08.12.503785

Installation

This package can be installed via Bioconductor:

BiocManager::install("fishpond")

Funding

This work was funded by NIH NHGRI R01-HG009937.