COMBINE-lab / salmon

🐟 🍣 🍱 Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using selective alignment
https://combine-lab.github.io/salmon
GNU General Public License v3.0
776 stars 164 forks source link
10x bioinformatics c-plus-plus gene-expression quantification quasi-mapping rna-seq rna-seq-quantification rnaseq sailfish salmon scrna-seq selective-alignment single-cell single-cell-rna-seq transcriptome
salmon logo

Documentation Status install with bioconda GitHub tag (latest SemVer)

Try out the new alevin-fry framework for single-cell analysis; tutorials can be found here!

Help guide the development of Salmon, take our survey

What is Salmon?

Salmon is a wicked-fast program to produce a highly-accurate, transcript-level quantification estimates from RNA-seq data. Salmon achieves its accuracy and speed via a number of different innovations, including the use of selective-alignment (accurate but fast-to-compute proxies for traditional read alignments), and massively-parallel stochastic collapsed variational inference. The result is a versatile tool that fits nicely into many different pipelines. For example, you can choose to make use of our selective-alignment algorithm by providing Salmon with raw sequencing reads, or, if it is more convenient, you can provide Salmon with regular alignments (e.g. an unsorted BAM file with alignments to the transcriptome produced with your favorite aligner), and it will use the same wicked-fast, state-of-the-art inference algorithm to estimate transcript-level abundances for your experiment.

Give salmon a try! You can find the latest binary releases here.

The current version number of the master branch of Salmon can be found here

Documentation

The documentation for Salmon is available on ReadTheDocs, check it out here.

Salmon is, and will continue to be, freely and actively supported on a best-effort basis. If you need industrial-grade technical support, please consider the options at oceangenomics.com/contact.

Decoy sequences in transcriptomes

tl;dr: fast is good but fast and accurate is better! Alignment and mapping methodology influence transcript abundance estimation, and accounting for the accounting for fragments of unexpected origin can improve transcript quantification. To this end, salmon provides the ability to index both the transcriptome as well as decoy seuqence that can be considered during mapping and quantification. The decoy sequence accounts for reads that might otherwise be (spuriously) attributed to some annotated transcript. This tutorial provides a step-by-step guide on how to efficiently index the reference transcriptome and genome to produce a decoy-aware index. Specifically, there are 3 possible ways in which the salmon index can be created:

Facing problems with Indexing?, Check if anyone else already had this problem in the issues section or fill the index generation request form

NOTE:

If you are generating an index to be used for single-cell or single-nucleus quantification with alevin-fry, then we recommend you consider building a spliced+intron (splici) reference. This serves much of the purpose of a decoy-aware index when quantifying with alevin-fry, while also providing the capability to attribute splicing status to mapped fragments. More details about the splici reference and the Unspliced/Spliced/Ambiguous quantification mode it enables can be found here.

Chat live about Salmon

You can chat with the Salmon developers and other users via Gitter (Note: Gitter is much less frequently monitored than GitHub, so if you have an important problem or question, please consider opening an issue here on GitHub)!

Join the chat at https://gitter.im/COMBINE-lab/salmon