COMBINE-lab / salmon

🐟 🍣 🍱 Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using selective alignment
https://combine-lab.github.io/salmon
GNU General Public License v3.0

Using Alevin for other 3' amplified scRNA-Seq platforms #311

Closed mojakab closed 5 years ago

mojakab commented 6 years ago

Dear alevin team, I was wondering whether alevin is applicable to other scRNA-Seq platforms that are not droplet-based but are 3' amplified, like CEL-Seq2. My sequencing data consists of two fastq files, R1 and R2. R1 is 13 bp and contains the cell barcode at bases 1-6 and the UMI at bases 7-12. R2 is 50 bp and contains the transcript sequence. Thanks so much for your help, best Moritz

k3yavi commented 6 years ago

Hi @mojakab, thanks for your interest in Alevin. Currently, most of our research effort has gone into developing Alevin for droplet-based 3' tag sequencing such as 10x Chromium and Drop-seq. Although CEL-Seq2 is similar, it relies on a different cell isolation step, which can potentially introduce assay-specific bias between experiments. Basically, Alevin is designed to work with single-cell protocols that meet the following criteria:

We have a similar request in https://github.com/COMBINE-lab/salmon/issues/269, where the user was able to run Alevin on CEL-Seq2 data. However, we have not yet explored the full potential of Alevin with CEL-Seq2, and it might require more careful consideration. If you do use Alevin on CEL-Seq2 data, we'd appreciate feedback on your experience.

k3yavi commented 5 years ago

Using Alevin with CEL-Seq2 is still exploratory, but since it would be useful for users, we have added a --celseq2 flag to Alevin. It assumes that the first 6 bases of read_1 are the CB, the next 6 bases are the UMI, and read_2 is the read sequence. The command-line parameters work the same as with 10x data; only the --chromium flag has to be swapped for --celseq2.
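To make the read_1 layout described above concrete, here is a minimal Python sketch of the split, assuming exactly the geometry stated in this comment (6 bases of CB followed by 6 bases of UMI). This is not Alevin's code; `split_read1`, `Tag`, and the example sequence are hypothetical names and data made up for illustration.

```python
from typing import NamedTuple

# Layout as described in the comment above: CB = bases 1-6, UMI = bases 7-12.
CB_LEN = 6
UMI_LEN = 6


class Tag(NamedTuple):
    cell_barcode: str
    umi: str


def split_read1(seq: str) -> Tag:
    """Split a read_1 sequence into (CB, UMI); hypothetical helper, not Alevin's API."""
    if len(seq) < CB_LEN + UMI_LEN:
        raise ValueError(f"read_1 is shorter than {CB_LEN + UMI_LEN} bases: {seq!r}")
    # Any bases after position 12 (for example the 13th base of a 13 bp read) are ignored.
    return Tag(seq[:CB_LEN], seq[CB_LEN:CB_LEN + UMI_LEN])


if __name__ == "__main__":
    # Made-up 13 bp read_1, matching the read length mentioned in the question.
    print(split_read1("AACCGGTTTACGT"))
```

Running a few real read_1 sequences through a helper like this and comparing the extracted barcodes with the ones used in the experiment is a quick way to sanity-check that the data matches the layout the flag assumes.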

rbenel commented 5 years ago

Hi @k3yavi, I just re-read this post, and I believe that in the CEL-Seq2 protocol read_1 starts with the UMI, followed by the CB and then the polyT, because sequencing starts at the Illumina adapter (see the image below, from the paper).

Thanks!

[Image: 13059_2016_938_fig1_html]
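One way to check which read_1 layout a given dataset follows is to count how often each candidate position matches a known barcode. The sketch below assumes you have the list of cell barcodes used in your experiment; `guess_layout`, the layout labels, and the example reads are hypothetical and are not part of Alevin.

```python
from collections import Counter
from typing import Iterable


def guess_layout(
    read1_seqs: Iterable[str],
    known_barcodes: Iterable[str],
    cb_len: int = 6,
    umi_len: int = 6,
) -> Counter:
    """Count exact barcode matches under two candidate read_1 layouts.

    "CB-UMI": barcode at bases 1-6, UMI at bases 7-12.
    "UMI-CB": UMI at bases 1-6, barcode at bases 7-12.
    Hypothetical helper for illustration only.
    """
    known = set(known_barcodes)
    hits: Counter = Counter()
    for seq in read1_seqs:
        if len(seq) < cb_len + umi_len:
            continue  # too short to contain both tags
        if seq[:cb_len] in known:
            hits["CB-UMI"] += 1
        if seq[umi_len:umi_len + cb_len] in known:
            hits["UMI-CB"] += 1
    return hits


if __name__ == "__main__":
    # Made-up barcodes and reads; replace with your own experiment's values.
    barcodes = ["AACCGG", "GGTTAA"]
    reads = ["ACGTACAACCGG", "TTTTTTGGTTAA"]
    print(guess_layout(reads, barcodes))
```

If one layout collects nearly all the matches on a sample of a few thousand reads, that is a strong hint about the order of the UMI and CB in the data. Exact matching is used here for simplicity, so reads with sequencing errors in the barcode are not counted.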