hzi-bifo / RiboDetector

Accurate and rapid RiboRNA sequences Detector based on deep learning
GNU General Public License v3.0

Can we use Ribodetector for rRNA quantification in scRNASeq? #41

Closed: Rohit-Satyam closed this issue 10 months ago

Rohit-Satyam commented 10 months ago

I have UMI-based (3' chemistry) scRNA-seq data. The R1 file therefore contains only the 28 bp UMI + barcode sequences, and the R2 file contains the actual reads. Since RiboDetector objects to reads shorter than 30 bp and warns that the rRNA prediction for them might not be accurate, I am reluctant to pass both the R1 and R2 FASTQ files. But if I run it on R2 alone, how will I be able to subset R1?

Do you plan to add support for scRNA-seq data in the future?

dawnmy commented 10 months ago

Thank you for your suggestion regarding scRNA support. I will consider its inclusion in future updates.

For now, you can run it on R2 only. After this, you can retrieve the read IDs of the predicted rRNA reads using:

seqkit seq -ni <your rRNA fastq from R2> -o id.txt

This will generate an id.txt file. Once you have this list of read IDs, you can then fetch the corresponding records from your R1 fastq file using:

seqkit grep -f id.txt <R1 fastq file> -o <filtered R1>

Note that seqkit must be installed first, for example with conda.
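If installing seqkit is not an option, the same two steps (extract IDs, then subset R1) can be approximated with plain awk. The following is only a minimal sketch, not part of RiboDetector or seqkit. The function names `fastq_ids`, `fastq_subset` and `barcode_counts` are made up for illustration. It assumes uncompressed FASTQ files with exactly four lines per record, and that both mates share the same read ID (the header text before the first whitespace). The last function is a rough way to count rRNA reads per cell barcode, and it assumes the barcode occupies the first 16 bases of R1, which depends on your chemistry.

```shell
#!/bin/sh
# Sketch only: assumes uncompressed, four-line FASTQ records and that
# R1 and R2 mates share the same ID (header text before the first whitespace).

# Print one read ID per record of FASTQ file $1
# (leading "@" removed, anything after the first whitespace dropped).
fastq_ids() {
    awk 'NR % 4 == 1 { sub(/^@/, "", $1); print $1 }' "$1"
}

# Print the records of FASTQ file $2 whose ID is listed in ID file $1.
fastq_subset() {
    awk 'FILENAME == ARGV[1] { keep[$1] = 1; next }
         FNR % 4 == 1 { id = $1; sub(/^@/, "", id); show = (id in keep) }
         show' "$1" "$2"
}

# Count reads per cell barcode in FASTQ file $1.
# $2 is the barcode length (ASSUMPTION: 16 bases at the start of R1).
# Output is tab-separated "barcode<TAB>count" in no particular order.
barcode_counts() {
    awk -v len="${2:-16}" 'NR % 4 == 2 { n[substr($0, 1, len)]++ }
         END { for (b in n) print b "\t" n[b] }' "$1"
}

# Example usage (file names are placeholders):
#   fastq_ids rrna_R2.fq > id.txt
#   fastq_subset id.txt R1.fq > rrna_R1.fq
#   barcode_counts rrna_R1.fq 16 > rrna_reads_per_barcode.tsv
```

For gzipped input, decompress to a temporary file first, since the sketch reads plain text only. The per-barcode counts are raw read counts without UMI deduplication or barcode correction, so they give only a rough estimate of rRNA content per cell.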

I hope this helps, and please keep us posted with any further queries or feedback!