COMBINE-lab / salmon

🐟 🍣 🍱 Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using selective alignment
https://combine-lab.github.io/salmon
GNU General Public License v3.0

Alevin: custom fastQ R1 #494

Closed: davidecrs closed this issue 4 years ago

davidecrs commented 4 years ago

Hi,

After preprocessing my data, I have a custom FASTQ R1 that contains only the BC and the UMI. The BC is 18 nt long and the UMI is 8 nt long, so each R1 sequence is 26 nt in total. Here is an example record:

@NB501072:248:HNKTFBGXC:1:11101:8054:1048:TGCGTNATAGTAGCCATCGCATTGCGCGNATTACCTCTGAGCNGAAAGTAAAACGACGNTTAGGACTT 1
TAATAGGCGAATAGTAAAACGNTTAG
+
A#EEEEEEE#EEEEEEEEEEE#EEEE

I was wondering whether there is a specific way to run Alevin on this kind of data. I ask because I had to preprocess the raw data, which come from Illumina / Bio-Rad ddSEQ.

k3yavi commented 4 years ago

Yep use --end 5 --umiLength 8 --barcodeLength 18
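
Before running Alevin with fixed --barcodeLength and --umiLength values, it may be worth confirming that every R1 sequence really has the expected length (18 + 8 = 26 nt here). The following is a minimal Python sketch of such a check, not part of salmon or Alevin; the function name `sequence_length_counts` and the shortened read name are hypothetical, and the record is the one from the question.

```python
from collections import Counter

BARCODE_LENGTH = 18  # value passed to --barcodeLength in this thread
UMI_LENGTH = 8       # value passed to --umiLength in this thread


def sequence_length_counts(fastq_lines):
    """Count sequence lengths in an iterable of FASTQ lines (4 lines per record)."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the second line of each record holds the sequence
            counts[len(line.strip())] += 1
    return counts


# Example record from the question (read name shortened for illustration).
record = [
    "@read1 1",
    "TAATAGGCGAATAGTAAAACGNTTAG",
    "+",
    "A#EEEEEEE#EEEEEEEEEEE#EEEE",
]

counts = sequence_length_counts(record)
expected = BARCODE_LENGTH + UMI_LENGTH
# Any length other than barcode + UMI suggests the preprocessing needs a second look.
unexpected = {length: n for length, n in counts.items() if length != expected}
print(counts, unexpected)
```

For a real file, the same function can be fed an open file handle, for example `sequence_length_counts(open("R1.fastq"))`, where the file name is a placeholder.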

davidecrs commented 4 years ago

Great! Thank you!

davidecrs commented 4 years ago

> Yep use --end 5 --umiLength 8 --barcodeLength 18

Thanks again. I forgot to specify, however, that the sequence consists of the BC first and then the UMI (BC + UMI); I'm not sure that was clear in the issue. Does the command remain the same?

k3yavi commented 4 years ago

Yep, that's the expected order for the command. If the UMI comes first, you might have to change it to --end 3.
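
To make the BC + UMI layout discussed above concrete, here is a small Python sketch that splits an R1 sequence into an 18 nt barcode followed by an 8 nt UMI. It only illustrates the read layout described in this thread and is not Alevin's actual implementation; the function name `split_r1` is hypothetical.

```python
def split_r1(seq, barcode_length=18, umi_length=8):
    """Split an R1 sequence laid out as barcode followed by UMI (BC + UMI)."""
    if len(seq) < barcode_length + umi_length:
        raise ValueError("read is shorter than barcode + UMI")
    barcode = seq[:barcode_length]
    umi = seq[barcode_length:barcode_length + umi_length]
    return barcode, umi


# R1 sequence from the example record in the question.
barcode, umi = split_r1("TAATAGGCGAATAGTAAAACGNTTAG")
print(barcode)  # TAATAGGCGAATAGTAAA
print(umi)      # ACGNTTAG
```

If the UMI came first in the read, the two slices would simply be taken in the opposite order; which --end value matches that layout should be checked against the Alevin documentation.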