COMBINE-lab / salmon

🐟 🍣 🍱 Highly-accurate & wicked fast transcript-level quantification from RNA-seq reads using selective alignment
https://combine-lab.github.io/salmon
GNU General Public License v3.0

Salmon alevin errors out when cell barcode length is > 31 base pairs #943

Open connersk opened 4 months ago

connersk commented 4 months ago


This bug is primarily related to alevin (single-cell mode).

Describe the bug

To Reproduce

salmon alevin \
    -i /path/to/salmon_index \
    -p 16 \
    -l ISR \
    --read-geometry 2[1-100] \
    --bc-geometry 1[1-34] \
    --umi-geometry 1[35-44] \
    --sketch \
    -1 /path/to/r1.fastq \
    -2 /path/to/r2.fastq \
    -o /output/path \
    --tgMap /path/to/t2g.tsv


Expected behavior

Salmon alevin does not error out when the cell barcode length is 34 base pairs.

Screenshots

Error log:

[2024-06-17 22:00:25.466] [alevinLog] [error] Barcode length (34) was not in the required length range [1, 31].
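For anyone wanting to check a geometry before launching a run, here is a minimal Python sketch that computes the length implied by a geometry string such as 1[1-34] and compares it with the [1, 31] range quoted in the error message. The helper names (geometry_length, barcode_length_ok) are hypothetical and not part of salmon. The parser is simplified and assumes purely numeric, 1-based, inclusive ranges, which matches the reported length of 34 for 1[1-34].

```python
import re

# Allowed range copied from the error message in this issue: [1, 31].
MIN_BC_LEN, MAX_BC_LEN = 1, 31

# Matches one "<read>[<start>-<end>]" piece, e.g. "1[1-34]".
# Simplification: only numeric start/end positions are recognised.
_PIECE = re.compile(r"(\d+)\[(\d+)-(\d+)\]")


def geometry_length(spec: str) -> int:
    """Return the total number of bases described by a geometry string.

    Hypothetical helper for illustration; not a salmon API.
    """
    pieces = _PIECE.findall(spec)
    if not pieces:
        raise ValueError(f"unrecognised geometry: {spec!r}")
    total = 0
    for _read, start, end in pieces:
        start, end = int(start), int(end)
        if end < start:
            raise ValueError(f"range ends before it starts in {spec!r}")
        total += end - start + 1  # positions are 1-based and inclusive
    return total


def barcode_length_ok(spec: str) -> bool:
    """True if the barcode length falls inside the range from the error log."""
    return MIN_BC_LEN <= geometry_length(spec) <= MAX_BC_LEN


if __name__ == "__main__":
    for name, spec in [("bc-geometry", "1[1-34]"), ("umi-geometry", "1[35-44]")]:
        print(name, spec, "length =", geometry_length(spec))
    print("barcode within [1, 31]:", barcode_length_ok("1[1-34]"))
```

With the geometry from the command above, the barcode spans 34 bases and the UMI 10 bases, so only the barcode falls outside the reported range.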


connersk commented 3 months ago

Following up on this: if I were to fork salmon and fix this myself, would the correct solution be to update the following line? https://github.com/COMBINE-lab/salmon/blob/a2f6912b3f9f9af91e3a4b0d74adcb3bdc4c9a32/src/AlevinUtils.cpp#L1204
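Whether changing that single check is enough depends on how barcodes are stored downstream, which this thread does not establish. One general reason sequence tools cap lengths near this value is 2-bit packing: at 2 bits per base, a 64-bit word holds at most 32 bases. The Python sketch below illustrates that general technique only. It is not salmon's code, and the names pack and unpack are hypothetical.

```python
# Illustration of generic 2-bit nucleotide packing; NOT taken from salmon.
_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
_BASES = "ACGT"

WORD_BITS = 64
MAX_BASES = WORD_BITS // 2  # 2 bits per base, so 32 bases fit in one word


def pack(seq: str) -> int:
    """Pack a DNA string into an integer that fits in a 64-bit word."""
    if len(seq) > MAX_BASES:
        raise ValueError(
            f"{len(seq)} bases need {2 * len(seq)} bits, more than {WORD_BITS}"
        )
    value = 0
    for base in seq:
        value = (value << 2) | _CODE[base]  # append 2 bits for each base
    return value


def unpack(value: int, length: int) -> str:
    """Recover the DNA string from a packed integer of known length."""
    bases = []
    for _ in range(length):
        bases.append(_BASES[value & 3])  # lowest 2 bits hold the last base
        value >>= 2
    return "".join(reversed(bases))


if __name__ == "__main__":
    packed = pack("ACGT")
    print(packed, unpack(packed, 4))
    try:
        pack("A" * 34)  # a 34-base barcode, as in this issue
    except ValueError as err:
        print("cannot pack:", err)
```

If a tool stores barcodes this way, relaxing only the length check would let longer barcodes through validation without giving them anywhere to fit, so the storage type would need to change as well. That is a hypothesis to verify against the source, not a statement about how alevin works.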