COMBINE-lab / oarfish

long read RNA-seq quantification
BSD 3-Clause "New" or "Revised" License

BAM sorting for single-cell mode #31

Closed ChangqingW closed 3 months ago

ChangqingW commented 3 months ago

I was wondering whether a BAM file sorted by cell barcode (e.g. with samtools sort -t CB) is appropriate as input for oarfish in single-cell mode. I ask because the default name-sorted BAM suggested by the docs did not work.

rob-p commented 3 months ago

Hi @ChangqingW,

Thank you for this question, and for pointing out that single-cell mode is (currently) missing from the docs. The short answer is yes: for single-cell mode, the BAM file should be sorted by cell barcode (which for most upstream processing pipelines is the CB tag, as you mention). This is mentioned in the 0.6.1 release notes but not in the docs, which we will update. One other thing to note about single-cell mode is that it currently counts reads and does not attempt to count UMIs (if your reads have them).
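To illustrate the two points above, here is a minimal Python sketch. It is not oarfish's own code. The function names (barcodes_are_grouped, count_reads_and_umis) and the toy barcodes and UMIs are hypothetical. The first function checks the property that barcode sorting guarantees, namely that all reads sharing a barcode are adjacent; whether oarfish needs strict sort order or only this grouping is not stated in this thread. The second function contrasts per-cell read counts (what single-cell mode currently reports) with counts of distinct UMIs (which it does not attempt).

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Optional, Set, Tuple


def barcodes_are_grouped(barcodes: Iterable[Optional[Hashable]]) -> bool:
    """Return True if each barcode value forms one contiguous run.

    A barcode-sorted BAM satisfies this; a name-sorted BAM usually does not,
    because reads from the same cell are scattered through the file.
    """
    seen: Set[Optional[Hashable]] = set()
    previous: object = object()  # sentinel that never equals a real barcode
    for cb in barcodes:
        if cb != previous:
            if cb in seen:
                # This barcode already ended an earlier run, so it is split.
                return False
            seen.add(cb)
            previous = cb
    return True


def count_reads_and_umis(
    records: Iterable[Tuple[str, str]]
) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Given (cell_barcode, umi) pairs, return (reads per cell, distinct UMIs per cell)."""
    read_counts: Counter = Counter()
    umis_per_cell: Dict[str, Set[str]] = {}
    for cb, umi in records:
        read_counts[cb] += 1
        umis_per_cell.setdefault(cb, set()).add(umi)
    umi_counts = {cb: len(umis) for cb, umis in umis_per_cell.items()}
    return dict(read_counts), umi_counts


# Toy inputs (hypothetical); in practice the values would come from the
# CB tag (and a UMI tag, if present) of each BAM record.
name_sorted_order = ["AAAC", "TTTG", "AAAC"]
barcode_sorted_order = ["AAAC", "AAAC", "TTTG"]
toy_records = [("AAAC", "U1"), ("AAAC", "U1"), ("AAAC", "U2"), ("TTTG", "U9")]

if __name__ == "__main__":
    print(barcodes_are_grouped(name_sorted_order))
    print(barcodes_are_grouped(barcode_sorted_order))
    print(count_reads_and_umis(toy_records))
```

In the toy records, the two reads sharing barcode AAAC and UMI U1 count as two reads but only one UMI, which is the difference to keep in mind when interpreting read-based single-cell counts.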

We will add this relevant information to the documentation, but in the meantime, please let us know if you have any other questions.

Best, Rob