Barcode demultiplexing with dorado 0.6.1

shair89 commented 1 month ago

When basecalling with Dorado 0.6.1, most reads are not being assigned to a barcode (demultiplexing is failing) with kit SQK-RBK114-24.

We have tested basecalling the same small POD5 file and have varying results:

Dorado 0.6.1: Barcode 19 = 129 reads, Barcode 20 = 188 reads, Unclassified = 3124 reads

Dorado 0.5.3: Barcode 19 = 838 reads, Barcode 20 = 1043 reads, Unclassified = 1557 reads

basecall_server-7.3.9 (MinKNOW): Barcode 19 = 907 reads, Barcode 20 = 1116 reads, Unclassified = 936 reads

A small number of reads were assigned to other barcodes that weren't actually used in the experiment; this number varied between the tests.
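To make the comparison easier to read, here is a rough sketch that turns the counts above into the fraction of reads assigned to the two expected barcodes. The dictionary keys and the `classified_fraction` helper are names made up for this illustration, and the denominator only includes the three counts listed per tool (the few reads assigned to unused barcodes are left out), so treat the percentages as approximate.

```python
# Read counts reported in this thread for the same POD5 file.
# Keys are illustrative labels, not official tool or barcode names.
counts = {
    "dorado 0.6.1": {"barcode19": 129, "barcode20": 188, "unclassified": 3124},
    "dorado 0.5.3": {"barcode19": 838, "barcode20": 1043, "unclassified": 1557},
    "basecall_server 7.3.9": {"barcode19": 907, "barcode20": 1116, "unclassified": 936},
}


def classified_fraction(c):
    """Fraction of the listed reads that landed in barcode 19 or 20."""
    assigned = c["barcode19"] + c["barcode20"]
    return assigned / (assigned + c["unclassified"])


rates = {tool: classified_fraction(c) for tool, c in counts.items()}

for tool, rate in rates.items():
    print(f"{tool}: {rate:.1%} of listed reads assigned to barcode 19/20")
```

On these numbers, Dorado 0.6.1 assigns roughly 9% of the listed reads, against roughly 55% for 0.5.3 and 68% for the basecall server.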

We have tried the barcode classification using the basecaller command and the demux command separately (using --no-trim during basecalling) and obtained similar results.

Steps to reproduce the issue:

Please list any steps to reproduce the issue.

Run environment:

Dorado version: 0.6.1+79b5da5
Dorado command:
dorado basecaller --kit-name SQK-RBK114-24 sup,5mCG_5hmCG ./pod5 > basecalled.bam
dorado demux --output-dir ./demuxed/ --no-classify basecalled.bam
Operating system: Ubuntu 22.04.4 LTS 64bit
Hardware (CPUs, Memory, GPUs): Intel® Xeon(R) W-2255 CPU @ 3.70GHz × 20, 128GB RAM, NVIDIA QUADRO RTX 6000
Source data type (e.g., pod5 or fast5 - please note we always recommend converting to pod5 for optimal basecalling performance): pod5
Source data location (on device or networked drive - NFS, etc.): device
Details about data (flow cell, kit, read lengths, number of reads, total dataset size in MB/GB/TB): R10, RBK114-24, 3446 reads, 1.7GB pod5

tijyojwad commented 1 month ago

Hi @shair89 - we are working on an urgent fix for this issue and will release a patch within the next day or so. Thanks for your patience!

billytcl commented 1 month ago

@shair89 aside from the difference between v0.6.1 and earlier versions, we've also noticed a difference between using dorado and the basecall server! Interesting that you've seen the same.

@tijyojwad if we've already barcode classified with --no-trim on basecaller, how should we "re-barcode classify" at the demux step? I'm guessing there's some way to override the old barcode call.

tijyojwad commented 1 month ago

@shair89 I forgot to update this thread! v0.6.2 was released with the patch fix for the low classification rate. Dorado is now at 0.7.0 which also contains the fix.

@billytcl - unfortunately, for the RBK signal the --no-trim didn't apply, so you'll need to re-basecall to get the RBK improvements.
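For anyone checking whether an existing BAM needs re-basecalling, a minimal sketch of a version check follows. It assumes the version string has the form shown in the run environment above (for example `0.6.1+79b5da5`); `parse_version`, `has_classification_fix` and `FIRST_PATCHED` are hypothetical names for this illustration, not part of Dorado. It only tests whether a version is at or after the 0.6.2 patch release mentioned above and says nothing about which earlier releases were affected.

```python
# First release that carries the fix, according to this thread.
FIRST_PATCHED = (0, 6, 2)


def parse_version(text):
    """Turn '0.6.1+79b5da5' into (0, 6, 1), dropping the build suffix."""
    core = text.strip().split("+", 1)[0]
    return tuple(int(part) for part in core.split("."))


def has_classification_fix(version_text):
    """True if the version is 0.6.2 or later (tuple comparison)."""
    return parse_version(version_text) >= FIRST_PATCHED


for v in ("0.6.1+79b5da5", "0.6.2", "0.7.0"):
    status = "has the patch" if has_classification_fix(v) else "predates the patch"
    print(f"{v}: {status}")
```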

nanoporetech / dorado

Barcode demultiplexing with dorado 0.6.1 #800
