nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

demuxing with custom dual barcoding #1092

Closed nr0cinu closed 1 week ago

nr0cinu commented 2 weeks ago

Hi,

is it possible to perform demultiplexing with dorado using unique or combinatorial dual barcoding?

We have a custom barcoding setup, and we followed the documentation to set up a custom kit. However, this outputs one file per barcode, not one file per combination of barcodes.

Here is a minimal example of what our custom barcoding looks like:

where these are BARCODE_1 and BARCODE_2 following the schema: 5' --- ADAPTER/PRIMER --- BARCODE_1 --- TRAILING_FLANK_1 --- READ --- RC(TRAILING_FLANK_2) --- RC(BARCODE_2) --- 3'
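
To make the schema concrete, here is a minimal Python sketch that assembles a read in that layout. All sequences are made-up placeholders, not the actual barcodes from this setup, and `build_read` is a hypothetical helper used only for illustration.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_read(adapter, barcode_1, flank_1, insert, flank_2, barcode_2):
    """Assemble a read following the schema:
    5' adapter - BARCODE_1 - FLANK_1 - insert - RC(FLANK_2) - RC(BARCODE_2) 3'
    """
    return (
        adapter
        + barcode_1
        + flank_1
        + insert
        + reverse_complement(flank_2)
        + reverse_complement(barcode_2)
    )


# Placeholder sequences (assumptions, chosen only to keep the example short).
read = build_read(
    adapter="AAAA",
    barcode_1="ACGT",
    flank_1="CC",
    insert="GATTACA",
    flank_2="GG",
    barcode_2="AACG",
)
print(read)
```

Note that the rear barcode appears reverse-complemented at the 3' end, so a tool searching the read as sequenced has to look for RC(BARCODE_2) rather than BARCODE_2 itself.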

Thanks! Bela

Dorado version: 0.8.1+c3a2952

Dorado command: dorado demux --kit-name BC --barcode-arrangement barcode_arrangement.toml --barcode-sequences barcode_sequences.fna --output-dir out --threads 4 --verbose --emit-fastq --emit-summary --barcode-both-ends --no-trim fastq_pass

malton-ont commented 2 weeks ago

Hi @nr0cinu,

In short, no, dorado does not support this.

A barcode may have different front and rear sequences, but both of those sequences are considered to be part of the same barcode, which is why you get one file. By reusing a barcode's front or rear sequence as the front or rear sequence of a different barcode, you introduce an ambiguity, and dorado will most likely mark the read as unclassified because it is not sufficiently clear which barcode the found sequence belongs to.

nr0cinu commented 2 weeks ago

Hi Malton!

Thanks for the quick answer! :)

Are there any plans to implement combinatorial/unique dual indexes in dorado?

In the meantime, can you recommend an alternative for demultiplexing? Right now, I am leaning towards cutadapt, as it supports this kind of demultiplexing.

Thanks, Bela

malton-ont commented 2 weeks ago

Hi @nr0cinu,

We are looking at supporting combinatorial barcoding, but not really in the manner that you have described - we're looking more at two full barcodes (inner and outer) on a read rather than the front and rear being from different barcodes.

Our current recommendation would be to create two custom kits: one for the front barcode and one for the rear barcode (with rear_only_barcodes = true set in the second kit), and perform two rounds of demultiplexing. You would then need to manually check that this has generated a permitted combination.
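
The manual check in the last step could be sketched roughly as follows in Python. This assumes you have already extracted, for each demultiplexing round, a mapping from read ID to the assigned barcode name; how you obtain those mappings (for example from the per-round output) is left open. The barcode names, `combine_calls`, and `group_by_pair` are hypothetical and not part of dorado.

```python
from collections import defaultdict

UNCLASSIFIED = "unclassified"


def combine_calls(front_calls, rear_calls, permitted_pairs):
    """Assign each read to a (front, rear) pair, or to 'unclassified'
    if either call is missing or the pair is not permitted."""
    assignments = {}
    for read_id in front_calls.keys() | rear_calls.keys():
        front = front_calls.get(read_id, UNCLASSIFIED)
        rear = rear_calls.get(read_id, UNCLASSIFIED)
        pair = (front, rear)
        assignments[read_id] = pair if pair in permitted_pairs else UNCLASSIFIED
    return assignments


def group_by_pair(assignments):
    """Group read IDs by their final assignment (one group per output bin)."""
    groups = defaultdict(list)
    for read_id, pair in sorted(assignments.items()):
        groups[pair].append(read_id)
    return dict(groups)


# Hypothetical calls from round 1 (front kit) and round 2 (rear-only kit).
front_calls = {"read1": "BC_F01", "read2": "BC_F01", "read3": "BC_F02"}
rear_calls = {"read1": "BC_R01", "read2": "BC_R02", "read3": "unclassified"}

# Only these front/rear combinations were actually used in the experiment.
permitted_pairs = {("BC_F01", "BC_R01"), ("BC_F02", "BC_R02")}

assignments = combine_calls(front_calls, rear_calls, permitted_pairs)
groups = group_by_pair(assignments)
for pair, read_ids in groups.items():
    print(pair, read_ids)
```

For unique dual indexing, `permitted_pairs` would contain exactly one rear barcode per front barcode; for combinatorial indexing it would list every combination that was actually used.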

nr0cinu commented 1 week ago

Okay, thank you for the info!