nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

Double demultiplexing for pcr barcoded samples #878

Open luigallucci opened 5 months ago

luigallucci commented 5 months ago

Hi all,

I'm using the barcoding kit SQK-NBD114.24 for sequencing on a MinION flow cell. Before library preparation and sequencing, I run a PCR on each sample with barcoded primers: [image: list of barcoded primers]

After the PCR, I pooled the samples and prepared the library from that single pool.

I'm receiving the output as pod5 files split by barcode. Do you know how to run the subsequent demultiplexing step to recover my original samples?

I'm using the latest versions of Dorado and MinKNOW.

HalfPhoton commented 5 months ago

Hi @luigallucci, I'm not sure I quite understand your issue. If you have multiple pod5s already split by barcode, then you can basecall each pod5 separately and get demultiplexed BAM output naturally.

Alternatively, you could basecall all the data and use dorado demux to split the output (.bam) generated by basecaller into a bam file per barcode.

dorado basecaller <model> <all_data> --kit-name <barcode kit> .... | dorado demux --no-classify --output-dir demux_output/

Kind regards, Rich

luigallucci commented 5 months ago

Hi @HalfPhoton, thank you for the reply!

I was probably not clear. After pooling, we proceeded with the normal barcoding and adapter protocol. So the structure is: 10 samples, each PCR-barcoded with one of 10 barcodes, are pooled, and barcode01 from the kit is then used for this pool. Likewise, barcode02 is used for pool 2. I know this is not the classical approach. What I get from MinKNOW is just the pod5/fastq files that passed basecalling, split by the upper-level barcode. What I need is to demultiplex on an additional barcode level. The barcodes I'm using for this sub-level are similar to those of a kit (see the first forward primer listed above). Unfortunately, no one in the lab is able to trace back the name of the kit, as the primers were ordered from biomers, even though they are still ONT barcoded primers.
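
To make the second-level split concrete, here is a minimal Python sketch of the idea, not Dorado's own method. It assumes the inner primer barcode sequences are known and that the reads from one upper-level barcode are already available as plain sequence strings. The barcode sequences, sample names, window size, and mismatch threshold are all hypothetical placeholders. It looks for each inner barcode near the start of the read, and near the end in reverse-complement orientation, and assigns the read only when there is a single best match within the mismatch limit.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical inner (PCR primer) barcodes; replace with the real sequences.
INNER_BARCODES: Dict[str, str] = {
    "sample01": "ACGTTGCAAGCT",
    "sample02": "TTGACCGTAGGA",
}

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def best_hamming(barcode: str, window: str) -> int:
    """Smallest Hamming distance between the barcode and any slice of the
    window that has the same length (no insertions/deletions allowed)."""
    k = len(barcode)
    if len(window) < k:
        return k  # window too short: treat as a complete mismatch
    return min(
        sum(a != b for a, b in zip(barcode, window[i:i + k]))
        for i in range(len(window) - k + 1)
    )


def classify(read: str, barcodes: Dict[str, str],
             window: int = 60, max_mismatches: int = 2) -> Optional[str]:
    """Return the sample name for a read, or None if it is unclassified."""
    head = read[:window]
    # A reverse-strand read carries the barcode, reverse-complemented, at its end.
    tail_rc = revcomp(read[-window:])
    scores = {
        name: min(best_hamming(bc, head), best_hamming(bc, tail_rc))
        for name, bc in barcodes.items()
    }
    ranked = sorted(scores.items(), key=lambda kv: kv[1])
    best_name, best_score = ranked[0]
    if best_score > max_mismatches:
        return None  # nothing close enough
    if len(ranked) > 1 and ranked[1][1] == best_score:
        return None  # ambiguous: two barcodes tie
    return best_name


def demultiplex(reads: Iterable[Tuple[str, str]],
                barcodes: Dict[str, str]) -> Dict[str, List[str]]:
    """Group read IDs by inner barcode; unassigned reads go to 'unclassified'."""
    bins: Dict[str, List[str]] = {name: [] for name in barcodes}
    bins["unclassified"] = []
    for read_id, seq in reads:
        bins[classify(seq, barcodes) or "unclassified"].append(read_id)
    return bins
```

Called as `demultiplex(reads, INNER_BARCODES)` on the `(read_id, sequence)` pairs from one upper-level barcode, this returns one list of read IDs per inner barcode plus an `unclassified` list. Hamming distance ignores insertions and deletions, which are common in nanopore reads, so a real run would likely need an alignment-based score instead.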