Closed Ishrektd closed 2 months ago
Hi @Ishrektd,
As you have already barcoded the reads during basecalling, you should use the --no-classify option with dorado demux instead of specifying the --kit-name. This is because the barcodes have been trimmed during the first step, so there are now no barcodes to identify in the second step.
I see, thank you! I ended up redoing it with simple basecalling before your comment:
dorado basecaller sup pod5s --no-trim > calls.bam
After this, I did demuxing using the following flags:
dorado demux --kit-name --barcode-both-ends --sample-sheet --emit-fastq --output-dir calls.bam
In this case, the barcodes, adapters, and primers would not have been trimmed during basecalling, but the demultiplexing step should trim adapters/primers/barcodes... would this process be correct?
@Ishrektd,
dorado demux will trim barcodes but it does not perform additional trimming of adapters/primers, so if no barcode is found (or if these are outboard of the adapters!) then the adapters/primers will not be trimmed. If this is an issue for you, follow the suggestion I gave above:
dorado basecaller sup pod5s --kit-name $kitname --barcode-both-ends --sample-sheet $sample > calls.bam
dorado demux --no-classify --emit-fastq --output-dir $demuxdir calls.bam
Thank you for your help! I will try this out 🙂
Another option would be to do as you had, but add a call to dorado trim for each demuxed file.
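A minimal sketch of that per-file trimming, assuming the demuxed reads were written as .fastq files directly inside one output directory (the layout and file names can differ between dorado versions, so adjust the glob as needed). The function name trim_demuxed, its two directory arguments, and the DRY_RUN switch are made up for this example; dorado trim and its --emit-fastq option are the only dorado pieces used.

```shell
#!/usr/bin/env bash
# Sketch: run `dorado trim` on every demultiplexed FASTQ file in a directory.
# trim_demuxed, its arguments, and DRY_RUN are hypothetical names for this example.

trim_demuxed() {
    local demux_dir="$1" trim_dir="$2" f out
    mkdir -p "$trim_dir"
    for f in "$demux_dir"/*.fastq; do
        [ -e "$f" ] || continue              # skip if the glob matched nothing
        out="$trim_dir/$(basename "$f")"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Only print the command that would be run.
            echo "dorado trim --emit-fastq $f > $out"
        else
            # Trim adapters/primers and write FASTQ next to the other trimmed files.
            dorado trim --emit-fastq "$f" > "$out"
        fi
    done
}
```

Calling it with DRY_RUN=1 first (for example, DRY_RUN=1 trim_demuxed demuxed trimmed) prints the commands without running them, which is a cheap way to check that the glob picks up the files you expect.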
I am attempting to basecall using dorado, and I had a question regarding which kit I should provide as part of the --kit-name flag when using dorado basecaller.
In my library preparation, I used SQK-LSK110 chemistry with the PBC-096 Barcoding Kit to sequence a pooled library of 16S amplicons. Sequencing was done on a MinION Flow Cell R9.4.1 and the resulting output was in .pod5 format.
When attempting to basecall, I used the dna_r9.4.1_e8_sup@v3.6 model and for my kit, I provided EXP-PBC096 as that's the barcoding kit I used in my experiment. However, although this did work, I'm unsure if this is correct because when setting the parameters to trim all adapters/primers/barcodes by default and then demultiplexing, the resulting unclassified.bam file was 7.7G (total output was 7.8G).
To my knowledge, the kit-name should refer to the barcoding kit, and the chemistry should be indicated by the basecalling model selected (in this case, e8 refers to SQK-LSK110 for flow cell r9.4.1).
In this case, would my process be correct?
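For anyone wanting to put a number on the unclassified share: 7.7G out of 7.8G is roughly 98.7% of the output by size. The sketch below computes that share for any demux output directory using only standard tools. The function name size_share is made up for this example, and it assumes the .bam files sit directly in the directory. File size is only a rough proxy for read count; running samtools view -c on each file would give actual counts.

```shell
#!/usr/bin/env bash
# Sketch: print one BAM file's share of the total size of all *.bam files in a directory.
# size_share is a hypothetical helper name; file size only approximates read count.

size_share() {
    local dir="$1" target="$2" total=0 part=0 f n
    for f in "$dir"/*.bam; do
        [ -e "$f" ] || continue                  # skip if the glob matched nothing
        n=$(wc -c < "$f" | tr -d ' ')            # size in bytes
        total=$((total + n))
        if [ "$(basename "$f")" = "$target" ]; then
            part=$n
        fi
    done
    # Report the share as a percentage with one decimal place.
    awk -v p="$part" -v t="$total" \
        'BEGIN { if (t > 0) printf "%.1f%%\n", 100 * p / t; else print "n/a" }'
}
```

For example, size_share demuxed unclassified.bam reports how much of the output in the demuxed directory ended up unclassified.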
For reference, here is my full code: