nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

Trimming barcodes behaviour with demux --no-trim #835

Closed francesco227 closed 3 weeks ago

francesco227 commented 4 months ago

Issue Report

No trim behaviour using Dorado Demux

Please describe the issue:

Hi! I'm using dorado demux to demultiplex my reads, which were prepared with the SQK-NBD114-24 kit and basecalled with dorado. When I use the --no-trim option (either together with --no-classify or with the name of the kit), I would expect the barcodes to still be present in the reads. After demultiplexing I get a .bam file, which I convert to FASTQ with samtools fastq. The problem is that the reads are exactly the same whether or not I use --no-trim, and I never find the barcode in them.

Thanks a lot!!!! :-) Francesco

Run environment:

tijyojwad commented 4 months ago

Hi, can you share your basecalling command?

francesco227 commented 4 months ago

~/miniconda3/bin/dorado-0.5.3-linux-x64/bin/dorado basecaller hac Yeast-ITR_3-pod5/ --kit-name SQK-NBD114-24 --barcode-both-ends > calls_hac.bam

(The basecalling and the demultiplexing were done on different computers.) In this case the dorado version is 0.5.3 and the Ubuntu version is 20.04.

Thanks!!! Maybe there is something I'm missing :-)

tijyojwad commented 4 months ago

Hi @francesco227 - since you specified barcoding (--kit-name) during basecalling, the barcodes are trimmed at that stage by default (as mentioned here), so dorado demux --no-trim has nothing left to keep. However, each read should already be classified.

You can run dorado demux --no-classify calls_hac_single_end.bam --output-dir classified_demux_single_bc to split the basecalled BAM into per-barcode BAMs.
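
To confirm that the reads in the basecalled BAM already carry a classification before splitting, a small helper along these lines may be useful. It is only a sketch: it assumes the classification is stored in the BC:Z: tag of each record (which is how the dorado documentation describes it, as far as I understand), and count_bc_tags is a hypothetical name, not part of dorado or samtools.

```shell
# Count reads per barcode from SAM-formatted text on stdin.
# Assumption: the barcode classification lives in a BC:Z: tag.
count_bc_tags() {
  awk -F '\t' '
    /^@/ { next }                      # skip SAM header lines
    {
      bc = "none"                      # label for reads without a BC tag
      for (i = 12; i <= NF; i++)       # optional tags start at column 12
        if ($i ~ /^BC:Z:/) bc = substr($i, 6)
      counts[bc]++
    }
    END { for (b in counts) print b "\t" counts[b] }
  ' | LC_ALL=C sort
}

# Usage (assumes samtools is on PATH):
# samtools view calls_hac.bam | count_bc_tags
```

If most reads show up under a barcode name rather than "none", the BAM was classified during basecalling and --no-classify is the right mode for demux.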

If you want to do barcode classification after basecalling, you can remove the --kit-name option from the dorado basecaller command.