nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

Query on Handling Sense Reads Exclusively with dorado basecalling #769

Closed VahidJavaran closed 5 months ago

VahidJavaran commented 5 months ago

Hi Dorado Team, I am currently exploring the capabilities of the latest Dorado release for simultaneous detection of both 5mC and 6mA modifications. I aim to process these modifications together and have crafted the following command line for this purpose:

./dorado basecaller dna_r10.4.1_e8.2_400bps_sup@v4.3.0 ../pod5 --verbose --emit-moves --min-qscore 10 --trim all --modified-bases-models ../dna_r10.4.1_e8.2_400bps_sup@v4.3.0_6mA@v2,../dna_r10.4.1_e8.2_400bps_sup@v4.3.0_5mCG_5hmCG@v1 --modified-bases 5mC_5hmC 6mA --kit-name SQK-NBD114-96 --reference ref.fa.gz -Y > 6mA_5mC_5hmC.bam

Although I have supplied a reference genome so that reads are aligned during basecalling, my objective is to align only sense reads and exclude complement ones. Could you advise how I might adjust my command for this requirement? Alternatively, is there a post-processing step you would recommend for filtering complement reads out of the modBAM file? I searched and found this command:

samtools view -b -F 0x10 ../.bam > .../_sense_only.bam

However, I would prefer to apply this filter during the basecalling step.
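For readers unfamiliar with the flag in that samtools command: `-F` excludes every record whose SAM FLAG has any of the given bits set, and 0x10 is the bit meaning "aligned to the reverse strand". Below is a minimal sketch of that test in plain shell arithmetic. The function name `passes_exclude_mask` is hypothetical and exists only for illustration; it is not part of samtools or dorado.

```shell
# Hypothetical helper mirroring the logic of `samtools view -F MASK`:
# a record is kept only if none of the MASK bits are set in its FLAG.
passes_exclude_mask() {
    flag=$1
    mask=$2
    if [ $(( flag & mask )) -eq 0 ]; then
        echo keep
    else
        echo drop
    fi
}

passes_exclude_mask 0 0x10    # forward-strand alignment -> keep
passes_exclude_mask 16 0x10   # reverse-strand alignment -> drop
passes_exclude_mask 4 0x10    # unmapped read (0x10 not set) -> keep
```

The last call shows a side effect worth knowing: unmapped reads (bit 0x4) do not carry the reverse-strand bit, so they pass a `-F 0x10` filter. If they should be dropped as well, a combined mask such as `-F 0x14` is one option.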

I appreciate any guidance or suggestions you could offer on this matter.

tijyojwad commented 5 months ago

Hi, you can pipe the output of dorado basecaller into samtools and filter there:

./dorado basecaller dna_r10.4.1_e8.2_400bps_sup@v4.3.0 ../pod5 --verbose --emit-moves --min-qscore 10 --trim all --modified-bases-models ../dna_r10.4.1_e8.2_400bps_sup@v4.3.0_6mA@v2,../dna_r10.4.1_e8.2_400bps_sup@v4.3.0_5mCG_5hmCG@v1 --modified-bases 5mC_5hmC 6mA --kit-name SQK-NBD114-96 --reference ref.fa.gz -Y | samtools view -b -F 0x10 > .../_sense_only.bam
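One way to sanity-check the result of such a pipe is to tally strands in the filtered file. The sketch below assumes header-less SAM text on stdin (what `samtools view` prints by default) and reads the FLAG from the second tab-separated column. The helper name `count_strands` is hypothetical, and the toy records are made up purely to show the behaviour.

```shell
# Hypothetical helper: tally forward, reverse and unmapped records from
# SAM text on stdin. FLAG is the second tab-separated column.
count_strands() {
    awk -F '\t' '
        /^@/ { next }                              # skip header lines if present
        int($2 / 4) % 2 == 1  { unm++; next }      # bit 0x4: read is unmapped
        int($2 / 16) % 2 == 1 { rev++; next }      # bit 0x10: reverse strand
        { fwd++ }                                  # everything else: forward
        END { printf "forward=%d reverse=%d unmapped=%d\n", fwd, rev, unm }
    '
}

# Toy records (only the first two columns matter here).
printf 'r1\t0\nr2\t16\nr3\t4\nr4\t0\n' | count_strands
# prints: forward=2 reverse=1 unmapped=1
```

Running `samtools view` on the filtered BAM and piping it into `count_strands` should then report `reverse=0` if the `-F 0x10` filter did its job. A non-zero `unmapped` count would indicate that unaligned reads passed through the filter as well.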

VahidJavaran commented 5 months ago

Thanks for your quick answer, @tijyojwad!