Please describe the issue: I was using Tombo for our bacterial data before it was deprecated. It was suggested that I use dorado from now on, so I did. The problem is that with Tombo I had far more C sites covered than with dorado. My reference contains 100,658 Cs, but the .bed file I generated from my .bam file with modbam2bed covers only 16,447 Cs. Since I am analyzing cytosine methylation, this is quite an issue for me. Could someone tell me whether I am doing something wrong, or whether the sequencing simply went badly (I am not a lab scientist)? Many Cs that were covered with Tombo are no longer covered with dorado. In addition, my .bed file contains 5mC positions where no C occurs in the reference.
Please provide a clear and concise description of the issue you are seeing and the result you expect.
Steps to reproduce the issue: Below is the code I used to generate the files. I used this dorado call :
I will also add the modbam2bed script; maybe it helps to understand my problem:
for mod_type in 5mC 6mA; do
./modbam2bed -e -m "$mod_type" -t 5 "$reference_file" "$dir" > "/gpfs/project/azlan/MycoR10/${Name}${mod_type}.bed"
done
I used the same reference file as for dorado: SP10291_R10IIPC_2n.fasta.gz
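To make the two numbers above easier to compare, here is a minimal sketch that counts cytosines in the reference on both strands and counts the distinct sites reported in a .bed file per strand. The function names `count_ref_cytosines` and `count_bed_sites` are hypothetical helpers, not part of dorado or modbam2bed. The sketch assumes the .bed file follows the standard BED layout (0-based start in column 2, strand in column 6). One thing that might be worth checking with it, as an assumption rather than a confirmed cause: a minus-strand record sits where the forward reference shows a G, which could look like a 5mC position without a C.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of any tool.

# Count C (cytosine on the forward strand) and G (cytosine on the reverse
# strand) in a FASTA read from stdin, skipping header lines.
count_ref_cytosines() {
  awk '!/^>/ { s = toupper($0); c += gsub(/C/, "", s); g += gsub(/G/, "", s) }
       END   { printf "forward_C\t%d\nreverse_C\t%d\n", c, g }'
}

# Count distinct (chrom, start, strand) records per strand in a BED file
# read from stdin. Assumes strand is in column 6, as in standard BED.
count_bed_sites() {
  awk '!seen[$1, $2, $6]++ { n[$6]++ }
       END { printf "plus\t%d\nminus\t%d\n", n["+"], n["-"] }'
}

# Example usage (the .bed name is a placeholder):
#   gzip -dc SP10291_R10IIPC_2n.fasta.gz | count_ref_cytosines
#   count_bed_sites < sample_5mC.bed
```

If the reference count of 100,658 includes only forward-strand Cs, the per-strand totals from the .bed file show how many of the 16,447 covered sites fall on each strand.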
Source data type (e.g., pod5 or fast5 - please note we always recommend converting to pod5 for optimal basecalling performance): I can't upload them; they are too big.
Source data location (on device or networked drive - NFS, etc.):
Details about data (flow cell, kit, read lengths, number of reads, total dataset size in MB/GB/TB):
Dataset to reproduce, if applicable (small subset of data to share as a pod5 to reproduce the issue):
Logs
Please provide output trace of dorado (run dorado with -v, or -vv on a small subset)
Please list any steps to reproduce the issue.
Run environment: