nanoporetech / rerio

Research release basecalling models and configurations
https://nanoporetech.com/

4mC_5mC dorado model with 5khz data #61

Closed samuelmontgomery closed 6 months ago

samuelmontgomery commented 7 months ago

Hi,

I am just checking, as some of the versions confuse me a little. Can the res_dna_r10.4.1_e8.2_400bps_sup@v4.3.0_4mC_5mC@v1 model be used to basecall data generated at 5kHz? The table suggests it should be used for 4kHz data, but unless I am mistaken, the v4.3.0 models are the latest canonical models for Dorado, introduced in v0.5.0.

Thanks,

Samuel

lxd98 commented 7 months ago

Hi,

I ran into the same confusion. I noticed that there is an updated Dorado model (res_dna_r10.4.1_e8.2_400bps_sup@v4.3.0_4mC_5mC@v1) for calling 4mC_5mC methylation, which is described as being for 4kHz data. However, when I tried running it on my 4kHz reads, I got an error. Could you please advise how to run this model successfully, or which basecalling model I should use instead? Thank you!

# The command I used.
dorado basecaller \
    --reference ref/test.fa \
    --modified-bases-models models/4hz_methy/res_dna_r10.4.1_e8.2_400bps_sup@v4.3.0_4mC_5mC@v1 \
    dorado_models/dna_r10.4.1_e8.2_400bps_sup@v4.3.0 \
    1_data/pod5 > 2_methy/test_4mC_5mC.bam
# The error I got.
[error] Sample rate for model (5000) and data (4000) are not compatible.

Best regards, Raymond

marcus1487 commented 7 months ago

This is indeed a typo in the table of the README. I will fix this soon. This model is compatible with 5kHz data.
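
For anyone landing here with the same error, a minimal sketch of a pre-flight check that might help confirm which sample rate a dataset was acquired at before picking a model. It assumes the `pod5` Python package is installed and that each read exposes `run_info.sample_rate`; the helper names `sample_rates` and `is_compatible` are hypothetical, and the strict-equality rule is only inferred from the error message above, not taken from Dorado's source.

```python
from pathlib import Path


def is_compatible(model_rate_hz, data_rates_hz):
    """Return True only if every read was sampled at the model's rate.

    Assumption: Dorado requires the model and data sample rates to match
    exactly, as the error message in this thread suggests.
    """
    return set(data_rates_hz) == {model_rate_hz}


def sample_rates(pod5_dir):
    """Collect the distinct sample rates (Hz) found in a directory of pod5 files.

    Assumption: the `pod5` package is installed; it is imported lazily so
    that `is_compatible` can be used without it.
    """
    import pod5

    rates = set()
    for path in sorted(Path(pod5_dir).glob("*.pod5")):
        with pod5.Reader(path) as reader:
            for read in reader.reads():
                rates.add(read.run_info.sample_rate)
    return rates


# Example (paths taken from the command earlier in the thread):
#   rates = sample_rates("1_data/pod5")
#   is_compatible(5000, rates)   # the 4mC_5mC model above expects 5 kHz data
```

If the check fails for 4kHz data, this model is not the right choice for that dataset; the thread does not say which model should be used instead.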

samuelmontgomery commented 6 months ago

Thanks - works a treat!