nanoporetech / dorado

Oxford Nanopore's Basecaller

No supported chemistry found for flowcell_code: 'FLO-MIN106' sequencing_kit: 'SQK-RNA002' #821

Closed dar19 closed 4 months ago

dar19 commented 4 months ago

Issue Report

No supported chemistry found for flowcell_code: 'FLO-MIN106' sequencing_kit: 'SQK-RNA002'

Please describe the issue:

I am trying to run dorado on older direct RNA-seq libraries, but it fails with a "No supported chemistry found" error, even though the dorado documentation indicates that RNA002 is recognized: "Adapters for RNA002 and RNA004 kits are automatically trimmed during basecalling. However, unlike in DNA, the RNA adapter cannot be trimmed post-basecalling."

These libraries were prepared with the "SQK-RNA002" kit and sequenced on "FLO-MIN106" flow cells. The same pipeline shown below worked with kit SQK-RNA004 and flow cell FLO-PRO004RA.

What am I missing here ?

Logs that I get are:

[2024-05-20 22:28:22.644] [info] Running: "basecaller" "--device" "cuda:all" "--recursive" "fast" "./samples_pod5/20200911-samplename_polyA_5P_1/pod5files/"
[2024-05-20 22:28:22.832] [error] No supported chemistry found for flowcell_code: 'FLO-MIN106' sequencing_kit: 'SQK-RNA002' sample_rate: 3012
[2024-05-20 22:28:22.832] [error] This is typically seen when using prototype kits. Please download an appropriate model for your data and select it by model path
[2024-05-20 22:28:22.833] [error] Could not resolve chemistry from data: Unknown chemistry

Steps to reproduce the issue:

Step 1: Convert the fast5 files to pod5 files:

pod5 convert fast5 "$filepath" \
    --threads $threads \
    --output "$pod5_opath/$filepath.pod5"

[The above step worked fine]

Step 2: Use the pod5 files created above to basecall with poly(A) tail estimation:

dorado basecaller \
    --device cuda:all \
    --estimate-poly-a \
    --recursive \
    fast \
    $pod5_path/ > reads.w_polyA.bam

[This step fails with the error shown in the logs above]

Run environment:

dorado --version
[2024-05-20 22:40:23.496] [info] Running: "--version"
0.6.0+7a6ab9a

malton-ont commented 4 months ago

Hi @dar19,

The issue here appears to be that your data was recorded at 3012 Hz, but dorado expects data for that model to have a sampling rate of 3000 Hz. If I recall correctly, this comes down to a difference between MinION and GridION/PromethION sampling frequencies for RNA002 data.

We'll look into addressing this incompatibility in a future release, but in the meantime you can specify the correct model manually (rna002_70bps_fast@v3).

dar19 commented 4 months ago

Thank you. rna002_70bps_fast@v3 solved the issue.

AteeshaNegi commented 3 months ago

Hello, I encountered an issue while attempting to use dorado for basecalling. As suggested, I used the rna002_70bps_fast@v3 model.

Here is the command I used:

dorado basecaller rna002_70bps_fast@v3 ~/Documents/Nano_Seq/test/pod5/ > Arg_reads.bam

Error Message:

[2024-07-02 19:49:27.471] [info] Running: "basecaller" "rna002_70bps_fast@v3" "/home/hou_lab/Documents/Nano_Seq/test/pod5/"
terminate called after throwing an instance of 'std::runtime_error'
  what(): toml::parse: file open error -> rna002_70bps_fast@v3/config.toml

How do I resolve this issue? Thanks

malton-ont commented 3 months ago


You need to download the model and specify the path to it. If you give only the model name, the model directory needs to be in the current working directory.

dorado download --model rna002_70bps_fast@v3 --directory <download-directory>
dorado basecaller <download-directory>/rna002_70bps_fast@v3 ~/Documents/Nano_Seq/test/pod5/ > Arg_reads.bam
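
The `config.toml` error above suggests a simple pre-flight check before basecalling. The following is a minimal sketch rather than anything dorado provides: `model_dir_ok` and `basecall_with_model` are hypothetical helper names, the paths in the example call are placeholders, and the only dorado invocations are the two commands shown above.

```shell
# Hypothetical helper, not part of dorado: succeeds only when the model
# directory exists and contains the config.toml that dorado tries to open.
model_dir_ok() {
    [ -d "$1" ] && [ -f "$1/config.toml" ]
}

# Download the model only if it is missing, then basecall using its path.
# Arguments: model name, download directory, pod5 directory, output BAM.
basecall_with_model() {
    local model="$1" models_root="$2" pod5_dir="$3" out_bam="$4"
    if ! model_dir_ok "$models_root/$model"; then
        dorado download --model "$model" --directory "$models_root" || return 1
    fi
    dorado basecaller "$models_root/$model" "$pod5_dir" > "$out_bam"
}

# Example call (placeholder paths):
# basecall_with_model rna002_70bps_fast@v3 ./models ./pod5 reads.bam
```

Passing the full `<download-directory>/<model-name>` path means the command no longer depends on which directory it is run from.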

AteeshaNegi commented 3 months ago

Thank you, that resolved the issue.