nanoporetech / dorado

Oxford Nanopore's Basecaller
https://nanoporetech.com/

No supported chemistry found for flowcell_code: 'FLO-MIN106' sequencing_kit: 'SQK-RNA002' #821

Closed dar19 closed 4 months ago

dar19 commented 4 months ago

Issue Report

No supported chemistry found for flowcell_code: 'FLO-MIN106' sequencing_kit: 'SQK-RNA002'

Please describe the issue:

I am trying to run dorado on older direct RNA-seq libraries but get a "chemistry not found" error, even though the dorado documentation says RNA002 is recognized automatically: "Adapters for RNA002 and RNA004 kits are automatically trimmed during basecalling. However, unlike in DNA, the RNA adapter cannot be trimmed post-basecalling."

These libraries were prepared with the "SQK-RNA002" kit on a "FLO-MIN106" flow cell. The same pipeline below worked with kit SQK-RNA004 and flow cell FLO-PRO004RA.

What am I missing here?

Logs that I get are:

[2024-05-20 22:28:22.644] [info] Running: "basecaller" "--device" "cuda:all" "--recursive" "fast" "./samples_pod5/20200911-samplename_polyA_5P_1/pod5files/"
[2024-05-20 22:28:22.832] [error] No supported chemistry found for flowcell_code: 'FLO-MIN106' sequencing_kit: 'SQK-RNA002' sample_rate: 3012
[2024-05-20 22:28:22.832] [error] This is typically seen when using prototype kits. Please download an appropriate model for your data and select it by model path
[2024-05-20 22:28:22.833] [error] Could not resolve chemistry from data: Unknown chemistry

Steps to reproduce the issue:

Step 1: Convert the fast5 files to pod5 files:

pod5 convert fast5 "$filepath" \
    --threads $threads \
    --output "$pod5_opath/$filepath.pod5"

[The above step worked fine]

Step 2: Use the pod5 files created above to basecall for polyA tail estimation:

dorado basecaller \
    --device cuda:all \
    --estimate-poly-a \
    --recursive \
    fast \
    $pod5_path/ > reads.w_polyA.bam

[This step outputs the error shown in the logs above]

Run environment:

dorado --version
[2024-05-20 22:40:23.496] [info] Running: "--version"
0.6.0+7a6ab9a

malton-ont commented 4 months ago

Hi @dar19,

The issue appears to be that your data was recorded at 3012 Hz, whereas dorado expects a sampling rate of 3000 Hz for that model. If I recall correctly, this comes down to a difference between MinION and GridION/PromethION sampling frequencies for RNA002 data.
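For anyone who wants to check whether their own log shows the same mismatch, here is a rough sketch, not part of dorado. It reads a log on stdin, pulls the `sample_rate` value out of the "No supported chemistry" line, and compares it with 3000, the expected rate mentioned in this comment. The function names and the `expected_rate` variable are made up for illustration.

```shell
# Expected sampling rate for the rna002 models, as stated in this thread.
expected_rate=3000

# Print the first sample_rate value found in the log text on stdin.
extract_sample_rate() {
    sed -n 's/.*sample_rate: \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Compare the logged sample rate with the expected one.
# Returns 0 on match, 1 on mismatch, 2 if no sample_rate was found.
check_sample_rate() {
    rate=$(extract_sample_rate)
    if [ -z "$rate" ]; then
        echo "no sample_rate found"
        return 2
    fi
    if [ "$rate" -ne "$expected_rate" ]; then
        echo "sample_rate $rate differs from expected $expected_rate"
        return 1
    fi
    echo "sample_rate $rate matches"
}
```

Piping the saved dorado log into `check_sample_rate` would then report whether the recorded rate is the 3012 Hz variant.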

We'll look into addressing this incompatibility in a future release, but in the meantime you can specify the correct model manually (rna002_70bps_fast@v3).
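As a hedged sketch of what that could look like for the command in the report: download the model, then pass its path in place of `fast`. The `dorado download` and `dorado basecaller` flags are the ones that appear in this thread; `model_dir`, `pod5_path`, the `run` helper and the `DRY_RUN` switch are assumptions added for illustration. By default the functions only print the commands.

```shell
#!/usr/bin/env bash
# Sketch of the manual-model workaround. model_dir and pod5_path are
# placeholder locations (assumptions); adjust them to your setup.
model="rna002_70bps_fast@v3"
model_dir="${model_dir:-./models}"
pod5_path="${pod5_path:-./pod5}"

# Print the command by default; set DRY_RUN=0 to actually execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Download the model named in this thread into model_dir.
download_model() {
    run dorado download --model "$model" --directory "$model_dir"
}

# Basecall with the explicit model path instead of the "fast" shortcut.
basecall() {
    run dorado basecaller \
        --device cuda:all \
        --estimate-poly-a \
        --recursive \
        "$model_dir/$model" \
        "$pod5_path/"
}
```

With `DRY_RUN=0`, the output of `basecall` would be redirected to a BAM file, as in the original command.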

dar19 commented 4 months ago

Thank you. rna002_70bps_fast@v3 solved the issue.

AteeshaNegi commented 3 months ago

Hello, I encountered an issue while attempting to use Dorado for basecalling. As suggested, I used the rna002_70bps_fast@v3 model.

Here is the command I used:

dorado basecaller rna002_70bps_fast@v3 ~/Documents/Nano_Seq/test/pod5/ > Arg_reads.bam

Error Message:

[2024-07-02 19:49:27.471] [info] Running: "basecaller" "rna002_70bps_fast@v3" "/home/hou_lab/Documents/Nano_Seq/test/pod5/"
terminate called after throwing an instance of 'std::runtime_error'
what(): toml::parse: file open error -> rna002_70bps_fast@v3/config.toml"

How do I resolve this issue? Thanks

malton-ont commented 3 months ago

@AteeshaNegi,

You need to download the model and specify the path to it. If you give only the model name, the model directory must be in the current working directory.

dorado download --model rna002_70bps_fast@v3 --directory <download-directory>
dorado basecaller <download-directory>/rna002_70bps_fast@v3 ~/Documents/Nano_Seq/test/pod5/ > Arg_reads.bam
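
The error above came from dorado failing to open `rna002_70bps_fast@v3/config.toml`, so a quick pre-flight check for that file can catch a missing or misplaced model before basecalling starts. This is only a sketch: `model_dir_ok` and `resolve_model` are hypothetical helper names, and the only assumption about dorado is the one visible in the error message, that a model directory contains a `config.toml`.

```shell
# Return 0 if the directory contains the config.toml that the
# error message in this thread refers to.
model_dir_ok() {
    [ -f "$1/config.toml" ]
}

# Print the model path if it looks usable; otherwise explain what to do.
resolve_model() {
    if model_dir_ok "$1"; then
        printf '%s\n' "$1"
    else
        echo "model not found at '$1': download it first with 'dorado download'" >&2
        return 1
    fi
}
```

The path printed by `resolve_model` is what would be passed to `dorado basecaller` as the model argument.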

AteeshaNegi commented 3 months ago

Thank you, that resolved the issue.