Closed vetmohit89 closed 5 months ago
In your previous issue, ArtRand specified that you are not inputting the model correctly. In your command you need to specify 'rna004_130bps_sup@v3.0.1', not 'sup'. So the command should be:
dorado basecaller rna004_130bps_sup@v3.0.1 ./input_pod5_files/ --modified-bases m6A_DRACH --reference /reference_genome/ > ./test_m6a_3.bam
When I tried the above command, it gave this error:
terminate called after throwing an instance of 'std::runtime_error'
  what():  unknown simplex model rna004_130bps_sup@v3.0.1
Aborted
But when I tried
dorado basecaller sup ./input_pod5_files/ --modified-bases m6A_DRACH --reference /reference_genome/ > ./test_m6a_3.bam
the output was:
[2024-03-04 17:53:26.269] [warning] Unknown certs location for current distribution. If you hit download issues, use the envvar `SSL_CERT_FILE` to specify the location manually.
[2024-03-04 17:53:26.272] [info] - downloading rna004_130bps_sup@v3.0.1 with httplib
[2024-03-04 17:53:26.338] [error] Failed to download rna004_130bps_sup@v3.0.1: SSL server verification failed
[2024-03-04 17:53:26.338] [info] - downloading rna004_130bps_sup@v3.0.1 with curl
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 60.9M 100 60.9M 0 0 262M 0 --:--:-- --:--:-- --:--:-- 262M
[2024-03-04 17:53:27.647] [info] > Creating basecall pipeline
[2024-03-04 17:53:27.650] [info] - BAM format does not support `U`, so RNA output files will include `T` instead of `U` for all file types.
[2024-03-04 17:53:39.782] [info] - set batch size for cuda:0 to 1728
[ ] 0% [00m:00s<00m:00s]
[2024-03-04 17:54:15.898] [info] > Simplex reads basecalled: 17545
[2024-03-04 17:54:15.898] [info] > Simplex reads filtered: 46
[2024-03-04 17:54:15.898] [info] > Basecalled @ Samples/s: 1.581510e+07
[2024-03-04 17:54:15.929] [info] > Finished
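The warning at the top of this log suggests setting `SSL_CERT_FILE` when downloads fail certificate verification. A minimal sketch of doing that is below. The candidate bundle paths are assumptions (common Linux defaults, not paths documented by dorado), and `find_ca_bundle` is a hypothetical helper name; on an HPC system the right path may differ.

```shell
# Sketch: point SSL_CERT_FILE at the first CA bundle found on this machine.
# The candidate paths below are assumptions, not values taken from dorado.

find_ca_bundle() {
    # Print the first argument that is a readable regular file; fail if none is.
    for candidate in "$@"; do
        if [ -f "$candidate" ] && [ -r "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

if bundle=$(find_ca_bundle \
        /etc/ssl/certs/ca-certificates.crt \
        /etc/pki/tls/certs/ca-bundle.crt \
        /etc/ssl/cert.pem); then
    export SSL_CERT_FILE="$bundle"
    echo "SSL_CERT_FILE=$SSL_CERT_FILE"
else
    echo "No CA bundle found; set SSL_CERT_FILE manually" >&2
fi
```

With the variable exported in the same shell session, the dorado command can be rerun as before.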
Hi @vetmohit89, can you try both of the following, please?
# Try using the full auto-model complex
dorado basecaller sup,m6A_DRACH input_pod5_files/ --reference /reference_genome/ > ./test_m6a_3.ba
Downloading specific models:
# Download the rna004 model
dorado download --model rna004_130bps_sup@v3.0.1
# Download the rna004 m6A_DRACH mods model
dorado download --model rna004_130bps_sup@v3.0.1_m6A_DRACH@v1
# Call dorado with the specific model paths (note the additional -models suffix on --modified-bases-models)
dorado basecaller rna004_130bps_sup@v3.0.1/ input_pod5_files/ --modified-bases-models rna004_130bps_sup@v3.0.1_m6A_DRACH@v1 --reference /reference_genome/ > ./test_m6a_3.bam
Kind regards, Rich
Hello Rich,
The following command works for me now:
dorado basecaller rna004_130bps_sup@v3.0.1 ./pod5/test/ \
  --modified-bases m6A_DRACH \
  --reference ./reference_genome_files/reference/ > ./test/test_m6a_3.bam
I am wondering what other RNA modifications I can identify with dorado.
Thank you, Mohit
Hi @vetmohit89,
We've found the underlying issue in your original question where there were no mods in your output.
Mixing the model complex hac and the --modified-bases option was incorrectly setting modification models in the pipeline. Using either a complete model complex (sup,m6A_DRACH) or a model path with --modified-bases will work.
This will be fixed in the next release.
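Until that release is available, the problematic combination described above can be caught before a run starts. The sketch below is only an illustration: `is_mixed_model_args` is a hypothetical helper, not part of dorado, and it assumes the bare model complex is one of the keywords fast, hac, or sup.

```shell
# Sketch: flag a bare model complex combined with --modified-bases.
# is_mixed_model_args is a hypothetical helper, not a dorado feature.
is_mixed_model_args() {
    # $1 is the model argument; the remaining arguments are the other options.
    model=$1
    shift
    case "$model" in
        fast|hac|sup) ;;   # bare speed keyword (assumed set of keywords)
        *) return 1 ;;     # full complex such as sup,m6A_DRACH, or a model path
    esac
    for arg in "$@"; do
        # Exact match only, so --modified-bases-models is not flagged.
        if [ "$arg" = "--modified-bases" ]; then
            return 0
        fi
    done
    return 1
}

# Usage: warn before launching the basecaller.
if is_mixed_model_args sup ./input_pod5_files/ --modified-bases m6A_DRACH; then
    echo "warning: use 'sup,m6A_DRACH' or a model path with --modified-bases" >&2
fi
```

The function returns success only for the combination reported here; a complete model complex or a downloaded model path passes through without a warning.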
To view what modification models are available run:
dorado download --list
Kind regards, Rich
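To narrow that list down to the modification models for one basecalling model, a small filter can help. This is a sketch under two assumptions: that modification models are named "<simplex model>_<mod>@<version>", as rna004_130bps_sup@v3.0.1_m6A_DRACH@v1 is in this thread, and that the listing is plain text with one model per line. `mods_for_model` is a hypothetical helper name.

```shell
# Sketch: keep only listing lines that extend a given simplex model name.
# Assumes modification models are named "<simplex model>_<mod>@<version>".
mods_for_model() {
    # Read a model listing on stdin; print lines containing "<model>_".
    # "|| true" keeps the exit status zero when nothing matches.
    grep -F -- "${1}_" || true
}

# Usage (stdout and stderr are merged because the stream the listing is
# written to is not confirmed here):
#   dorado download --list 2>&1 | mods_for_model rna004_130bps_sup@v3.0.1
```

Fixed-string matching (`grep -F`) is used so that the dots and `@` in the model name are not treated as regular-expression characters.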
Hello, I have generated RNA004 data in fast5 format and converted it to pod5. I am using the following command to create a BAM file:
dorado basecaller sup ./input_pod5_files/ --modified-bases m6A_DRACH --reference /reference_genome/ > ./test_m6a_3.bam
But this BAM file is missing the MM/ML/MN tags. When I use this BAM file for m6A calling with modkit, I get this error:
failed to get modbase info for record fc798da5-16eb-4fa4-9449-79013e3cbde6, Skipped: AUX data not found
Earlier, I thought this issue might be with modkit, so I shared a test data sample with @ArtRand on Box (https://uab.box.com/s/g3g3nlg53jko2xqtwv89rzrnkkh2v6x2). He was able to generate the correct BAM file with MM/ML/MN tags and then run modkit on it.
I am using HPC. (Not a personal computer)
dorado --version 0.5.3+d9af343
dorado basecaller sup ./input_pod5_files/ --modified-bases m6A_DRACH --reference /reference_genome/ > ./test_m6a_3.bam
I am not sure whether I am missing something in my dorado command or whether there is another cause. Please help me troubleshoot this error. I have already created a similar issue in the modkit repository: https://github.com/nanoporetech/modkit/issues/134
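One way to confirm the missing-tag symptom described in this issue, before involving modkit, is to count how many records carry the MM and ML tags. The sketch below reads SAM text on stdin; `count_mod_tagged` is a hypothetical helper name, and the usage line assumes samtools is installed.

```shell
# Sketch: count SAM records that carry both MM and ML tags.
# Reads SAM text on stdin; header lines (starting with "@") are skipped.
count_mod_tagged() {
    awk -F '\t' '
        /^@/ { next }
        {
            total++
            has_mm = 0; has_ml = 0
            # Optional tags start at column 12 of a SAM record.
            for (i = 12; i <= NF; i++) {
                if ($i ~ /^MM:Z:/) has_mm = 1
                if ($i ~ /^ML:B:/) has_ml = 1
            }
            if (has_mm && has_ml) tagged++
        }
        END { printf "%d of %d records have MM/ML tags\n", tagged, total }
    '
}

# Usage (assumes samtools is installed):
#   samtools view ./test_m6a_3.bam | head -n 1000 | count_mod_tagged
```

If the count is zero, the basecalling step did not write modification tags, which would point at the dorado command and not at modkit.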