Closed by yul96 6 months ago
Is it possible that you are using some kind of modified base in your IVT? Some modified bases are often miscalled as C's.
@VBHerrenC I think you are right. I spoke to the team, and they did use modified bases. Thanks!
Thanks for contributing to this issue @VBHerrenC.
Issue Report
Please describe the issue:
We found a lot of mutations in RNA reads basecalled with Dorado 0.5.3. The RNA is the product of in vitro transcription, and the reads are mapped back to the template to evaluate mutations. Many of the mutations are non-random. Is this expected for the current Dorado basecaller?
We ran six independent samples, and these mutations are not expected.
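For anyone checking a similar dataset, one way to see whether the mismatches are non-random is to tally them by substitution type (template base, called base). The sketch below is only an illustration, not part of Dorado or any alignment tool: `tally_substitutions` and `fraction_called_as` are hypothetical helper names, the reads are assumed to be already aligned to the template without gaps, and the sequences are toy data. If one substitution type dominates, for example most mismatches being called as C, that points to a systematic cause such as a modified base rather than random error.

```python
from collections import Counter


def tally_substitutions(reference, aligned_reads):
    """Count (template_base, called_base) mismatches.

    Assumption for this sketch: each read is a (start, sequence) pair that is
    already aligned to the template without gaps, where `start` is the
    0-based template position of the read's first base. Indels are ignored.
    """
    counts = Counter()
    for start, seq in aligned_reads:
        for offset, called in enumerate(seq):
            pos = start + offset
            if pos >= len(reference):
                break  # read runs past the end of the template
            ref_base = reference[pos]
            if called != ref_base and called != "N":
                counts[(ref_base, called)] += 1
    return counts


def fraction_called_as(counts, called_base):
    """Fraction of all mismatches in which the called base is `called_base`."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    hits = sum(n for (_, called), n in counts.items() if called == called_base)
    return hits / total


if __name__ == "__main__":
    # Toy template and reads, invented for illustration only.
    template = "ACGTACGTAC"
    reads = [(0, "ACGCACGCAC"), (2, "GCACGCAC"), (4, "ACGTAG")]
    counts = tally_substitutions(template, reads)
    for (ref_base, called), n in counts.most_common():
        print(f"{ref_base}->{called}: {n}")
    print("fraction of mismatches called as C:", fraction_called_as(counts, "C"))
```

With real data, the aligned bases would come from the SAM/BAM alignments rather than from hand-written strings, and insertions and deletions would need separate handling.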
Steps to reproduce the issue:
1. mRNA is produced by in vitro T7 transcription.
2. Direct RNA sequencing is used to generate the raw data according to the manual (SQK-RNA004).
3. Dorado 0.5.3 is used for basecalling, with the command below:
...../dorado-0.5.3-linux-x64/bin/dorado basecaller --estimate-poly-a --verbose ...../dorado-0.5.3-linux-x64/bin/rna004_130bps_sup@v3.0.1 pod5 > dorado.calls.sam
Run environment:
Logs