Closed · xiangpingyu closed this issue 2 months ago
Hi @xiangpingyu,

Could you rerun the `m6A.bam` basecalling command with `-vv` and share the output?

Kind regards, Rich
@xiangpingyu the ETA for this job is just over 1 hour. Are you saying it's making no progress?
@HalfPhoton
I am using pod5 files. The following output, run with `-vv`, is for your reference.
After running for ~5 hours, only 1% has been processed.
$ dorado basecaller sup@v4.3.0,6mA@v2 ./ -x "cuda:all" --no-trim -r --kit-name SQK-NBD114-24 > 6ma_calls.bam
@iiSeymour this is my first time running dorado on this system, so I'm not sure how long it will take to finish. After about five hours, the task has only reached ~1%, based on the progress information above.
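As a rough sanity check (simple arithmetic, assuming the progress rate stays constant), ~5 hours for 1% extrapolates to a much longer total runtime than the reported 1-hour ETA:

```shell
# Naive linear extrapolation: elapsed time scaled by 100 / percent done.
elapsed_hours=5
percent_done=1
total_hours=$((elapsed_hours * 100 / percent_done))
echo "estimated total runtime: ${total_hours} hours (~$((total_hours / 24)) days)"
# → estimated total runtime: 500 hours (~20 days)
```

This is only a back-of-the-envelope estimate; actual throughput varies with read length and GPU utilization.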
Hi @xiangpingyu, can you run `nvidia-smi` and report what the GPU utilization numbers look like? You can also install the `nvtop` utility, which shows a utilization graph over time and can help us understand how well basecalling is utilizing the GPU.
Also, what is the expected read length distribution of your data?
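To capture the numbers asked for above, a minimal sketch using standard `nvidia-smi` query fields (the guard keeps the script usable on machines without an NVIDIA driver; the log file name is a placeholder):

```shell
# One snapshot of GPU utilization and memory; append "-l 5" to the
# nvidia-smi command to repeat the query every 5 seconds while dorado runs.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=timestamp,utilization.gpu,memory.used,memory.total \
    --format=csv | tee gpu_util.log
else
  echo "nvidia-smi not found on this machine"
fi
```

Low or spiky `utilization.gpu` values during basecalling usually point to an input or CPU bottleneck rather than the GPU itself.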
@tijyojwad the following is the output of `nvidia-smi` and the GPU performance. The expected read length is in the range of 2.5 kb to 6 kb.
@xiangpingyu Did it take 2 weeks for the basecalling to complete?
@xiangpingyu Do you have any updates on this issue?
Hi, I am also experiencing very slow performance when using the Dorado basecaller from the command line with the modified-bases models (see the command in the picture below):
The BAM file is 17100 MB at the moment, with only ~5% of the data basecalled after more than one day. The total dataset is 2.2 TB, obtained with a P2 Solo.
Please also find a screenshot of the `nvidia-smi` output.
Any suggestions to speed up the process? When I do the basecalling from MinKNOW it is much faster, but then I can only choose one model for the modified bases, and in this case I need to detect 6mA, 4mC, and 5mC all together.
Many thanks in advance! Axel
The v5 sup models are under active development to improve basecalling performance. They use a new architecture that has not yet been fully optimised. Combining this with detecting three mods is a heavy computational load, so basecalling performance is slow.
MinKNOW is much quicker because it uses the v4.3 models, which are based on the older architecture that has had a large amount of performance work put into it.
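If v5 speed is the bottleneck, one option (a sketch, not an official recommendation) is to pin the older, optimised v4.3 simplex model explicitly, as in the command earlier in this thread. Whether every modification model you need is available for v4.3 is an assumption to verify with `dorado download --list`; `./pod5_dir` and `calls.bam` are placeholder names:

```shell
# Sketch: pin the optimised v4.3 simplex model instead of letting "sup"
# resolve to a v5 model. Check mod-model availability with
# "dorado download --list"; ./pod5_dir and calls.bam are placeholders.
if command -v dorado >/dev/null 2>&1; then
  dorado basecaller sup@v4.3.0,6mA@v2 ./pod5_dir -x "cuda:all" > calls.bam
else
  echo "dorado not found on this machine"
fi
```

The trade-off is that v4.3 may not cover all three modifications at once, in which case waiting for v5 optimisation (or running separate passes) may be the only route.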
Dear developers,
The dataset is about 80.0 GB and is stored on a local disk (while running, GPU memory usage is 34.3/55.8 GB). The CPU and GPU details are:
CPU: 13th Gen Intel® Core i9-13900K, cores=24, logical processors=32
GPU: NVIDIA GeForce RTX 4090, CUDA cores=16384
This is the command I am using:
dorado basecaller sup,6mA --no-trim --recursive ./ > m6A.bam
I am unsure when the process will complete. Is there anything I need to modify?
Thank you!