Closed tzcoolman closed 1 week ago
Hi @tzcoolman,
You're setting --device cpu in your example. Are you sure this wasn't active when you were testing on the GPU device?
Kind regards, Rich
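For reference, a GPU run would look something like the sketch below. The model name, input directory, and output file are placeholders, and exact flag names can vary between dorado versions, so check `dorado basecaller --help` for your install:

```shell
# Sketch of a dorado GPU run (placeholder model/paths; verify flags
# against your dorado version's --help output).
dorado basecaller rna_model@v5.0.0 pod5_dir/ \
    --device cuda:0 \
    --estimate-poly-a \
    > calls.bam
```

The key change from your example is `--device cuda:0` (or `auto`) instead of `--device cpu`.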
Yes. As I described in my thread, I also tried the GPU (224 cores) option but didn't get significantly better speed. How many cores do you recommend allocating?
Basecalling performance is generally improved with GPU. So I'd recommend using your V100 GPU.
Adding modification calling significantly increases the amount of compute per read on both GPU and CPU, so allocate as many resources as possible if wall time is your primary concern.
However, it's unusual to see no performance improvement with the addition of a GPU. If this is truly the case, I'd recommend investigating whether your file I/O speed is a problem and whether using scratch drives on your cluster(*) might give better performance. (*) Assuming you're running on a cluster, because you said "allocating" resources.
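A quick way to sanity-check I/O is a sequential-read benchmark on the drive holding your pod5 files, then the same test on scratch; the path below is a placeholder:

```shell
# Rough sequential-read benchmark (run on the filesystem holding your
# pod5 data, then on scratch, and compare the MB/s that dd reports).
dd if=/dev/zero of=/tmp/io_test.bin bs=1M count=64 conv=fsync 2>/dev/null
dd if=/tmp/io_test.bin of=/dev/null bs=1M   # prints throughput on stderr
rm -f /tmp/io_test.bin
```

If the scratch drive is much faster, staging the pod5 files there before basecalling may help.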
Can you also monitor your GPU utilisation to see whether it's being throttled by slow I/O, for example?
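One way to monitor this is to poll nvidia-smi (which ships with the NVIDIA driver) while dorado runs:

```shell
# Print GPU utilisation and memory use once per second while dorado runs.
# Sustained near-0% utilisation during a cuda run usually points at an
# input bottleneck such as slow disk I/O rather than the GPU itself.
nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv -l 1
```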
Can you share your logs for both runs please?
Issue Report
Unbelievably slow basecalling using dorado
Please describe the issue:
I have sequenced some samples using direct RNA-seq on a PromethION machine. I've been running dorado for basecalling, m6A calling, and poly(A) tail estimation at the same time, but this process seems to be extremely slow: I was only able to process about 2 million reads in 96 hours with 32 cores and 400 GB of memory on an AMD EPYC 7742 2.25 GHz CPU. The situation didn't get any better when I switched to a 24-core node with an NVIDIA Tesla V100 PCIe GPU. Is there any way to speed this up? I have 40 more samples to work on.
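For scale, the numbers in the report imply roughly the following average throughput (a back-of-envelope check, assuming the 2 million reads and 96 hours quoted above):

```shell
# Throughput implied by the report: ~2,000,000 reads in 96 hours.
reads=2000000
hours=96
echo "$((reads / hours)) reads/hour"   # 20833 reads/hour
```

At that rate, 40 more samples of similar depth would take months, which is why the wall-time question matters here.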
Steps to reproduce the issue:
Run environment:
Logs