nanoporetech / medaka

Sequence correction provided by ONT Research
https://nanoporetech.com

Medaka running slow - only uses 1% of CPU #452

SidselC closed 10 months ago

SidselC commented 10 months ago

Hi, I am running medaka 1.9.0 on a Flye assembly of a fungal genome with ~5 GB of data, but it is running super slow at the following stage:

    ...
    [16:24:19 - PWorker] Batches in cache: 8.
    [16:24:19 - Sampler] Initializing sampler for consensus of region contig_40:1743051-1745577.
    [16:24:19 - Feature] Processed contig_40:1743051.0-1745576.0 (median depth 1.0)
    [16:24:19 - Sampler] Took 0.00s to make features.
    [16:24:23 - PWorker] 17.1% Done (0.9/5.2 Mbases) in 3410.9s
    [16:24:23 - Sampler] Initializing sampler for consensus of region contig_40:1748021-1754580.
    [16:24:23 - Feature] Processed contig_40:1748021.0-1754579.0 (median depth 1.0)
    [16:24:23 - Sampler] Took 0.00s to make features.
    [16:24:38 - PWorker] Batches in cache: 8.
    [16:24:38 - PWorker] 17.2% Done (0.9/5.2 Mbases) in 3425.4s
    [16:24:38 - Sampler] Initializing sampler for consensus of region contig_40:1765253-1767126.
    [16:24:38 - Feature] Processed contig_40:1765253.0-1767125.0 (median depth 1.0)
    [16:24:38 - Sampler] Took 0.00s to make features.
    [16:24:47 - PWorker] Batches in cache: 8.
    [16:24:47 - Sampler] Initializing sampler for consensus of region contig_40:1771744-1771978.
    [16:24:47 - Feature] Processed contig_40:1771744.0-1771977.0 (median depth 1.0)
    [16:24:47 - Sampler] Took 0.00s to make features.
    ...

When I check the server performance, medaka is only using about 1% of the CPU.

Is there any way of improving the speed, or is this to be expected for this amount of data?
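To put a number on "slow", here is a minimal Python sketch that reads the `PWorker` progress lines from the log above and extrapolates the remaining run time. The function name `estimate_remaining` and the regular expression are my own illustration, not part of medaka, and the estimate assumes the rest of the run proceeds at the same average pace as the logged portion.

```python
import re

# Matches medaka progress lines of the form seen in the log, e.g.
# [16:24:38 - PWorker] 17.2% Done (0.9/5.2 Mbases) in 3425.4s
PROGRESS = re.compile(
    r"PWorker\]\s+(?P<pct>[\d.]+)% Done "
    r"\((?P<done>[\d.]+)/(?P<total>[\d.]+) Mbases\) in (?P<secs>[\d.]+)s"
)


def estimate_remaining(log_text):
    """Return (bases_per_second, seconds_remaining) from the last progress
    line in log_text, or None if no usable progress line is found.

    Hypothetical helper: assumes a constant average pace for the whole run.
    """
    matches = list(PROGRESS.finditer(log_text))
    if not matches:
        return None
    last = matches[-1]
    pct = float(last["pct"])
    done_mbases = float(last["done"])
    elapsed = float(last["secs"])
    if pct <= 0 or elapsed <= 0:
        return None
    rate = done_mbases * 1e6 / elapsed      # bases processed per second so far
    total_time = elapsed * 100 / pct        # projected run time at this pace
    return rate, total_time - elapsed


if __name__ == "__main__":
    import sys

    result = estimate_remaining(sys.stdin.read())
    if result is None:
        print("no progress lines found")
    else:
        rate, remaining = result
        print(f"{rate:.0f} bases/s, about {remaining / 3600:.1f} h remaining")
```

Saved under a name of your choice and fed the medaka log on standard input, the last progress line above (17.2% after 3425.4 s) extrapolates to roughly 4.6 more hours for 5.2 Mbases.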

This is my command:

    medaka_consensus -i PAO33660_pass_barcode04_c506346d6480e25c*.fastq.gz -d assembly.fasta -o medaka_consensus -t 64 -m r1041_e82_400bps_sup_v4.2.0

Environment (if you do not have a GPU, write No GPU):

Thanks!