theiagen / public_health_bioinformatics

Bioinformatics workflows for genomic characterization, submission preparation, and genomic epidemiology of pathogens of public health concern.
GNU General Public License v3.0

Check RAM Midas uses #488

Closed andrewjpage closed 3 months ago

andrewjpage commented 4 months ago

:bug:

:pencil: Describe the Issue

The midas task requests a large amount of RAM (32GB). This is an expensive VM configuration, particularly now that it's run automatically from TheiaProk.

Run the task and measure how much RAM it normally uses (/usr/bin/time -v cmd). Does it really need 32GB? Does it require a 100GB local disk? How much processing time is used, and what is the CPU utilisation? If it's not making use of all 4 CPUs, adjust the task to use fewer. If it is using all 4, consider bumping it to 8 to reduce the amount of time we are using 32GB RAM. Is this task IO bound?
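As a rough alternative to wrapping the command in /usr/bin/time -v, the same three numbers (wall time, CPU time, peak memory) can be collected from Python. This is only a sketch: the `measure` helper is a made-up name, it relies on the Unix-only `resource` module, and it assumes Linux, where `ru_maxrss` is reported in kilobytes.

```python
import resource
import subprocess
import sys
import time


def measure(cmd):
    """Run cmd and return (wall_seconds, cpu_seconds, peak_rss_kb).

    Hypothetical helper for illustration; not part of the workflow repo.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)

    # CPU time (user + system) consumed by the child we just waited for.
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)

    # ru_maxrss is a high-water mark across all waited-for children,
    # so call this from a fresh interpreter for a clean reading.
    return wall, cpu, after.ru_maxrss


if __name__ == "__main__":
    # Stand-in command; replace with the real task command line.
    wall, cpu, peak_kb = measure([sys.executable, "-c", "x = list(range(100000))"])
    print(f"wall={wall:.2f}s cpu={cpu:.2f}s peak_rss={peak_kb} kB")
```

Dividing the CPU time by the wall time gives the average number of busy cores, which is what /usr/bin/time reports as "Percent of CPU this job got".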

This task is short. Make it preemptible (spot) so that we can access lower pricing. Google gives 30 seconds' notice before killing it, so we shouldn't notice any difference.

In the runtime section of the task set:

maxRetries: 3
preemptible: 1

sage-wright commented 4 months ago

/usr/bin/time returned this information for a large sample (its cleaned read files are about 600MB each):

debconf: delaying package configuration, since apt-utils is not installed
    Command being timed: "run_midas.py species samplename -1 R1.fastq.gz -2 R2.fastq.gz -d db/midas_db_v1.2/ -t 4"
    User time (seconds): 2873.56
    System time (seconds): 20.72
    Percent of CPU this job got: 238%
    Elapsed (wall clock) time (h:mm:ss or m:ss): 20:12.82
    Average shared text size (kbytes): 0
    Average unshared data size (kbytes): 0
    Average stack size (kbytes): 0
    Average total size (kbytes): 0
    Maximum resident set size (kbytes): 669376
    Average resident set size (kbytes): 0
    Major (requiring I/O) page faults: 147
    Minor (reclaiming a frame) page faults: 253564
    Voluntary context switches: 791492
    Involuntary context switches: 408572
    Swaps: 0
    File system inputs: 4157936
    File system outputs: 170784
    Socket messages sent: 0
    Socket messages received: 0
    Signals delivered: 0
    Page size (bytes): 4096
    Exit status: 0

CPU utilization, if evenly split across the 4 threads, is about 60% (238% / 4), which is decent. Memory is underutilized (peak resident set size was about 0.64GB), so I'm reducing that to 4GB.
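For anyone wanting to repeat this arithmetic on other tasks, here is a minimal sketch that pulls the relevant fields out of a /usr/bin/time -v report and derives per-thread utilization and peak memory. The function names are made up for illustration, the report text is copied from the output above, and the thread count of 4 comes from the `-t 4` flag in the timed command.

```python
# Four lines copied from the /usr/bin/time -v report above.
REPORT = """\
    User time (seconds): 2873.56
    System time (seconds): 20.72
    Elapsed (wall clock) time (h:mm:ss or m:ss): 20:12.82
    Maximum resident set size (kbytes): 669376
"""


def parse_time_v(report):
    """Map each 'label: value' line of the report to a dict entry."""
    fields = {}
    for line in report.splitlines():
        # Split on the last ': ' so labels containing colons stay intact.
        key, sep, value = line.strip().rpartition(": ")
        if sep:
            fields[key] = value
    return fields


def elapsed_seconds(text):
    """Convert 'h:mm:ss' or 'm:ss.ff' to seconds."""
    seconds = 0.0
    for part in text.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def summarize(report, threads):
    fields = parse_time_v(report)
    cpu = float(fields["User time (seconds)"]) + float(fields["System time (seconds)"])
    wall = elapsed_seconds(fields["Elapsed (wall clock) time (h:mm:ss or m:ss)"])
    busy_cores = cpu / wall
    return {
        "busy_cores": busy_cores,
        "per_thread_utilization": busy_cores / threads,
        "peak_gib": int(fields["Maximum resident set size (kbytes)"]) / 1024 / 1024,
    }


if __name__ == "__main__":
    for name, value in summarize(REPORT, threads=4).items():
        print(f"{name}: {value:.3f}")
```

With the numbers from this run, the average comes out at roughly 2.4 busy cores, or about 60% per thread, and the peak resident set size is well under 1GB.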