opentargets / issues

Issue tracker for Open Targets Platform and Open Targets Genetics Portal
https://platform.opentargets.org https://genetics.opentargets.org
Apache License 2.0

Parallel finemapping: Optimise resource usage #3315

Closed tskir closed 1 week ago

tskir commented 4 months ago

This issue is a part of the https://github.com/opentargets/issues/issues/3302 epic.

The goal of this issue is to configure VM types and task submission parameters so that tasks don't fail due to RAM constraints, while at the same time resources are not wasted.

tskir commented 4 months ago

1. RAM

As a reminder, the current worker VM type is n2d-highmem-4, with 4 cores and 32 GB of RAM. This is how they were used:

There is also a family of “ultramem” VMs which provide a lot of RAM per CPU core. I will also briefly look into them to see whether they could be a good, cost-effective option.
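To make the RAM trade-off concrete, here is a minimal sketch of how many tasks fit on one worker at once. Only the n2d-highmem-4 figures (4 cores, 32 GB) come from this thread; the `VmType` helper, the 12 GB peak task RAM and the 2 GB reserved for the OS are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmType:
    """Hypothetical helper describing a worker VM."""
    name: str
    cores: int
    ram_gb: float


def ram_per_core(vm: VmType) -> float:
    """RAM available per CPU core, in GB."""
    return vm.ram_gb / vm.cores


def max_parallel_tasks(vm: VmType, task_peak_ram_gb: float,
                       reserved_ram_gb: float = 2.0) -> int:
    """Tasks that can run at once, limited by both cores and usable RAM.

    reserved_ram_gb is an assumed allowance for the OS and agents.
    """
    usable_ram_gb = vm.ram_gb - reserved_ram_gb
    fit_by_ram = int(usable_ram_gb // task_peak_ram_gb)
    return max(0, min(vm.cores, fit_by_ram))


# Figures for the current worker type, as stated above.
worker = VmType("n2d-highmem-4", cores=4, ram_gb=32)

print(ram_per_core(worker))  # 8.0
# Assumed peak of 12 GB per task: RAM, not cores, is the limit here.
print(max_parallel_tasks(worker, task_peak_ram_gb=12.0))  # 2
```

If the measured peak RAM per task is well above the per-core figure, cores sit idle; if it is well below, RAM is wasted. The same function can be pointed at any candidate VM type once its real core and RAM figures are filled in.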

tskir commented 4 months ago

2. Execution time

In the v5 run, the run limit for a single job was 3600s. I initially suspected that some jobs failed due to the time limit (it was difficult to tell, because neither RAM nor time failures produce any specific log entries).

In the v6 run, the run limit was raised to 7200s, and no jobs failed due to the time limit. The benchmarking logs show that the longest job in the v6 run took 1911s in total, which is well under even the original 3600s limit, so execution time doesn't appear to be an issue.
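As a quick sanity check of the numbers above, this sketch computes how much headroom each run limit leaves over the longest observed job. The three figures (1911s, 3600s, 7200s) are from this comment; the function name is hypothetical.

```python
def limit_headroom(longest_job_s: float, run_limit_s: float) -> float:
    """Ratio of the run limit to the longest observed job duration."""
    return run_limit_s / longest_job_s


LONGEST_V6_JOB_S = 1911  # longest job in the v6 run

for label, limit_s in [("v5 limit", 3600), ("v6 limit", 7200)]:
    ratio = limit_headroom(LONGEST_V6_JOB_S, limit_s)
    print(f"{label}: {limit_s}s is {ratio:.2f}x the longest job")
# v5 limit: 3600s is 1.88x the longest job
# v6 limit: 7200s is 3.77x the longest job
```

Note that this only compares against v6 durations; it says nothing about how long the failed v5 jobs would have run.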

tskir commented 4 months ago

Note to self: see also Storage Class A operations, currently around 350 per row in each run; this can be optimised.
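For scale, a rough sketch of how the per-row figure adds up over a run. The 350 operations per row is the figure noted above; the row count and the reduced target of 50 operations per row are purely hypothetical placeholders.

```python
def class_a_operations(rows: int, ops_per_row: int) -> int:
    """Total Class A operations for one run."""
    return rows * ops_per_row


ROWS = 10_000          # hypothetical number of rows in a run
CURRENT_PER_ROW = 350  # approximate figure noted above
TARGET_PER_ROW = 50    # hypothetical target after optimisation

current = class_a_operations(ROWS, CURRENT_PER_ROW)
target = class_a_operations(ROWS, TARGET_PER_ROW)
print(current)           # 3500000
print(current - target)  # 3000000 operations saved per run
```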