cancerit / PCAP-core

NGS reference implementations and helper code for mapping (originally part of ICGC-TCGA-PanCancer)
GNU General Public License v2.0

Speed up job resume (Threaded.pm) #40

Closed: keiranmraine closed this 5 years ago

keiranmraine commented 5 years ago

Once the initial scale-up is complete, the start interval can be far shorter:

https://github.com/cancerit/PCAP-core/blob/137acb9379efae21fc3c208320b332f454d47d1f/lib/PCAP/Threaded.pm#L162

It is currently 2 seconds, which even for a pulldown sample can add 30 min of no work.
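A minimal sketch of the idea, not the actual Threaded.pm code: use the longer interval only while the first batch of workers is being started, then switch to a short one. The function name, its parameters and the 4-thread example are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: choose the pause (in microseconds) before starting job $index.
# Jobs 1..$max_threads are the initial scale-up; later jobs only fill freed slots.
sub start_interval_usec {
    my ($index, $max_threads, $scale_up_usec, $steady_usec) = @_;
    return $index <= $max_threads ? $scale_up_usec : $steady_usec;
}

# Example: 4 threads, 2 s during scale-up, 0.1 s afterwards.
for my $index (1, 4, 5, 900) {
    printf "job %d: wait %d usec\n", $index,
        start_interval_usec($index, 4, 2_000_000, 100_000);
}
```

The returned value could be passed to usleep from Time::HiRes; how "scale-up complete" is really detected in Threaded.pm is not shown here.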

keiranmraine commented 5 years ago

Possibly safest to use a slight sleep:

use Time::HiRes qw( usleep );
use Const::Fast qw( const ); # assumed source of 'const'
const my $MSEC_SLEEP_INT => 100_000; # microseconds, i.e. 0.1 sec
usleep($MSEC_SLEEP_INT);

The above would give a slight stagger to reduce I/O flooding and cut the time to resume a genome in the caveman estep from ~6h to ~20m.
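As a rough check of those figures, assuming the idle time is simply the number of job starts multiplied by the start interval (the job count below is back-calculated from ~6h at 2 sec, not taken from a real run):

```perl
use strict;
use warnings;

# Idle time (seconds) spent purely on the stagger between job starts.
sub stagger_overhead_secs {
    my ($n_jobs, $interval_secs) = @_;
    return $n_jobs * $interval_secs;
}

# Hypothetical job count implied by ~6h of waiting at 2 s per start.
my $n_jobs = 6 * 3600 / 2;

printf "%d jobs at 2.0 s: %.1f h\n",   $n_jobs, stagger_overhead_secs($n_jobs, 2) / 3600;
printf "%d jobs at 0.1 s: %.1f min\n", $n_jobs, stagger_overhead_secs($n_jobs, 0.1) / 60;
```

Under that assumption, 10,800 starts at 0.1 sec come to about 18 min, in line with the ~20m estimate. By the same arithmetic, 900 starts at 2 sec account for the 30 min mentioned for pulldown.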

This was identified when running a GRCh38 pulldown sample in a container. The analysis ran until flagging and then failed due to memory.

keiranmraine commented 5 years ago

If people are using external_process_handler within the function passed in, a 0.1-0.2 sec sleep already exists. Dropping this to 0.1, as used in the loop anyway.