radical-cybertools / radical.analytics

Analytics for RADICAL-Cybertools

Incorrect stat numbers in resource utilization .stats file? #116

Closed lee212 closed 4 years ago

lee212 commented 4 years ago

I looked at the numbers in the .stats file, which reports the elapsed time across states/events, but the numbers from my last run do not add up:

16nodes_12_10_10_10_1tasks_1_1_10_10_1gens [1177]
    Agent Nodes         :          0.000     0.000%   !  ['agent']
    Pilot Startup       :      86677.675     3.985%      ['boot', 'setup_1']
    Warmup              :      13376.763     0.615%      ['warm']
    Prepare Execution   :       5468.594     0.251%      ['exec_queue', 'exec_prep']
    Pilot Termination   :    2088267.357    96.015%      ['term']
    Execution RP        :        558.163     0.026%      ['exec_rp', 'exec_sh', 'term_sh', 'term_rp']
    Execution Cmd       :    1587795.900    73.004%      ['exec_cmd']
    Unschedule          :        171.147     0.008%      ['unschedule']
    Draining            :     339710.029    15.619%      ['drain']
    Idle                :     118724.739     5.459%      ['idle']
    total               :    2174934.298   100.000%

    total               :    2174934.298   100.000%
    over                :    2652954.467   121.979%
    work                :    1587795.900    73.004%
    miss                :   -2065816.068   -94.983%
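
To make the discrepancy concrete, here is a minimal Python sketch that re-adds the rows of the report above. The values are copied from the output. The relations `over = sum of all rows except Execution Cmd` and `miss = total - (over + work)` are assumptions inferred from these numbers, not taken from the radical.analytics source, and the variable names are illustrative only.

```python
# Values copied verbatim from the .stats report above.
metrics = {
    "Agent Nodes":       0.000,
    "Pilot Startup":     86677.675,
    "Warmup":            13376.763,
    "Prepare Execution": 5468.594,
    "Pilot Termination": 2088267.357,
    "Execution RP":      558.163,
    "Execution Cmd":     1587795.900,
    "Unschedule":        171.147,
    "Draining":          339710.029,
    "Idle":              118724.739,
}
reported_total = 2174934.298

# Assumption: "work" is the Execution Cmd row, "over" is everything else.
work = metrics["Execution Cmd"]
over = sum(v for k, v in metrics.items() if k != "Execution Cmd")

# Assumption: "miss" is whatever part of the total the rows do not cover.
row_sum = over + work
miss = reported_total - row_sum

for name, value in [("total", reported_total), ("rows", row_sum),
                    ("over", over), ("work", work), ("miss", miss)]:
    print(f"{name:<6}: {value:15.3f} {100 * value / reported_total:9.3f}%")
```

Under these assumptions the rows add up to roughly 195% of the reported total, which reproduces the `over` and `miss` lines of the report to within rounding. Pilot Startup and Pilot Termination together already account for about 100% of the total on their own.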

andre-merzky commented 4 years ago

Hi Hyungro - can you please give hotfix/pilot_util a try? Thanks.

EDIT: this needs an update to the metric definitions in RP - please use the RP branch hotfix/frontera (in the analysis sandbox, no need to rerun experiments).

mturilli commented 4 years ago

This was confirmed by some experiments I am running. That hotfix worked for me, but we also had to patch RP to deal with session traces collected with an older version of RP. You may have to do the same?