radical-cybertools / radical.analytics

Analytics for RADICAL-Cybertools

More information about resource utilization is needed? #111

Open lee212 opened 4 years ago

lee212 commented 4 years ago

Resource utilization is shown in detail, e.g. per resource slot (index), in the matplotlib figure (PNG file), but the stats file does not seem to contain enough information. The values in the stats file are elapsed seconds for particular metrics, e.g. Execution Cmd and Draining, which are important for the TTX calculation. What I am interested in, however, is how many resources, e.g. CPU cores, are busy versus idle at a given time. Right now, I manually divide the core-seconds of Execution Cmd from the stats file by the allocated number of cores to produce a percentage. I believe adding provided and consumed to the stats file would be sufficient to give more information on resource utilization.
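For reference, the manual calculation described above could be sketched roughly as follows. This is only an illustration: the function and parameter names are hypothetical and not part of radical.analytics, and it assumes the stats value is given in core-seconds and that the duration of the allocation is known. Dividing core-seconds by the core count alone yields seconds, so the duration is needed to arrive at a percentage.

```python
def utilization_percent(busy_core_seconds, n_cores, duration_seconds):
    """Return busy core-seconds as a percentage of the allocated capacity.

    Hypothetical helper, not a radical.analytics function.
    Capacity is assumed to be n_cores * duration_seconds.
    """
    capacity = n_cores * duration_seconds
    if capacity <= 0:
        raise ValueError("n_cores and duration_seconds must be positive")
    return 100.0 * busy_core_seconds / capacity


# Made-up numbers: 4 cores allocated for 1800 s, 3600 core-seconds executing.
percent_busy = utilization_percent(3600.0, 4, 1800.0)
```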

andre-merzky commented 4 years ago

Hi @lee212 ,

can you have a look at the dictionaries returned in https://github.com/radical-cybertools/radical.analytics/blob/devel/bin/rp_inspect/plot_util.py#L115 (from ra.Session.utilization) - does that help?

Those dicts contain rather fine-grained information about what unit or pilot utilized what core for what reason. This is an internal data structure, so it's not well documented - but you may want to dump them with pprint and have a look. If that is what you are looking for, I can add some documentation (the method needs that anyway...).

lee212 commented 4 years ago

Yes, I was looking at the line https://github.com/radical-cybertools/radical.analytics/blob/14b958106557030e9572a7373f4342e4f6741581/src/radical/analytics/session.py#L1050 and I thought the two return values, provided and consumed, from get_provided_resources and get_consumed_resources, might be good to add to the stats file.

andre-merzky commented 4 years ago

The information in them is usually too voluminous to be printed in detail - those basically contain (IIRC) tuples like [core_start, core_end, time_start, time_end] for each and every activity in the pilot, so way too much to present in detail. But from those numbers, you should be able to write a script that does the additions for the data you are interested in?
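A minimal sketch of such a script, assuming the tuple layout recalled above, [core_start, core_end, time_start, time_end], actually holds. The real structures returned by radical.analytics may be organized differently (for example, grouped per metric or per entity), so the helper names, the inclusive core range, and the sample data below are all assumptions for illustration only.

```python
def core_seconds(blocks, inclusive_end=True):
    """Sum core-seconds over [core_start, core_end, time_start, time_end] blocks.

    Assumption: core_end is inclusive unless inclusive_end is False.
    """
    total = 0.0
    for core_start, core_end, time_start, time_end in blocks:
        n_cores = core_end - core_start + (1 if inclusive_end else 0)
        total += n_cores * (time_end - time_start)
    return total


def busy_cores_at(blocks, t, inclusive_end=True):
    """Count cores covered by blocks active at time t.

    Assumption: blocks never overlap on the same core at the same time.
    """
    busy = 0
    for core_start, core_end, time_start, time_end in blocks:
        if time_start <= t < time_end:
            busy += core_end - core_start + (1 if inclusive_end else 0)
    return busy


# Made-up sample data, not output from a real session.
provided = [[0, 3, 0.0, 100.0]]                       # 4 cores for 100 s
consumed = [[0, 1, 10.0, 60.0], [2, 3, 20.0, 90.0]]   # two 2-core activities

utilization = 100.0 * core_seconds(consumed) / core_seconds(provided)
busy_now = busy_cores_at(consumed, 30.0)
idle_now = busy_cores_at(provided, 30.0) - busy_now
```

Here `utilization` gives the overall percentage of provided core-seconds that were consumed, while `busy_now` and `idle_now` answer the busy-versus-idle question for a single point in time.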

mturilli commented 4 years ago

@lee212 should we close this ticket?