Closed mvesin closed 3 years ago
This is totally correct. Apologies for the error. We've added a check in #42 which prevents this from happening silently again and fixed the USS + PSS logic to only rely on PSS if the system supports PSS.
Thanks so much for raising this issue.
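For anyone reading along, the PSS fallback mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code merged in #42. It assumes a memory-info object shaped like the result of psutil's `Process.memory_full_info()`, which reports `uss` on Linux, macOS, and Windows but `pss` only on Linux. The function name `attributable_memory_bytes` is hypothetical.

```python
from types import SimpleNamespace


def attributable_memory_bytes(mem_info):
    """Return PSS when the platform reports it, otherwise fall back to USS.

    `mem_info` is assumed to look like psutil's memory_full_info() result:
    it always has `uss`, and has `pss` only where the OS supports it.
    """
    pss = getattr(mem_info, "pss", None)
    if pss is not None:
        return pss
    return mem_info.uss


# Stand-ins for real psutil results (values are made up for illustration).
linux_like = SimpleNamespace(uss=100, pss=140)  # PSS available
other_os = SimpleNamespace(uss=100)             # no PSS field at all

print(attributable_memory_bytes(linux_like))
print(attributable_memory_bytes(other_os))
```

The point of the sketch is that only one of the two measures is used per process, so USS and PSS are never counted together.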
Hi, I'm getting a similar error where total_intel_power < total_attributable_power.
Trace as requested:
Traceback (most recent call last):
  File "/usr/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/chrisp44/.local/lib/python3.8/site-packages/experiment_impact_tracker-0.1.9-py3.8.egg/experiment_impact_tracker/utils.py", line 68, in process_func
    raise e
  File "/home/chrisp44/.local/lib/python3.8/site-packages/experiment_impact_tracker-0.1.9-py3.8.egg/experiment_impact_tracker/utils.py", line 62, in process_func
    ret = func(q, args, kwargs)
  File "/home/chrisp44/.local/lib/python3.8/site-packages/experiment_impact_tracker-0.1.9-py3.8.egg/experiment_impact_tracker/compute_tracker.py", line 161, in launch_power_monitor
    _sample_and_log_power(log_dir, initial_info, logger=logger)
  File "/home/chrisp44/.local/lib/python3.8/site-packages/experiment_impact_tracker-0.1.9-py3.8.egg/experiment_impact_tracker/compute_tracker.py", line 108, in _sample_and_log_power
    results = header["routing"]["function"](
  File "/home/chrisp44/.local/lib/python3.8/site-packages/experiment_impact_tracker-0.1.9-py3.8.egg/experiment_impact_tracker/cpu/intel.py", line 88, in get_intel_power
    return get_rapl_power(pid_list, logger, kwargs)
  File "/home/chrisp44/.local/lib/python3.8/site-packages/experiment_impact_tracker-0.1.9-py3.8.egg/experiment_impact_tracker/cpu/intel.py", line 564, in get_rapl_power
    raise ValueError(
ValueError: For some reason the total intel estimated power is less than the attributable power. This means there is an error in computing the attribution. Please re-open https://github.com/Breakend/experiment-impact-tracker/issues/38 and add the trace for this warning.
This seems to happen because cpu_percent ends up slightly greater than 1 (in my case, usually around 1.005).
My quick local fix is to cap power_credit_cpu at 1.0, although that doesn't address the potential over-estimation problem. That said, this only seems to happen when there's not much else running on the machine.
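A rough sketch of that kind of cap, for illustration only: `clamp_credit` is a hypothetical helper rather than part of experiment-impact-tracker, and the 50 W figure is made up.

```python
def clamp_credit(credit, upper=1.0):
    """Clamp a fractional credit into the range [0.0, upper]."""
    return max(0.0, min(credit, upper))


# cpu_percent can drift slightly above 1 due to sampling noise.
cpu_percent = 1.005
power_credit_cpu = clamp_credit(cpu_percent)

total_intel_power = 50.0  # watts, illustrative value only
attributable_power = total_intel_power * power_credit_cpu

# With the cap in place the attributable power can no longer exceed the total.
assert attributable_power <= total_intel_power
```

As noted above, this only hides the symptom: a credit that was over-estimated below 1.0 passes through the clamp unchanged.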
Hi all, thanks for this great tool.
I observed strange results using experiment-impact-tracker, with rapl_power_draw_absolute < rapl_estimated_attributable_power_draw.
I suspect a problem in the attributable memory counting method:
Can someone confirm or correct this statement?
Shouldn't relative_mem_usage be <= 1? Tracing further:
I patched this locally and the results now look much closer to what I'd expect (attributable power draw slightly under absolute power draw).
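The invariant in question can be sketched as below. This is not the local patch itself, only a minimal illustration under stated assumptions: the function mirrors the relative_mem_usage name from this thread, but its signature and the byte counts are hypothetical.

```python
def relative_mem_usage(process_mem_bytes, system_used_bytes):
    """Fraction of the system's used memory attributed to the tracked processes.

    `process_mem_bytes` is an iterable of per-process memory figures (for
    example PSS or USS); `system_used_bytes` is the memory in use system-wide.
    """
    if system_used_bytes <= 0:
        raise ValueError("system_used_bytes must be positive")
    fraction = sum(process_mem_bytes) / system_used_bytes
    if fraction > 1.0:
        # A share above 1 means some memory was counted more than once
        # (for example shared pages added in full for every process).
        raise ValueError(
            f"relative memory usage {fraction:.3f} exceeds 1; attribution is over-counting"
        )
    return fraction


# Two tracked processes using 100 and 200 bytes out of 600 bytes in use.
print(relative_mem_usage([100, 200], 600))
```

Failing loudly when the share exceeds 1 surfaces double counting at the memory step, before it turns into an attributable power draw larger than the absolute one.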
Tested on: