Breakend / experiment-impact-tracker

MIT License

Monitor thread errors out with IndexError #58

Closed. nikhil153 closed this issue 3 years ago

nikhil153 commented 3 years ago

I am testing a few different software packages, and for some of them the monitor thread errors out during Intel RAPL calls. I have tested this on two different CPUs: (1) Intel(R) Core(TM) i7-8650U CPU @ 1.90GHz and (2) Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz.

experiment_impact_tracker.compute_tracker.ImpactTracker - ERROR - Encountered exception within power monitor thread!
ERROR:Encountered exception within power monitor thread!
experiment_impact_tracker.compute_tracker.ImpactTracker - ERROR -   File "../../experiment-impact-tracker/experiment_impact_tracker/compute_tracker.py", line 161, in launch_power_monitor
    _sample_and_log_power(log_dir, initial_info, logger=logger)
  File "../../experiment-impact-tracker/experiment_impact_tracker/compute_tracker.py", line 112, in _sample_and_log_power
    log_dir=log_dir,
  File "../../experiment-impact-tracker/experiment_impact_tracker/cpu/intel.py", line 88, in get_intel_power
    return get_rapl_power(pid_list, logger, **kwargs)
  File "../../experiment-impact-tracker/experiment_impact_tracker/cpu/intel.py", line 435, in get_rapl_power
    st2, st22, system_wide_pt2, pt2 = infos2[i]

ERROR:  File "../../experiment-impact-tracker/experiment_impact_tracker/compute_tracker.py", line 161, in launch_power_monitor
    _sample_and_log_power(log_dir, initial_info, logger=logger)
  File "../../experiment-impact-tracker/experiment_impact_tracker/compute_tracker.py", line 112, in _sample_and_log_power
    log_dir=log_dir,
  File "../../experiment-impact-tracker/experiment_impact_tracker/cpu/intel.py", line 88, in get_intel_power
    return get_rapl_power(pid_list, logger, **kwargs)
  File "../../experiment-impact-tracker/experiment_impact_tracker/cpu/intel.py", line 435, in get_rapl_power
    st2, st22, system_wide_pt2, pt2 = infos2[i]

Process Process-1:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "../../experiment-impact-tracker/experiment_impact_tracker/utils.py", line 68, in process_func
    raise e
  File "../../experiment-impact-tracker/experiment_impact_tracker/utils.py", line 62, in process_func
    ret = func(q, *args, **kwargs)
  File "../../experiment-impact-tracker/experiment_impact_tracker/compute_tracker.py", line 161, in launch_power_monitor
    _sample_and_log_power(log_dir, initial_info, logger=logger)
  File "../../experiment-impact-tracker/experiment_impact_tracker/compute_tracker.py", line 112, in _sample_and_log_power
    log_dir=log_dir,
  File "../../experiment-impact-tracker/experiment_impact_tracker/cpu/intel.py", line 88, in get_intel_power
    return get_rapl_power(pid_list, logger, **kwargs)
  File "../../experiment-impact-tracker/experiment_impact_tracker/cpu/intel.py", line 435, in get_rapl_power
    st2, st22, system_wide_pt2, pt2 = infos2[i]
IndexError: list index out of range

Any suggestions? Thanks!

Breakend commented 3 years ago

Apologies for the bug and thanks for fixing it. We've merged your fix in, but please re-open if it's still an issue.
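
For readers who hit the same traceback: the thread does not say what the merged fix changed, so the following is only a sketch of one plausible failure mode, not the project's code. It assumes the IndexError comes from indexing a second per-process sample by position after a process has exited between two samplings, which leaves the second list shorter than the first. The function name `cpu_time_deltas` and the `(pid, cpu_time)` tuple shape are hypothetical and are not taken from `intel.py`.

```python
from typing import Dict, List, Tuple

# Hypothetical sample shape: (pid, cumulative CPU time in seconds).
Sample = Tuple[int, float]


def cpu_time_deltas(first: List[Sample], second: List[Sample]) -> Dict[int, float]:
    """Pair two samplings by PID instead of by list position.

    PIDs missing from the second sampling (for example, a process that
    exited in between) are skipped rather than raising IndexError.
    """
    later = dict(second)
    deltas: Dict[int, float] = {}
    for pid, t1 in first:
        t2 = later.get(pid)
        if t2 is None:
            continue  # process disappeared between the two samplings
        deltas[pid] = t2 - t1
    return deltas


first = [(101, 1.0), (102, 2.0), (103, 0.5)]
second = [(101, 1.5), (103, 0.75)]  # PID 102 exited before the second sampling

# Positional access such as second[i] for i in range(len(first)) would
# fail at i == 2 with "IndexError: list index out of range".
print(cpu_time_deltas(first, second))
```

Keying on the PID keeps the two samplings aligned even when their lengths differ. Whether this matches the actual cause in `get_rapl_power` would need to be checked against the repository history.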