Breakend / experiment-impact-tracker

MIT License

ParserError parsing nvidia-smi output #36

Open cifkao opened 3 years ago


I get the following error when starting the tracker:

Traceback (most recent call last):                                                                                                                                                                          
  File "/usr/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap                                                                                                                             
    self.run()                                                                                                                                                                                              
  File "/usr/lib/python3.7/multiprocessing/process.py", line 99, in run                                                                                                                                     
    self._target(*self._args, **self._kwargs)                                                                                                                                                               
  File "/tsi/doctorants/ocifka/projects/phd/experiments/lakhnes_cover/venv/lib/python3.7/site-packages/experiment_impact_tracker/utils.py", line 68, in process_func                                        
    raise e                                                                                                                                                                                                 
  File "/tsi/doctorants/ocifka/projects/phd/experiments/lakhnes_cover/venv/lib/python3.7/site-packages/experiment_impact_tracker/utils.py", line 62, in process_func                                        
    ret = func(q, *args, **kwargs)                                                                                                                                                                          
  File "/tsi/doctorants/ocifka/projects/phd/experiments/lakhnes_cover/venv/lib/python3.7/site-packages/experiment_impact_tracker/compute_tracker.py", line 105, in launch_power_monitor                     
    _sample_and_log_power(log_dir, initial_info, logger=logger)                                                                                                                                             
  File "/tsi/doctorants/ocifka/projects/phd/experiments/lakhnes_cover/venv/lib/python3.7/site-packages/experiment_impact_tracker/compute_tracker.py", line 69, in _sample_and_log_power
    results = header["routing"]["function"](process_ids, logger=logger, region=initial_info['region']['id'], log_dir=log_dir)
  File "/tsi/doctorants/ocifka/projects/phd/experiments/lakhnes_cover/venv/lib/python3.7/site-packages/experiment_impact_tracker/gpu/nvidia.py", line 123, in get_nvidia_gpu_power
    df = pd.read_csv(StringIO(out_str_final), engine='python', delim_whitespace=True)
  File "/tsi/doctorants/ocifka/projects/phd/experiments/lakhnes_cover/venv/lib/python3.7/site-packages/pandas/io/parsers.py", line 688, in read_csv
    return _read(filepath_or_buffer, kwds)
  File "/tsi/doctorants/ocifka/projects/phd/experiments/lakhnes_cover/venv/lib/python3.7/site-packages/pandas/io/parsers.py", line 460, in _read
    data = parser.read(nrows)
  File "/tsi/doctorants/ocifka/projects/phd/experiments/lakhnes_cover/venv/lib/python3.7/site-packages/pandas/io/parsers.py", line 1198, in read
    ret = self._engine.read(nrows)
  File "/tsi/doctorants/ocifka/projects/phd/experiments/lakhnes_cover/venv/lib/python3.7/site-packages/pandas/io/parsers.py", line 2585, in read
    alldata = self._rows_to_cols(content)
  File "/tsi/doctorants/ocifka/projects/phd/experiments/lakhnes_cover/venv/lib/python3.7/site-packages/pandas/io/parsers.py", line 3237, in _rows_to_cols
    self._alert_malformed(msg, row_num + 1)
  File "/tsi/doctorants/ocifka/projects/phd/experiments/lakhnes_cover/venv/lib/python3.7/site-packages/pandas/io/parsers.py", line 2998, in _alert_malformed
    raise ParserError(msg)
pandas.errors.ParserError: Expected 8 fields in line 4, saw 9. Error could possibly be due to quotes being ignored when a multi-char delimiter is used.

This seems to be a problem with the output of the command nvidia-smi pmon -c 5, which gives the following output on my machine:

# gpu        pid  type    sm   mem   enc   dec   command
# Idx          #   C/G     %     %     %     %   name
    0       9122     G     0     3     0     0   Xorg           
    0      11344     G     0     0     0     0   chromium --type
    0      22948     C     0     0     0     0   python3        
    0       9122     G     0     3     0     0   Xorg           
    0      11344     G     0     0     0     0   chromium --type
    0      22948     C     0     0     0     0   python3        
    0       9122     G     0     3     0     0   Xorg           
    0      11344     G     0     0     0     0   chromium --type
    0      22948     C     0     0     0     0   python3        
    0       9122     G     0     3     0     0   Xorg           
    0      11344     G     0     0     0     0   chromium --type
    0      22948     C     0     0     0     0   python3        
    0       9122     G     0     3     0     0   Xorg           
    0      11344     G     0     0     0     0   chromium --type
    0      22948     C     0     0     0     0   python3

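I'm guessing the problem is that the command chromium --type contains a space. Since the output is parsed with delim_whitespace=True, that row splits into 9 fields instead of the expected 8, which matches the error message.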
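
A possible workaround, as a minimal sketch only: split each data row on whitespace at most 7 times, so that everything after the seventh column stays in the command field. This assumes the command is always the last column and that the first line starting with # holds the column names. parse_pmon_output is a hypothetical helper for illustration, not part of experiment-impact-tracker.

```python
def parse_pmon_output(text):
    """Parse `nvidia-smi pmon` text into a list of dicts (hypothetical helper).

    Assumes the first '#' line names the columns and the command is the
    last column, so it may contain spaces.
    """
    columns = None
    records = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            # The first comment line carries the column names;
            # later comment lines (units, repeated headers) are skipped.
            if columns is None:
                columns = stripped.lstrip("#").split()
            continue
        if columns is None:
            raise ValueError("no header line found before the first data row")
        # Limit the number of splits so the final field keeps embedded spaces.
        fields = stripped.split(None, len(columns) - 1)
        records.append(dict(zip(columns, fields)))
    return records


# Sample taken from the output shown above (first three data rows).
SAMPLE = """\
# gpu        pid  type    sm   mem   enc   dec   command
# Idx          #   C/G     %     %     %     %   name
    0       9122     G     0     3     0     0   Xorg
    0      11344     G     0     0     0     0   chromium --type
    0      22948     C     0     0     0     0   python3
"""

records = parse_pmon_output(SAMPLE)
print(records[1]["command"])
```

All values come back as strings, so numeric columns would still need converting. The resulting list of dicts can be passed to pd.DataFrame(records) if a DataFrame is needed downstream.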