amanusk / s-tui

Terminal-based CPU stress and monitoring utility
https://amanusk.github.io/s-tui/
GNU General Public License v2.0
4.06k stars 140 forks

stress process keeps running after a crash while in stress mode #93

Closed christopher-dG closed 5 years ago

christopher-dG commented 5 years ago

Step 1: Describe your environment

Step 2: Describe the problem:

When the program crashes in stress mode, stress keeps running in the background, keeping the CPU pinned.
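One way a parent process can avoid leaving a child behind after a crash is to stop the child in a `finally` block. The following is a minimal sketch of that idea, not s-tui's actual code: `run_with_cleanup` and `STAND_IN_CMD` are hypothetical names, and a sleeping Python child stands in for the real stress command.

```python
import subprocess
import sys


def run_with_cleanup(cmd, body):
    """Start cmd as a child process, run body(child), and always stop the child."""
    child = subprocess.Popen(cmd)
    try:
        return body(child)
    finally:
        # Runs on a normal return and on any exception raised by body.
        if child.poll() is None:
            child.terminate()  # polite stop request first
            try:
                child.wait(timeout=5)
            except subprocess.TimeoutExpired:
                child.kill()  # force-stop if the child ignores the request
                child.wait()


# Hypothetical stand-in for the stress workers: a child that just sleeps.
STAND_IN_CMD = [sys.executable, "-c", "import time; time.sleep(60)"]
```

If `body` raises (as the UI loop does in the traceback below), the exception still propagates, but only after the child has been stopped. A command that forks its own workers may additionally need its whole process group signalled; that is outside this sketch.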

Observed Results:

I'm using #92 to reproduce this.

Traceback (most recent call last):
  File "/usr/bin/s-tui", line 11, in <module>
    load_entry_point('s-tui==0.8.2', 'console_scripts', 's-tui')()
  File "/usr/lib/python3.7/site-packages/s_tui/s_tui.py", line 928, in main
    graph_controller.main()
  File "/usr/lib/python3.7/site-packages/s_tui/s_tui.py", line 760, in main
    self.loop.run()
  File "/usr/lib/python3.7/site-packages/urwid/main_loop.py", line 286, in run
    self._run()
  File "/usr/lib/python3.7/site-packages/urwid/main_loop.py", line 384, in _run
    self.event_loop.run()
  File "/usr/lib/python3.7/site-packages/urwid/main_loop.py", line 788, in run
    self._loop()
  File "/usr/lib/python3.7/site-packages/urwid/main_loop.py", line 825, in _loop
    self._watch_files[fd]()
  File "/usr/lib/python3.7/site-packages/urwid/raw_display.py", line 404, in <lambda>
    event_loop, callback, self.get_available_raw_input())
  File "/usr/lib/python3.7/site-packages/urwid/raw_display.py", line 502, in parse_input
    callback(processed, processed_codes)
  File "/usr/lib/python3.7/site-packages/urwid/main_loop.py", line 411, in _update
    self.process_input(keys)
  File "/usr/lib/python3.7/site-packages/urwid/main_loop.py", line 511, in process_input
    k = self._topmost_widget.keypress(self.screen_size, k)
  File "/usr/lib/python3.7/site-packages/urwid/container.py", line 595, in keypress
    *self.calculate_padding_filler(size, True)), key)
  File "/usr/lib/python3.7/site-packages/urwid/container.py", line 1590, in keypress
    key = self.focus.keypress(tsize, key)
  File "/usr/lib/python3.7/site-packages/urwid/container.py", line 2271, in keypress
    key = w.keypress((mc,) + size[1:], key)
  File "/usr/lib/python3.7/site-packages/s_tui/UiElements.py", line 37, in keypress
    return super(ViListBox, self).keypress(size, key)
  File "/usr/lib/python3.7/site-packages/urwid/listbox.py", line 999, in keypress
    key = focus_widget.keypress((maxcol,),key)
  File "/usr/lib/python3.7/site-packages/urwid/container.py", line 2271, in keypress
    key = w.keypress((mc,) + size[1:], key)
  File "/usr/lib/python3.7/site-packages/urwid/wimp.py", line 540, in keypress
    self._emit('click')
  File "/usr/lib/python3.7/site-packages/urwid/widget.py", line 460, in _emit
    signals.emit_signal(self, name, self, *args)
  File "/usr/lib/python3.7/site-packages/urwid/signals.py", line 265, in emit
    result |= self._call_callback(callback, user_arg, user_args, args)
  File "/usr/lib/python3.7/site-packages/urwid/signals.py", line 295, in _call_callback
    return bool(callback(*args_to_pass))
  File "/usr/lib/python3.7/site-packages/s_tui/TempSensorsMenu.py", line 114, in on_apply
    self.return_fn()
  File "/usr/lib/python3.7/site-packages/s_tui/s_tui.py", line 289, in on_sensors_menu_close
    self.__init__(self.controller)
  File "/usr/lib/python3.7/site-packages/s_tui/s_tui.py", line 233, in __init__
    urwid.WidgetPlaceholder.__init__(self, self.main_window())
  File "/usr/lib/python3.7/site-packages/s_tui/s_tui.py", line 552, in main_window
    self.controller.temp_thresh)
  File "/usr/lib/python3.7/site-packages/s_tui/Sources/TemperatureSource.py", line 54, in __init__
    self.update()
  File "/usr/lib/python3.7/site-packages/s_tui/Sources/TemperatureSource.py", line 107, in update
    update_func(sensor_major, int(sensor_minor))
  File "/usr/lib/python3.7/site-packages/s_tui/Sources/TemperatureSource.py", line 93, in update_func
    Source.update(self)
  File "/usr/lib/python3.7/site-packages/s_tui/Sources/Source.py", line 28, in update
    self.eval_hooks()
  File "/usr/lib/python3.7/site-packages/s_tui/Sources/Source.py", line 68, in eval_hooks
    if self.get_edge_triggered():
  File "/usr/lib/python3.7/site-packages/s_tui/Sources/TemperatureSource.py", line 190, in get_edge_triggered
    return self.last_temp > self.temp_thresh
TypeError: '>' not supported between instances of 'float' and 'NoneType'
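The final frame compares a float reading against a threshold that is `None`. A comparison like this can be guarded as sketched below. This is an illustration only, not necessarily the fix applied in the project; `edge_triggered` is a hypothetical standalone function mirroring the `get_edge_triggered` line in the traceback.

```python
def edge_triggered(last_temp, temp_thresh):
    """Return True only when a threshold is set and the reading exceeds it."""
    if temp_thresh is None:
        # No threshold configured: nothing to trigger on, and no TypeError.
        return False
    return last_temp > temp_thresh
```

With this guard, a reading of 51.0 against a `None` threshold returns `False` instead of raising.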

Debug Results, output of s-tui -d created in a file _s-tui.log:

2018-10-08 12:24:18,212 [main()] [INFO ]  Started without root permissions
2018-10-08 12:24:18,212 [__init__()] [DEBUG]  Config file not found
2018-10-08 12:24:18,212 [__init__()] [DEBUG]  No refresh rate configed
2018-10-08 12:24:18,212 [__init__()] [DEBUG]  No user config for utf8
2018-10-08 12:24:18,212 [__init__()] [DEBUG]  No user config for temp sensor
2018-10-08 12:24:18,212 [__init__()] [DEBUG]  No user config for temp threshold
2018-10-08 12:24:18,214 [__init__()] [DEBUG]  stress-ng is not installed
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Package id 0
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Core 0
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Core 1
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Core 2
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Core 3
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:18,218 [__init__()] [DEBUG]  
2018-10-08 12:24:18,219 [__init__()] [INFO ]  num cpus 8
2018-10-08 12:24:18,362 [update()] [INFO ]  Utilization recorded 7.5
2018-10-08 12:24:18,363 [__init__()] [DEBUG]  arg temp  None
2018-10-08 12:24:18,363 [init_update()] [DEBUG]  custom temp is None
2018-10-08 12:24:18,369 [init_update()] [DEBUG]  Temperature sensor is set to coretemp
2018-10-08 12:24:18,376 [set_threshold()] [DEBUG]  Temperature threshold set to 80.0
2018-10-08 12:24:18,382 [__init__()] [DEBUG]  Update is updated to <function TemperatureSource.init_update.<locals>.update at 0x7fa1213faea0>
2018-10-08 12:24:18,383 [get_power_usage()] [INFO ]  current 125532353624.0 last 125532353624.0
2018-10-08 12:24:18,383 [get_power_usage()] [INFO ]  Joule_Used 0.0 seconds_passed 0.0004298686981201172
2018-10-08 12:24:18,386 [update()] [DEBUG]  Fan Speend Not Available
2018-10-08 12:24:18,386 [update()] [INFO ]  Fan speed recorded0.0
2018-10-08 12:24:18,419 [on_unicode_checkbox()] [DEBUG]  unicode State is False
2018-10-08 12:24:18,427 [update()] [INFO ]  Utilization recorded 18.9
2018-10-08 12:24:18,432 [get_power_usage()] [INFO ]  current 125532793503.0 last 125532353624.0
2018-10-08 12:24:18,432 [get_power_usage()] [INFO ]  Joule_Used 0.439879 seconds_passed 0.049030303955078125
2018-10-08 12:24:18,432 [get_power_usage()] [INFO ]  Power reading elapsed
2018-10-08 12:24:18,433 [get_power_usage()] [INFO ]  Max power updated 9
2018-10-08 12:24:18,433 [update_displayed_graph_data()] [INFO ]  Reading 800
2018-10-08 12:24:18,433 [update_displayed_graph_data()] [INFO ]  Reading 18.9
2018-10-08 12:24:18,433 [update_displayed_graph_data()] [INFO ]  Reading 36.0
2018-10-08 12:24:18,434 [update_displayed_graph_data()] [INFO ]  Reading 8.97157399642107
2018-10-08 12:24:18,438 [update()] [INFO ]  Utilization recorded 25.0
2018-10-08 12:24:18,441 [get_power_usage()] [INFO ]  current 125532902939.0 last 125532793503.0
2018-10-08 12:24:18,441 [get_power_usage()] [INFO ]  Joule_Used 0.109436 seconds_passed 0.009230375289916992
2018-10-08 12:24:18,442 [get_power_usage()] [INFO ]  Power reading elapsed
2018-10-08 12:24:18,442 [get_power_usage()] [INFO ]  Max power updated 12
2018-10-08 12:24:18,442 [update_displayed_graph_data()] [INFO ]  Reading 1266
2018-10-08 12:24:18,442 [update_displayed_graph_data()] [INFO ]  Reading 25.0
2018-10-08 12:24:18,442 [update_displayed_graph_data()] [INFO ]  Reading 36.0
2018-10-08 12:24:18,443 [update_displayed_graph_data()] [INFO ]  Reading 11.85607264739765
2018-10-08 12:24:19,361 [update()] [INFO ]  Utilization recorded 9.7
2018-10-08 12:24:19,367 [get_power_usage()] [INFO ]  current 125541123865.0 last 125532902939.0
2018-10-08 12:24:19,367 [get_power_usage()] [INFO ]  Joule_Used 8.220926 seconds_passed 0.9256186485290527
2018-10-08 12:24:19,367 [get_power_usage()] [INFO ]  Power reading elapsed
2018-10-08 12:24:19,368 [update_displayed_graph_data()] [INFO ]  Reading 957
2018-10-08 12:24:19,369 [update_displayed_graph_data()] [INFO ]  Reading 9.7
2018-10-08 12:24:19,370 [update_displayed_graph_data()] [INFO ]  Reading 26.0
2018-10-08 12:24:19,371 [update_displayed_graph_data()] [INFO ]  Reading 8.88154750670191
2018-10-08 12:24:19,372 [kill_child_processes()] [DEBUG]  Killing stress process
2018-10-08 12:24:19,373 [kill_child_processes()] [DEBUG]  No such process
2018-10-08 12:24:19,373 [kill_child_processes()] [DEBUG]  Could not kill process
2018-10-08 12:24:20,585 [update()] [INFO ]  Utilization recorded 98.1
2018-10-08 12:24:20,587 [get_power_usage()] [INFO ]  current 125613565574.0 last 125541123865.0
2018-10-08 12:24:20,593 [get_power_usage()] [INFO ]  Joule_Used 72.441709 seconds_passed 1.2202279567718506
2018-10-08 12:24:20,593 [get_power_usage()] [INFO ]  Power reading elapsed
2018-10-08 12:24:20,593 [get_power_usage()] [INFO ]  Max power updated 60
2018-10-08 12:24:20,593 [update_displayed_graph_data()] [INFO ]  Reading 3999
2018-10-08 12:24:20,593 [update_displayed_graph_data()] [INFO ]  Reading 98.1
2018-10-08 12:24:20,593 [update_displayed_graph_data()] [INFO ]  Reading 51.0
2018-10-08 12:24:20,594 [update_displayed_graph_data()] [INFO ]  Reading 59.36735722040552
2018-10-08 12:24:22,349 [on_sensors_menu_close()] [INFO ]  State is not None
2018-10-08 12:24:22,352 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:22,352 [__init__()] [DEBUG]  
2018-10-08 12:24:22,352 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  Package id 0
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  Core 0
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  Core 1
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  Core 2
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  Core 3
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  Sensor Label
2018-10-08 12:24:22,353 [__init__()] [DEBUG]  
2018-10-08 12:24:22,354 [__init__()] [INFO ]  num cpus 8
2018-10-08 12:24:22,493 [update()] [INFO ]  Utilization recorded 100.0
2018-10-08 12:24:22,493 [__init__()] [DEBUG]  arg temp  pch_skylake,0,
2018-10-08 12:24:22,493 [init_update()] [DEBUG]  custom temp is pch_skylake,0,
2018-10-08 12:24:22,493 [init_update()] [DEBUG]  Selected custom temp
2018-10-08 12:24:22,493 [init_update()] [DEBUG]  Major pch_skylake Minor 0
2018-10-08 12:24:22,495 [set_threshold()] [DEBUG]  Temperature threshold set to None

Step 3: Reproduce the problem:

Steps to reproduce:

  1. Start s-tui.
  2. Enter stress mode.
  3. Find a way to make it crash.
  4. Observe that stress is still running.

$ s-tui
# Make it crash
$ pidof stress
27898 27897 27896 27895 27894 27893 27892 27891 27890
$ pkill stress

amanusk commented 5 years ago

Thanks for opening the issue. PR #89 should fix the problem with the threshold value being set to None, and in turn there should be no crash while stress processes are running. Could you please help test this by pulling from master?

amanusk commented 5 years ago

Assuming this is closed. Please reopen if needed.