jue-jue-zi closed this issue 1 year ago
@jue-jue-zi Thanks for the feedback! I'll add a quick fix soon.
@jue-jue-zi I pushed a new commit to handle this. You can reinstall nvitop
from GitHub by:
pip3 install git+https://github.com/XuehaiPan/nvitop.git#egg=nvitop
Thanks for fixing it so quickly, but there still seems to be a problem:
Traceback (most recent call last):
  File "/usr/local/bin/nvitop", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.8/dist-packages/nvitop/cli.py", line 336, in main
    ui = UI(
  File "/usr/local/lib/python3.8/dist-packages/nvitop/gui/ui.py", line 43, in __init__
    self.main_screen = MainScreen(
  File "/usr/local/lib/python3.8/dist-packages/nvitop/gui/screens/main/__init__.py", line 38, in __init__
    self.device_panel = DevicePanel(self.devices, compact, win=win, root=root)
  File "/usr/local/lib/python3.8/dist-packages/nvitop/gui/screens/main/device.py", line 61, in __init__
    self.snapshots = self.take_snapshots()
  File "/usr/local/lib/python3.8/dist-packages/cachetools/func.py", line 62, in wrapper
    v = func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/nvitop/gui/screens/main/device.py", line 129, in take_snapshots
    snapshots = [device.as_snapshot() for device in self.all_devices]
  File "/usr/local/lib/python3.8/dist-packages/nvitop/gui/screens/main/device.py", line 129, in <listcomp>
    snapshots = [device.as_snapshot() for device in self.all_devices]
  File "/usr/local/lib/python3.8/dist-packages/nvitop/gui/library/device.py", line 70, in as_snapshot
    self._snapshot = super().as_snapshot()
  File "/usr/local/lib/python3.8/dist-packages/nvitop/core/device.py", line 1667, in as_snapshot
    **{key: getattr(self, key)() for key in self.SNAPSHOT_KEYS},
  File "/usr/local/lib/python3.8/dist-packages/nvitop/core/device.py", line 1667, in <dictcomp>
    **{key: getattr(self, key)() for key in self.SNAPSHOT_KEYS},
  File "/usr/local/lib/python3.8/dist-packages/nvitop/core/device.py", line 878, in memory_used
    return self.memory_info().used
  File "/usr/local/lib/python3.8/dist-packages/nvitop/core/utils.py", line 702, in wrapped
    ret = self._cache[method]  # pylint: disable=protected-access
TypeError: 'function' object is not subscriptable
Fixed by the newest commit.
It works right now! Thanks, it is a really great project.
Maybe red fonts for errors would be better.
Runtime Environment
nvitop version or commit: 0.10.0
nvidia-ml-py version: 11.515.75

Current Behavior
There are four GPUs on our server. One of them overheated for some reason, and it can no longer be recognized. Running nvidia-smi without any arguments to query all the GPUs prints the error "Unable to determine the device handle for GPU 0000:0C:00.0: Unknown Error" and shows nothing for the remaining healthy GPUs. If the command selects only the healthy GPUs (nvidia-smi -i 0,1,3), their information is shown correctly.
When I use nvitop to show the GPU information, nvidia-ml-py throws exceptions.

Expected Behavior
I hope that nvitop can automatically skip all GPUs with errors and show the information for the healthy GPUs. If possible, the failed GPUs could be listed as a note below the normal output, in red for emphasis.
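
For illustration only, the requested behavior could be sketched roughly as below. This is not nvitop's actual fix. The helpers `probe_devices`, `format_report`, and `main` are hypothetical names. The sketch assumes that the failure surfaces when the device handle is requested (as the nvidia-smi message suggests) and that counting the devices still succeeds. The `pynvml` calls in `main` are the ones shipped with nvidia-ml-py.

```python
from typing import Callable, List, Tuple, Type

RED, RESET = "\033[31m", "\033[0m"  # ANSI escape codes for red text


def probe_devices(
    count: int,
    get_handle: Callable[[int], object],
    error_type: Type[BaseException] = Exception,
) -> Tuple[List[Tuple[int, object]], List[Tuple[int, str]]]:
    """Split device indices into healthy handles and per-device error messages."""
    healthy: List[Tuple[int, object]] = []
    failed: List[Tuple[int, str]] = []
    for index in range(count):
        try:
            healthy.append((index, get_handle(index)))
        except error_type as exc:
            # A single broken GPU should not abort the whole listing.
            failed.append((index, str(exc)))
    return healthy, failed


def format_report(healthy, failed) -> str:
    """List healthy GPUs first, then the failed ones highlighted in red."""
    lines = [f"GPU {index}: OK" for index, _ in healthy]
    lines += [f"{RED}GPU {index}: {message}{RESET}" for index, message in failed]
    return "\n".join(lines)


def main() -> None:
    """Hypothetical entry point; needs nvidia-ml-py and an NVIDIA driver."""
    import pynvml

    pynvml.nvmlInit()
    try:
        healthy, failed = probe_devices(
            pynvml.nvmlDeviceGetCount(),
            pynvml.nvmlDeviceGetHandleByIndex,
            pynvml.NVMLError,
        )
        print(format_report(healthy, failed))
    finally:
        pynvml.nvmlShutdown()
```

Calling `main()` on the four-GPU server described above would, under these assumptions, print three "OK" lines followed by one red line for the GPU whose handle cannot be obtained.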