b3rs3rk / gpustat-unraid

An UnRAID plugin for displaying GPU status
https://forums.unraid.net/topic/89453-plugin-gpu-statistics/?tab=comments#comment-830112
MIT License
50 stars, 13 forks

[BUG] - Briefly describe issue here #5

Closed: fiservedpi closed this issue 3 years ago

fiservedpi commented 4 years ago

Describe the bug
My log has been spammed with the errors below from the moment I installed the plugin (18:07). I never had these before. It's not that big of a deal, but it will get out of hand if the log isn't cleared/truncated. The plugin WORKS as expected!

:07:18 Tower emhttpd: cmd: /usr/local/emhttp/plugins/community.applications/scripts/pluginInstall.sh install https://raw.githubusercontent.com/b3rs3rk/gpustat-unraid/master/gpustat.plg
Jun 18 18:07:18 Tower root: plugin: creating: /boot/config/plugins/gpustat/gpustat-2020.04.18a-x86_64.txz - downloading from URL https://raw.githubusercontent.com/b3rs3rk/gpustat-unraid/master/pkg/gpustat-2020.04.18a-x86_64.txz
Jun 18 18:07:19 Tower root: plugin: running: /boot/config/plugins/gpustat/gpustat-2020.04.18a-x86_64.txz
Jun 18 18:07:19 Tower root: plugin: creating: /boot/config/plugins/gpustat/gpustat.cfg - from INLINE content
Jun 18 18:07:26 Tower kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jun 18 18:07:26 Tower kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs
Jun 18 18:07:30 Tower kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jun 18 18:07:30 Tower kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs
Jun 18 18:07:33 Tower kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jun 18 18:07:33 Tower kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs
Jun 18 18:07:36 Tower kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jun 18 18:07:36 Tower kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs
Jun 18 18:07:39 Tower kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jun 18 18:07:39 Tower kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs
Jun 18 18:07:42 Tower kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jun 18 18:07:42 Tower kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs
Jun 18 18:07:44 Tower kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jun 18 18:07:44 Tower kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs
Jun 18 18:07:47 Tower kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jun 18 18:07:47 Tower kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs
Jun 18 18:07:49 Tower kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jun 18 18:07:49 Tower kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs
Jun 18 18:07:52 Tower kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]
Jun 18 18:07:52 Tower kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia
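To gauge how quickly these warnings accumulate, the syslog lines can be tallied with a short script. The following is only a rough Python sketch: the function name `summarize` and both regular expressions are illustrative assumptions based on the log format quoted in this report, not part of the plugin.

```python
import re
from collections import Counter

# Assumed patterns, derived from the log lines quoted in this report.
SANITY_RE = re.compile(
    r"kernel: resource sanity check: requesting \[mem (0x[0-9a-f]+-0x[0-9a-f]+)\]"
)
TIME_RE = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})")


def summarize(lines):
    """Count sanity-check warnings per requested range and collect their timestamps."""
    counts = Counter()
    stamps = []
    for line in lines:
        hit = SANITY_RE.search(line)
        if not hit:
            continue  # skip the "caller ..." line and unrelated entries
        counts[hit.group(1)] += 1
        when = TIME_RE.match(line)
        if when:
            stamps.append(when.group(1))
    return counts, stamps


# Three lines copied from the report above.
sample = [
    "Jun 18 18:07:26 Tower kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]",
    "Jun 18 18:07:26 Tower kernel: caller _nv000908rm+0x1bf/0x1f0 [nvidia] mapping multiple BARs",
    "Jun 18 18:07:30 Tower kernel: resource sanity check: requesting [mem 0x000c0000-0x000fffff], which spans more than PCI Bus 0000:00 [mem 0x000c0000-0x000dffff window]",
]

counts, stamps = summarize(sample)
print(counts, stamps)
```

On a live system the same function could be fed the lines of /var/log/syslog; the timestamps then show the warning recurring every few seconds.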

To Reproduce
Steps to reproduce the behavior:

  1. Go to CA
  2. Install GPU Stats
  3. See error

Expected behavior
The plugin works without these errors.

Screenshots
N/A

Client (please complete the following information):

Server (please complete the following information):

Additional context
Running : > /var/log/syslog clears the errors, but they return.

Screenshot_20200618-181449

I jumped over to the support thread and saw that this is almost certainly an Nvidia driver issue, so I guess this can be closed.

b3rs3rk commented 3 years ago

@fiservedpi I realize some time has elapsed since I addressed this; I must have missed the notification from GH. Long story short, this isn't something I can fix. The utility throws these 'good errors' every time I invoke it. If you completely remove my plugin from your system and run watch -n 2 nvidia-smi -q -x from your UnRAID console, you will still see these errors. It only seems worse with my plugin because I'm constantly invoking that process. I don't think this is something I can fix, nor is it likely fixable by the UnRAID-Nvidia folks. It has something to do with the way the hardware configuration is set up. In the support forum post for my plugin, I listed a possible way to squelch these logs so that they don't show up anymore. Maybe try that.
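For readers unfamiliar with what "constantly invoking that process" involves, here is a minimal Python sketch of running the command quoted above and reading its XML output. It is not the plugin's actual code. The helper names `query_nvidia_smi` and `parse_gpus`, the sample XML, and the element paths (`product_name`, `utilization/gpu_util`, `temperature/gpu_temp`) are assumptions based on typical nvidia-smi output and may differ between driver versions.

```python
import subprocess
import xml.etree.ElementTree as ET


def query_nvidia_smi():
    """Run the command quoted in the comment above and return its XML output.

    Each call starts a new nvidia-smi process, which is what triggers the
    kernel messages on affected systems.
    """
    result = subprocess.run(
        ["nvidia-smi", "-q", "-x"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def parse_gpus(xml_text):
    """Extract a few fields per <gpu> element; missing fields come back as None."""
    root = ET.fromstring(xml_text)
    gpus = []
    for gpu in root.iter("gpu"):
        gpus.append({
            "name": gpu.findtext("product_name"),
            "util": gpu.findtext("utilization/gpu_util"),
            "temp": gpu.findtext("temperature/gpu_temp"),
        })
    return gpus


# Hypothetical, heavily trimmed sample; real output contains many more fields.
SAMPLE = """<nvidia_smi_log>
  <gpu id="00000000:01:00.0">
    <product_name>Example GPU</product_name>
    <utilization><gpu_util>3 %</gpu_util></utilization>
    <temperature><gpu_temp>41 C</gpu_temp></temperature>
  </gpu>
</nvidia_smi_log>"""

print(parse_gpus(SAMPLE))
```

Calling `parse_gpus(query_nvidia_smi())` in a loop with a short sleep would approximate the `watch -n 2` test described above, so the log messages should appear at roughly the polling rate whether or not the plugin is installed.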

EDIT: Check this out. The Nvidia forum thread linked in this forum post says that changing the Video OpROM setting in the BIOS from Legacy to UEFI fixed these logs.

https://forums.unraid.net/topic/99194-log-filling-up-with-errors/?tab=comments#comment-915410

fiservedpi commented 3 years ago

That’s cool, thanks for following up.