mikaku / Monitorix

Monitorix is a free, open source, lightweight system monitoring tool.
https://www.monitorix.org
GNU General Public License v2.0

Unable to display "Core activity" graph #421

Open v3nko opened 2 years ago

v3nko commented 2 years ago

Monitorix is unable to display the "Core activity" graph in the "Interrupt activity" section.

image

URL: <host>/monitorix.cgi?mode=localhost&graph=_int3&when=1day&color=black

Related log messages:

Use of uninitialized value in multiplication (*) at /usr/lib/monitorix/int.pm line 1190.
Use of uninitialized value in multiplication (*) at /usr/lib/monitorix/int.pm line 1191.

When I try to open this graph separately (by clicking the indicator on the image), a "Not found" error is displayed:

image

However, if I change the period in the URL of the separately opened graph from 1day to something else, such as 4day, the graph is displayed (but empty):

image

Once I change the period in the first URL (monitorix.cgi?mode=localhost&graph=_int3&when=4day&color=black), the graph for that period breaks as well, even in the separate window.

image

Monitorix v3.14.0

mikaku commented 2 years ago

Please paste the <int> section of your configuration file here, along with the output of the command cat /proc/stat (assuming you are using Linux).

v3nko commented 2 years ago

Configuration:

# INT graph
# -----------------------------------------------------------------------------
<int>
        rigid = 0, 0, 0
        limit = 100, 100, 100
</int>

Output of cat /proc/stat:

cpu  30170802 170823 4737940 986556187 3138262 0 2465258 0 0 0
cpu0 8302477 41307 1160787 248239376 710654 0 391738 0 0 0
cpu1 8118832 44706 1118046 246485279 894017 0 1517967 0 0 0
cpu2 6744486 42703 1256266 242630507 839756 0 109545 0 0 0
cpu3 7005005 42106 1202840 249201025 693834 0 446006 0 0 0
intr 6541409100 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3522784024 34797620 166777191 41 27 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
ctxt 4612101451
btime 1650722980
processes 6737090
procs_running 2
procs_blocked 0
softirq 3666704794 116 1371584846 4411026 177309994 36683527 0 35172711 1435082753 32955 606426866

mikaku commented 2 years ago

Thanks. Sorry, could you also paste the output of cat /proc/interrupts?

v3nko commented 2 years ago

No problem.

            CPU0       CPU1       CPU2       CPU3
   8:          0          0          0          0   IO-APIC    8-edge      rtc0
   9:          0          0          0          0   IO-APIC    9-fasteoi   acpi
  16:          0          0          0          0   IO-APIC   16-fasteoi   i801_smbus, snd_hda_intel:card0
 122:          0      92533 4129223409      62335   PCI-MSI 327680-edge      xhci_hcd
 123:          0   36009173          0          0   PCI-MSI 376832-edge      ahci[0000:00:17.0]
 124:          0         19          0  168738760   PCI-MSI 524288-edge      enp1s0
 125:          0          0          0         41   PCI-MSI 360448-edge      mei_me
 126:         27          0          0          0   PCI-MSI 1048576-edge      iwlwifi
 NMI:       8952      10661      15980       7872   Non-maskable interrupts
 LOC:  616487748  597639627  927265749  615812789   Local timer interrupts
 SPU:          0          0          0          0   Spurious interrupts
 PMI:       8952      10661      15980       7872   Performance monitoring interrupts
 IWI:          0          3          0          0   IRQ work interrupts
 RTR:          1          0          0          0   APIC ICR read retries
 RES:   39431195   21064663   13018076   10415529   Rescheduling interrupts
 CAL:   18374225   17068164   17949614   16774088   Function call interrupts
 TLB:   29479986   28443384   28579709   27584445   TLB shootdowns
 TRM:          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0   Threshold APIC interrupts
 DFR:          0          0          0          0   Deferred Error APIC interrupts
 MCE:          0          0          0          0   Machine check exceptions
 MCP:       8570       8571       8571       8571   Machine check polls
 ERR:          0
 MIS:          0
 PIN:          0          0          0          0   Posted-interrupt notification event
 NPI:          0          0          0          0   Nested posted-interrupt event
 PIW:          0          0          0          0   Posted-interrupt wakeup event

mikaku commented 2 years ago

Monitorix is unable to display the "Core activity" graph in the "Interrupt activity" section.

This is because your system only has 8 active interrupts (8, 9, 16, 122, 123, 124, 125 and 126). The current code in int.pm is quite outdated: it was developed a long time ago and doesn't know how to deal with current interrupt names. This module urgently needs some love, and that is already on my to-do list.
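
To see which numbered interrupts are actually counting on a given machine, the intr line of /proc/stat can be inspected directly. The following is a minimal sketch, not Monitorix code: the function name active_irqs_from_stat and the sample text are hypothetical. It relies on the documented layout of that line, where the first number is the grand total and each following number is the counter for IRQ 0, 1, 2 and so on.

```python
def active_irqs_from_stat(stat_text):
    """Return {irq_number: count} for the non-zero slots of the 'intr' line.

    Hypothetical helper for illustration; it is not part of Monitorix.
    """
    for line in stat_text.splitlines():
        if line.startswith("intr "):
            fields = line.split()
            # fields[0] is the label "intr" and fields[1] is the grand total;
            # every remaining field is the counter for IRQ 0, 1, 2, ...
            counts = [int(value) for value in fields[2:]]
            return {irq: n for irq, n in enumerate(counts) if n > 0}
    return {}  # no 'intr' line found


# Made-up sample in the /proc/stat format (not the data posted above).
sample = "cpu  1 2 3 4\nintr 30 0 0 10 0 20\nctxt 5\n"
print(active_irqs_from_stat(sample))  # {2: 10, 4: 20}
```

On a real system the text would come from reading /proc/stat. Interrupts that are registered but have never fired (such as 8, 9 and 16 in the output above) do not show up in this result, because their counters are zero.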

However, if I change the period in the URL of the separately opened graph from 1day to something else, such as 4day, the graph is displayed (but empty):

I was unable to reproduce this. Moreover, you should see the path /monitorix/imgs/int2z.1day.png, not //imgs/int2z.1day.png.

Anyway, I think you already have an old int2z.4day.png file in imgs/ that was created some time ago. Check the timestamp of this file; it should tell you more.
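
One way to check those timestamps is sketched below. It is only an illustration: the helper name image_timestamps is hypothetical, and the directory passed to it is an assumption that must be adjusted to wherever the imgs/ directory lives on the system in question.

```python
import time
from pathlib import Path


def image_timestamps(imgs_dir, pattern="int2z.*.png"):
    """Return (file name, modification time) pairs, oldest first.

    Hypothetical helper for illustration; it is not part of Monitorix.
    """
    files = sorted(Path(imgs_dir).glob(pattern), key=lambda p: p.stat().st_mtime)
    return [
        (p.name, time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(p.stat().st_mtime)))
        for p in files
    ]


if __name__ == "__main__":
    # Assumed location of the image directory; change it to match your setup.
    for name, stamp in image_timestamps("/var/lib/monitorix/www/imgs"):
        print(stamp, name)
```

A file whose modification time is months or years old was not produced by the current request, which would explain a graph that appears for one period but is stale or empty.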

mikaku commented 2 years ago

Did you check the imgs/ directory to confirm whether you already have an old int2z.4day.png file?

v3nko commented 2 years ago

It seems I don't have it locally now. I believe that once I tried to change the range in the general URL, the graph also became inaccessible via the direct URL. This is what I have:

***:/var/lib/monitorix/www/imgs$ ll | grep int2z.4
-rw-r--r-- 1 www-data nogroup  14016 Aug 28  2017 int2z.48hour.png
-rw-r--r-- 1 www-data nogroup  14632 Sep  8  2017 int2z.4hour.png

Moreover, you should see the path /monitorix/imgs/int2z.1day.png, not //imgs/int2z.1day.png.

This is probably because I changed the base URL. I have a separate subdomain for Monitorix, so I prefer a cleaner path.

base_url = /
base_cgi = /

mikaku commented 2 years ago

This is probably because I changed the base URL. I have a separate subdomain for Monitorix, so I prefer a cleaner path.

Ah OK.

I have a server here that shows results similar to the ones you posted in the first message. As I said, this is because Monitorix expects specific interrupt numbers for each graph. The code is too old now and must be rewritten to create graphs dynamically, depending on the interrupt numbers present in the system.
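
As an illustration of what discovering the interrupts dynamically could look like, here is a minimal sketch that reads the labels from /proc/interrupts instead of assuming fixed numbers. It is not the planned int.pm rewrite, it is written in Python rather than Perl, and the function name parse_interrupts is hypothetical. The sample rows are taken from the output posted earlier in this thread.

```python
def parse_interrupts(text):
    """Map each interrupt label (e.g. "122", "LOC") to its per-CPU counters.

    Hypothetical helper for illustration; it is not part of Monitorix.
    """
    lines = text.splitlines()
    if not lines:
        return {}
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not an interrupt row
        counts = []
        for field in rest.split()[:ncpu]:
            if not field.isdigit():
                break  # reached the description before ncpu counters
            counts.append(int(field))
        result[label.strip()] = counts
    return result


sample = """\
            CPU0       CPU1       CPU2       CPU3
 122:          0      92533 4129223409      62335   PCI-MSI 327680-edge      xhci_hcd
 LOC:  616487748  597639627  927265749  615812789   Local timer interrupts
 ERR:          0
"""
table = parse_interrupts(sample)
numbered = sorted(int(label) for label in table if label.isdigit())
print(numbered)      # numbered IRQs found on this system
print(table["ERR"])  # rows such as ERR carry a single counter
```

With a table like this, the set of graphs could follow whatever labels the kernel reports, rather than a fixed list of interrupt numbers.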

arigit commented 12 months ago

Same issue here with Monitorix 3.15.0.

The output of /proc/interrupts is:

            CPU0       CPU1       CPU2       CPU3       
   8:          0          0          0          0  IR-IO-APIC    8-edge      rtc0
   9:          0          2          0          0  IR-IO-APIC    9-fasteoi   acpi
  16:          0          0          0          2  IR-IO-APIC   16-fasteoi   i801_smbus
 120:          0          0          0          0  DMAR-MSI    0-edge      dmar0
 121:          0          0          0          0  DMAR-MSI    1-edge      dmar1
 122:          0          0          0          0  IR-PCI-MSI-0000:00:1c.0    0-edge      PCIe PME
 123:          0          0          0          0  IR-PCI-MSI-0000:00:1d.0    0-edge      PCIe PME
 124:          0          0          0          0  IR-PCI-MSI-0000:00:0d.0    0-edge      xhci_hcd
 125:      69565     102668     109366       5187  IR-PCI-MSI-0000:00:17.0    0-edge      ahci[0000:00:17.0]
 126:  139976163          0          0          0  IR-PCI-MSI-0000:00:14.0    0-edge      xhci_hcd
 127:          0          0        283  448869699  IR-PCI-MSIX-0000:01:00.0    0-edge      enp1s0
 128:      47546      65555      43776          0  IR-PCI-MSIX-0000:02:00.0    0-edge      nvme0q0
 129:    1575231          0          0          0  IR-PCI-MSIX-0000:02:00.0    1-edge      nvme0q1
 130:          0    1608607          0          0  IR-PCI-MSIX-0000:02:00.0    2-edge      nvme0q2
 131:          0          0    1666757          0  IR-PCI-MSIX-0000:02:00.0    3-edge      nvme0q3
 132:          0          0          0    1575722  IR-PCI-MSIX-0000:02:00.0    4-edge      nvme0q4
 133:       5408   12658846   37600107          0  IR-PCI-MSI-0000:00:02.0    0-edge      i915
 134:         76          0          0          0  IR-PCI-MSI-0000:00:16.0    0-edge      mei_me
 135:          0      11170          0          0  IR-PCI-MSI-0000:00:1f.3    0-edge      snd_hda_intel:card0
 NMI:      11260      10694      10441      12708   Non-maskable interrupts
 LOC:  255888860  257442068  270110441  512147563   Local timer interrupts
 SPU:          0          0          0          0   Spurious interrupts
 PMI:      11260      10694      10441      12708   Performance monitoring interrupts
 IWI:    1794761    8453046   20410090    1917361   IRQ work interrupts
 RTR:          0          0          0          0   APIC ICR read retries
 RES:    9347765    9430997   11060128    4130392   Rescheduling interrupts
 CAL:   22005607   18934750   19604413   28558271   Function call interrupts
 TLB:    3454214    3466218    3580003    3488147   TLB shootdowns
 TRM:         68         68         68         68   Thermal event interrupts
 THR:          0          0          0          0   Threshold APIC interrupts
 DFR:          0          0          0          0   Deferred Error APIC interrupts
 MCE:          0          0          0          0   Machine check exceptions
 MCP:       1233       1234       1234       1234   Machine check polls
 ERR:          0
 MIS:          0
 PIN:          0          0          0          0   Posted-interrupt notification event
 NPI:          0          0          0          0   Nested posted-interrupt event
 PIW:          0          0          0          0   Posted-interrupt wakeup event

This is an Intel N100 SBC box (x86_64) running kernel 6.5.0.

image

hoping that the code rewrite will happen at some point, it's a really useful graph

mikaku commented 11 months ago

The output of /proc/interrupts is:

Thanks, I really appreciate your feedback.

hoping that the code rewrite will happen at some point, it's a really useful graph

Yes, I know. As soon as time permits I'll start rewriting this graph with some ideas I already have.