Closed namruf15 closed 6 years ago
What recache method are you using?
Well, to be honest I don't know. I've only created suggested graphs from Devices menu and chosen the NET-SNMP mounted partition. I'm using Spine poller as it is stated above.
Sample output from logs: 03/21/2018 09:00:05 - PCOMMAND Device[5] Device[One_VM] WARNING: Recache Event Detected for Device
How can I check the method type about which you're asking?
What is likely happening is that you are indexing on the index provided by the data query, and that this index changes upon restart. You should do a verbose query and post the contents. You can copy everything to the clipboard from the verbose query results.
@cigamit: Hello
and that this index changes upon restart
What restart do you have in mind? I observed that the OID address is changing on some (not every) recache events
You should do a verbose query and post the contents
Could you tell me how to run such a verbose query? As I wrote above, I have only added devices to Cacti with the built-in NET-SNMP template and used the provided NET-SNMP get mounted partition data query.
Actions --> yellow gear:
When I click this verbose query gear, blank page appears:
One thing I noticed is that the re-index method defaults to "Uptime", which is described as "When the device SNMP uptime go backwards a Re-Index will be performed":
Can you tell me whether changing this option to None, for example, could be a workaround that disables this recache issue?
Yes, you can disable it and you can recache when you need this. Strange, for me this is the verbose output: You have a possible problem.
You need to check for JavaScript errors that are breaking your page. Go to your browser's developer tools while on the page, and then go to 'Console'. From there, go back to the page and press the verbose query button. Look for errors in the 'Console' after you press it. Post back what you find.
Ok, I can see some errors connected to permission denied. This happens to me also sometimes on other sites (when I click Devices or some other tabs). Refresh page resolves the issue. Error below:
To me, it almost sounds like a timeout issue. In the log where the PCOMMAND line was displayed, did you also see pairs of lines like these:
Recache for Device, data query <id>
Recache successful.
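One way to check for those pairs is to filter the Cacti log for recache-related lines. This is only a sketch: the function name `recache_lines` is made up here, and the log path is left as an argument because its location depends on the install.

```shell
# Sketch only: recache_lines is a hypothetical helper name.
# Print every recache-related line from the given Cacti log file.
# The patterns are the three messages quoted in this thread.
recache_lines() {
    grep -E 'Recache (Event Detected|for Device|successful)' "$1"
}
```

Call it as `recache_lines <path-to-cacti.log>` and compare the timestamps of the warnings with the lines that follow them.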
The JavaScript issue has been reported all over the internet. It's likely one of your Firefox Plugins. You should start disabling until it goes away. I also saw some people saying that this was also a bug in Firefox prior to release 48. Not sure which is your case, but it's definitely browser or browser add-on related.
By the way, you still have not provided the details we were looking for. After the verbose query, there will be a copy icon at the very right of the verbose output. Click that icon, and your verbose output will be copied to the clipboard. Then paste that output here.
The alternative is to do it from the command line to verify the data is coming back OK. So assuming you are using Net-SNMP - Get Monitored Partitions via snmp v2, it would be:
snmpwalk -c <community> -v 2c <ip> 1.3.6.1.4.1.2021.9.1
This will bring back all the various fields we are interested in from the device. Assuming that returns OK, as @cigamit suggests, we would really need the debug output from the verbose query to see what's being interpreted.
I would also recommend using the above walk command before and after you mount the partitions that cause the issue to see what the differences are.
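A minimal sketch of that before/after comparison, assuming the two walks were saved to files first. The function name `walk_diff` and the file names are made up for this example. Walking only the dskPath column (1.3.6.1.4.1.2021.9.1.2) instead of the whole table keeps the constantly changing usage counters out of the diff.

```shell
# Sketch only: walk_diff and the file names are hypothetical.
# Capture the walks yourself first, for example:
#   snmpwalk -c <community> -v 2c <ip> 1.3.6.1.4.1.2021.9.1.2 > before.txt
#   (mount the partition that causes the issue)
#   snmpwalk -c <community> -v 2c <ip> 1.3.6.1.4.1.2021.9.1.2 > after.txt

# Print the lines that differ between two saved walk outputs.
walk_diff() {
    # diff exits non-zero when the files differ; that is the interesting
    # case here, so the exit status is deliberately swallowed.
    diff "$1" "$2" || true
}
```

Running `walk_diff before.txt after.txt` then shows which index now points at which path.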
The verbose query for one of the devices for NET-SNMP get mounted partition:
Data Query Debug Information
Total: 0.000000, Delta: 0.000000, Running data query [3].
Total: 0.000000, Delta: 0.000000, Found type = '3' [SNMP Query].
Total: 0.000000, Delta: 0.000000, Found data query XML file at '/var/www/html/resource/snmp_queries/net-snmp_disk.xml'
Total: 0.000000, Delta: 0.000000, XML file parsed ok.
Total: 0.000000, Delta: 0.000000, missing in XML file, 'Index Count Changed' emulated by counting oid_index entries
Total: 0.020000, Delta: 0.020000, Executing SNMP walk for list of indexes @ '.1.3.6.1.4.1.2021.9.1.1' Index Count: 12
Total: 0.020000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2021.9.1.1.1' value: '1'
Total: 0.020000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2021.9.1.1.2' value: '2'
Total: 0.020000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2021.9.1.1.3' value: '3'
Total: 0.020000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2021.9.1.1.4' value: '4'
Total: 0.020000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2021.9.1.1.5' value: '5'
Total: 0.020000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2021.9.1.1.6' value: '6'
Total: 0.020000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2021.9.1.1.7' value: '7'
Total: 0.020000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2021.9.1.1.8' value: '8'
Total: 0.020000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2021.9.1.1.9' value: '9'
Total: 0.020000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2021.9.1.1.10' value: '10'
Total: 0.020000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2021.9.1.1.11' value: '11'
Total: 0.020000, Delta: 0.000000, Index found at OID: '.1.3.6.1.4.1.2021.9.1.1.12' value: '12'
Total: 0.020000, Delta: 0.000000, Located input field 'dskIndex' [walk]
Total: 0.040000, Delta: 0.020000, Executing SNMP walk for data @ '.1.3.6.1.4.1.2021.9.1.1'
Found item [dskIndex='1'] index: 1 [from value]
Found item [dskIndex='2'] index: 2 [from value]
Found item [dskIndex='3'] index: 3 [from value]
Found item [dskIndex='4'] index: 4 [from value]
Found item [dskIndex='5'] index: 5 [from value]
Found item [dskIndex='6'] index: 6 [from value]
Found item [dskIndex='7'] index: 7 [from value]
Found item [dskIndex='8'] index: 8 [from value]
Found item [dskIndex='9'] index: 9 [from value]
Found item [dskIndex='10'] index: 10 [from value]
Found item [dskIndex='11'] index: 11 [from value]
Found item [dskIndex='12'] index: 12 [from value]
Total: 0.040000, Delta: 0.000000, Located input field 'dskPath' [walk]
Total: 0.060000, Delta: 0.020000, Executing SNMP walk for data @ '.1.3.6.1.4.1.2021.9.1.2'
Found item [dskPath='/'] index: 1 [from value]
Found item [dskPath='/var'] index: 2 [from value]
Found item [dskPath='/'] index: 3 [from value]
Found item [dskPath='/run'] index: 4 [from value]
Found item [dskPath='/dev/shm'] index: 5 [from value]
Found item [dskPath='/run/lock'] index: 6 [from value]
Found item [dskPath='/sys/fs/cgroup'] index: 7 [from value]
Found item [dskPath='/boot'] index: 8 [from value]
Found item [dskPath='/opt'] index: 9 [from value]
Found item [dskPath='/home'] index: 10 [from value]
Found item [dskPath='/tmp'] index: 11 [from value]
Found item [dskPath='/media/burak'] index: 12 [from value]
Total: 0.060000, Delta: 0.000000, Located input field 'dskDevice' [walk]
Total: 0.080000, Delta: 0.020000, Executing SNMP walk for data @ '.1.3.6.1.4.1.2021.9.1.3'
Found item [dskDevice='/dev/dm-0'] index: 1 [from value]
Found item [dskDevice=''] index: 2 [from value]
Found item [dskDevice='/dev/dm-0'] index: 3 [from value]
Found item [dskDevice='tmpfs'] index: 4 [from value]
Found item [dskDevice='tmpfs'] index: 5 [from value]
Found item [dskDevice='tmpfs'] index: 6 [from value]
Found item [dskDevice='tmpfs'] index: 7 [from value]
Found item [dskDevice='/dev/xvda1'] index: 8 [from value]
Found item [dskDevice='/dev/mapper/vg_sys-lv_opt'] index: 9 [from value]
Found item [dskDevice='/dev/mapper/vg_sys-lv_home'] index: 10 [from value]
Found item [dskDevice='/dev/mapper/vg_sys-lv_tmp'] index: 11 [from value]
Found item [dskDevice='//10.42.248.2/IAV_WCDMA'] index: 12 [from value]
Total: 0.080000, Delta: 0.000000, Update data query sort cache complete
Total: 0.080000, Delta: 0.000000, Updated data query index ordering
Total: 0.090000, Delta: 0.000000, Update re-index cache complete
Total: 0.090000, Delta: 0.000000, Update graph data query cache complete
Total: 0.090000, Delta: 0.000000, Update data source data query cache complete
Total: 0.090000, Delta: 0.000000, Update data query cache complete
Total: 0.090000, Delta: 0.010000, Update poller cache from query complete
Total: 0.090000, Delta: 0.000000, Automation execute data query complete
Total: 0.090000, Delta: 0.000000, Plugin hooks complete
Interesting that your root partition seems to be mapped twice. I've not seen that before. So if you managed the verbose query on a different device are you still getting the timeout when verbose querying the problematic device?
Yea, that is going to mess up the sorting. Likely a config issue. We can not fix a PEBKAC.
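The duplicate is easy to miss by eye (dskPath '/' shows up at both index 1 and index 3 in the verbose output), so here is a rough sketch for spotting it from the command line. It assumes snmpwalk prints lines of the form `UCD-SNMP-MIB::dskPath.1 = STRING: /`; the function name `dup_paths` is made up for this example.

```shell
# Sketch only: dup_paths is a hypothetical helper name.
# Read snmpwalk output for the dskPath column on stdin and print every
# mount path that is reported more than once.
dup_paths() {
    awk -F' = STRING: ' '
        # Strip the quotes that snmpwalk adds when the MIB is not loaded.
        NF == 2 { gsub(/"/, "", $2); count[$2]++ }
        END { for (p in count) if (count[p] > 1) print p }
    ' | sort
}
```

Usage would look like `snmpwalk -c <community> -v 2c <ip> 1.3.6.1.4.1.2021.9.1.2 | dup_paths`. No output means every path is reported only once.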
@netniV The timeout issue was caused by my browser, so it was an environmental problem not related to SNMP. Can you tell me how I can debug this issue and get rid of the doubled root partition in the SNMP response? Running df -h returns the correct number of partitions:
user@user:~$ df -h
Filesystem                  Size  Used Avail Use% Mounted on
/dev/dm-0                   9.1G  4.6G  4.1G  53% /
udev                         10M     0   10M   0% /dev
tmpfs                       1.6G  9.2M  1.6G   1% /run
tmpfs                       4.0G   19M  3.9G   1% /dev/shm
tmpfs                       5.0M  4.0K  5.0M   1% /run/lock
tmpfs                       4.0G     0  4.0G   0% /sys/fs/cgroup
/dev/xvda1                  180M   55M  112M  33% /boot
/dev/mapper/vg_sys-lv_opt    44G  2.1G   40G   6% /opt
/dev/mapper/vg_sys-lv_tmp    14G   36M   13G   1% /tmp
/dev/mapper/vg_sys-lv_home   28G   11G   18G  39% /home
//10.45.249.2/SHARE         311T  276T   35T  89% /media/share1
tmpfs                       800M   24K  800M   1% /run/user/1000
OK, so the simple answer to this is that you are asking net-snmp to include it more than once. Now, you personally probably haven't touched any of that. But because you are using the default package, I guarantee that it shows:
disk / 10000
includeAllDisks 10%
Or something similar in /etc/snmp/snmpd.conf (the file may be in a slightly different location on your OS). As soon as I commented out the disk / 10000 line by adding a hash at the front, my issue went away.
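For anyone who has to make this change on many VMs, a rough sketch of scripting it follows. The function name `comment_root_disk` is made up, and the pattern only covers an uncommented `disk / <threshold>` line like the one above, so check the file by hand if yours looks different.

```shell
# Sketch only: comment_root_disk is a hypothetical helper name.
# Put a hash in front of an uncommented "disk / ..." directive in the
# given snmpd.conf. A copy of the original is kept next to it as <file>.bak.
comment_root_disk() {
    sed -E -i.bak 's,^disk[[:space:]]+/[[:space:]],#&,' "$1"
}
```

Something like `comment_root_disk /etc/snmp/snmpd.conf` run as root, followed by a restart of snmpd, should then leave only the includeAllDisks entry reporting the root partition.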
@netniV: After I change this on all affected VM hosts, should I remove the device and add it to Cacti again? Or do I need to do something else so that Cacti picks up the change (apart from restarting snmpd on the VM, of course)?
The change should just be picked up on the next polling cycle.
Interesting, it is working :). I hope the partition won't change again in a few days. If everything is OK then I will mark the issue as closed, thanks!
Hello, it looks like your advice resolved my issue. Thanks for all the help!
Cool. If you can close the issue that would help :)
I'll go ahead and close. Almost done trolling for the day anyway.
Expected behavior: Cacti correctly measures the size of a mounted partition on a remote host, using NET-SNMP templates.
Wrong behavior: When a recache event occurs, Cacti changes the monitored partition of the remote host (the OID address changes), so Cacti starts to monitor the wrong partition.
Description: Hello, I'm using Cacti 1.1.36 installed on Debian 8. I'm using it to check the sizes of certain partitions on virtual machines. To do so, I've created a bunch of devices in Cacti, where each device is one VM. In the next step I generated a few graphs using Data Query [Net-SNMP - Get Monitored Partitions].
I've chosen two partitions to monitor (root and /home), but the VM has a few others as well. When a recache event shows up in the Cacti logs, Cacti sometimes starts pointing at the wrong partition (it changes the OID address). For example, after one recache event Cacti is monitoring the /tmp partition instead of /home. This is very annoying because I'm also using the thold plugin, which sends email alerts to users when thresholds are exceeded, and after such a wrong recache event the plugin checks completely different values. Because of that, users receive incorrect email notifications.
Technical information: