mgruben closed 9 years ago
hi mgruben,
I'd say that this [2015-05-15 21:42] [ALPM] upgraded rrdtool (1.4.9-1 -> 1.5.3-1)
could be the cause of this malfunction. I have not tested this new RRDtool version yet, so it could introduce something new that Monitorix does not support right now.
Could you please paste your Monitorix log file from right after the last start, so we can see whether there are any error messages that could give us a clue?
Thanks.
Your observation suggests that downgrading to 1.4.9-1 would fix the graphs. Indeed, this is the case: after issuing #pacman -U /var/cache/pacman/pkg/rrdtool-1.4.9-1-x86_64.pkg.tar.xz, and #systemctl restart monitorix, the "system" graph is now properly displaying values.
I will post the requested portion of my Monitorix system.rrd file this evening, when I have more time to teach myself proper rrdtool syntax (you made the awesome Monitorix; I'm not also going to make you teach me rrdtool)
Thanks for your feedback and glad to know that all is working again.
So if I understand correctly, the only graph that was affected by the new RRDtool 1.5 branch is system? I mean, the rest of the graphs were working fine with RRDtool 1.5.3?
Just let me know.
As far as I can tell that is correct. I've also been trying to get the "disk" graphs working (both under rrdtool 1.4 and 1.5) for my new SSD, but I suspect the problem there exists between my keyboard and my chair.
On May 18, 2015, at 7:54 AM, Jordi Sanfeliu notifications@github.com wrote:
OK, please don't forget to paste the log file from right after a Monitorix start, so we can see whether there is an error message from the system graph. Also check the HTTP log file (either monitorix-httpd.log for the built-in server or the log of your external HTTP server, depending on which one you are using).
Regarding the disk graph, feel free to ask whatever you need, either in a new issue, on IRC, on the mailing list, or by email directly to me.
Regards.
rrdtool is getting the better of me, so in the interest of time I thought I'd just link you to copies of the logs I've hosted; see http://76.209.20.97:9997/
The relevant times are 2015-05-16 at about 11am (this is when I restarted my machine, apparently thus completing the upgrade to the rrdtool 1.5 branch), and 2015-05-18 at about 7am (when I downgraded to the rrdtool 1.4 branch).
Thanks for sharing all this information. I think the problem is in this line:
Sun May 17 08:16:00 2015 - ERROR: while updating /var/lib/monitorix/system.rrd: /var/lib/monitorix/system.rrd: Function update_pdp_prep, case DST_GAUGE - Cannot convert '' to float
For some unknown reason there is a value that RRDtool is unable to convert to float, and hence the update process fails to save the values in the system.rrd file. That's why you are getting NaN values all the time.
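To check how often this happens, the log can be scanned for update errors of this kind. The following is only a minimal sketch: the pattern is modelled on the single error line quoted above, and the names UpdateError and find_update_errors are made up for illustration, not part of Monitorix.

```python
import re
from typing import List, NamedTuple


class UpdateError(NamedTuple):
    """One failed RRD update as reported in the Monitorix log (hypothetical type)."""
    timestamp: str
    rrd_path: str
    message: str


# Assumption: error lines look like the one quoted in this thread, i.e.
# "<timestamp> - ERROR: while updating <file.rrd>: <message>".
# Other Monitorix versions may format their log lines differently.
_ERROR_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}) - "
    r"ERROR: while updating (?P<rrd>\S+?): (?P<msg>.*)$"
)


def find_update_errors(log_text: str) -> List[UpdateError]:
    """Return every 'ERROR: while updating ...' entry found in log_text."""
    errors = []
    for line in log_text.splitlines():
        match = _ERROR_RE.match(line.strip())
        if match:
            errors.append(UpdateError(match["ts"], match["rrd"], match["msg"]))
    return errors
```

Feeding it the contents of the Monitorix log file should list each failed update together with the .rrd file it belongs to, which makes it easy to see whether only system.rrd is affected.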
The best way to help fix this is to upgrade again to the RRDtool 1.5 branch and start Monitorix with the added parameter -d system in order to debug the values collected by the system graph. Then send me the new log file.
Besides all this, I saw in your log file that you have enabled NVIDIA statistics but have not installed the official NVIDIA drivers. Please disable the gpu0 key in the lmsens graph or install the NVIDIA drivers.
Can't exec "nvidia-smi": No such file or directory at /usr/lib/monitorix/Monitorix.pm line 177.
Sun May 17 08:15:00 2015 - Monitorix::get_nvidia_data: ERROR: 'nvidia-smi' command is not installed.
Can't exec "nvidia-smi": No such file or directory at /usr/lib/monitorix/lmsens.pm line 280.
Also, it looks like you have enabled the lighttpd graph but you have not configured the Lighttpd server properly.
Sun May 17 08:15:00 2015 - lighttpd::lighttpd_update: ERROR: Unable to connect to 'http://localhost:8078/server-status?auto'.
All in all, please read the monitorix.conf(5) man page and enable only the graphs that cover your real resources.
Thanks.
(1) Your help is much appreciated, thank you for taking the time to go through this with me,
(2)(a) In Arch Linux, I modified the monitorix.service file in /etc/systemd/system/multi-user.target.wants to read, in part, ExecStart=/usr/bin/monitorix -c /etc/monitorix/monitorix.conf -p /run/monitorix.pid -d system (adding the -d system to the end, as it was not present before). I then stopped the monitorix service, issued a systemctl daemon-reload, and restarted the monitorix service.
(b) I then issued a pacman -U ~rrdtool1.5~, followed by a systemctl restart monitorix.service. As expected, the system graphs are again not displaying current values.
(c) I've copied system.rrd, /var/log/monitorix, and /var/log/monitorix-httpd to my aforementioned host, http://76.209.20.97:9997/, for your review.
(3) Regarding NVIDIA, I erroneously thought that since my nvidia value under the <graph_enable> section is set to n, I had disabled calls to NVIDIA statistics. I have now removed the line reading gpu0 = nvidia from the lmsens list (since I actually have a Radeon card, and the nvidia drivers wouldn't help me anyway).
(4) I (very) occasionally use lighttpd in conjunction with a Project Gutenberg database I'm working on, and hadn't disabled the lighttpd entry in the <graph_enable> section, mostly out of wishful thinking that I'll actually have time to devote to that project. I have now disabled that entry, both because you've suggested it and because I really don't use lighttpd enough to justify leaving the logging on.
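For anyone repeating step (2)(a), the ExecStart change can be sketched as a small text edit. This is only an illustration: add_exec_arg is a hypothetical helper, the sample unit text in the usage note contains nothing beyond the ExecStart line quoted above, and the systemctl daemon-reload and restart still have to be run by hand afterwards.

```python
def add_exec_arg(unit_text: str, extra: str) -> str:
    """Append `extra` to the ExecStart= line of a systemd unit, if not already there.

    Hypothetical helper: it only rewrites text and does not touch any file
    or talk to systemd.
    """
    out = []
    for line in unit_text.splitlines():
        stripped = line.rstrip()
        if stripped.startswith("ExecStart=") and not stripped.endswith(extra):
            line = f"{stripped} {extra}"
        out.append(line)
    return "\n".join(out) + "\n"
```

Calling add_exec_arg(text, "-d system") on a unit whose ExecStart line is /usr/bin/monitorix -c /etc/monitorix/monitorix.conf -p /run/monitorix.pid should give the line from (2)(a); calling it a second time leaves the text unchanged.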
Well, after reading your new log file I think that I have found the cause of the problem:
Tue May 19 07:47:00 2015 - system::system_update: N:0.18:0.22:0.29:224:223:1:0:0:0:0:7918332:135032:4155804:2786868:::0:0:0:0:0
Tue May 19 07:47:00 2015 - ERROR: while updating /var/lib/monitorix/system.rrd: /var/lib/monitorix/system.rrd: Function update_pdp_prep, case DST_GAUGE - Cannot convert '' to float
As you can see, there are two values that are undefined (the empty fields in :::), and while this hasn't been a problem for the old stable 1.4 branch of RRDtool, it looks like the new 1.5 branch can't deal with it.
The above commit fixes that problem, so if you download the current system.pm, the new RRDtool 1.5 branch should work fine.
Feel free to overwrite your current system.pm with this new one, and let me know how it works.
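The undefined values in the update string quoted above can also be found mechanically. The sketch below is not the code from the commit (which is not shown in this thread); it only illustrates the idea. It locates the empty fields and, as one possible remedy, replaces them with U, the marker that rrdtool update accepts for an unknown value. The function names are made up for this example.

```python
from typing import List


def empty_field_positions(update: str) -> List[int]:
    """Return the indexes of empty value fields in an RRD update string.

    Field 0 is the timestamp (here "N"), so only the fields after it are checked.
    """
    fields = update.split(":")
    return [i for i, field in enumerate(fields[1:], start=1) if field.strip() == ""]


def fill_unknown(update: str, placeholder: str = "U") -> str:
    """Replace empty value fields with a placeholder ("U" means unknown to rrdtool)."""
    fields = update.split(":")
    values = [field if field.strip() else placeholder for field in fields[1:]]
    return ":".join([fields[0]] + values)
```

Applied to the string logged at 07:47:00, empty_field_positions reports the two fields right after 2786868, and fill_unknown turns 2786868:::0 into 2786868:U:U:0.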
Thanks.
I downloaded the 'raw' system.pm from the above commit into ~/system.pm, then ran # chmod 644 ~/system.pm, # chown root:root system.pm, # mv /usr/lib/monitorix/system.pm{,.bak}, and finally # mv ~/system.pm /usr/lib/monitorix/system.pm
I see actual, non-NaN values in the three "System load average and usage" graphs, so on its face this commit has fixed my issue with rrdtool 1.5.3-1.
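For reference, the backup-then-replace steps above can be sketched as one small function. This is an illustration only: replace_with_backup is a hypothetical name, the chown step is left out because it needs root, and any existing .bak file would be overwritten just as with mv.

```python
import os
import shutil
from pathlib import Path


def replace_with_backup(new_file: Path, target: Path, mode: int = 0o644) -> Path:
    """Move `target` aside to `<target>.bak`, then move `new_file` into its place.

    Mirrors the chmod / mv / mv sequence described above; ownership is not
    changed here (assumption: the caller handles chown separately).
    """
    backup = target.with_name(target.name + ".bak")
    os.chmod(new_file, mode)                  # chmod 644 on the downloaded file
    shutil.move(str(target), str(backup))     # mv system.pm system.pm.bak
    shutil.move(str(new_file), str(target))   # mv ~/system.pm .../system.pm
    return backup
```

If the new module misbehaves, moving the returned .bak file back over the target restores the previous version.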
I have also copied /var/lib/monitorix/system.rrd, /var/log/monitorix, and /var/log/monitorix-httpd onto my host for your review.
As far as I can tell, this issue has been solved.
Perfect, thanks for your feedback. Best regards.
Yesterday, the three graphs in the "System load average and usage" section (system.rrd) started reporting only nan values, while other graphs appear to have been unaffected. (xref this thread)
The only thing I can think of that would have interfered with Monitorix' recording abilities is upgrading a variety of packages on my host system (below).
I'm happy to provide additional information; I'm just not sure what would be helpful.