Open bousqi opened 8 years ago
I just checked my own stats: I'm unfortunately not experiencing this issue:
First: are you using the latest script version? A git pull should ensure you are.
Which version of Python are you using?
The graphs you're mentioning use the /rrd endpoint. It is the only endpoint where we need to specify date_start and date_end parameters, which are computed here: main.py#257
Could your system clock be out of sync from time to time? I had an issue where my Raspberry Pi time wasn't correct because of wrong NTP settings. I think that during these periods, the date_start_timestamp and date_end_timestamp aren't correctly computed.
For sure (and that's too bad for us), since the plugin isn't misbehaving (at least it thinks so), there's no error log anywhere.
I'm running the head of master from your git repository. I thought Python 3 was my default interpreter, but my system is in fact using 2.7.9. I checked with both 3.4.2 and 2.7.9 and the results are the same (stuck values). Your NTP remark is interesting. My Raspberry Pi clock seems to be correct: same date on 3 different NTP-synchronized systems. Maybe the Freebox has a clock bias (that would explain why rebooting the Freebox fixes it, and why the bias increases over time). I'm thinking about it, but I don't know how to verify this. Any suggestions?
Funny thing: the graphs are also lost on the Freebox side... So it might not be an issue with fetching the values, but rather the plugin crashing the tracking on the Freebox.
I did not realize until now because I was only checking the temperature on the first page, where the values are OK:
Freebox server version is 3.3.3 (up to date).
Sure! Just create a test.py file with the following content:
import datetime
now = datetime.datetime.now() # math.ceil(time.time())
now = now.replace(second=0, microsecond=0)
date_end = now.replace(minute=now.minute - now.minute % 5) # Round to lowest 5 minutes
date_start = now - datetime.timedelta(minutes=5) # Remove 5 minutes from date_end
print(date_end)
print(date_start)
Make it executable with chmod and run it (with the same Python version as the one munin uses, to be sure), then check whether the dates are correct.
rpi-stable:/usr/local/src/munin-freebox (master) $ date
Fri Oct 7 14:12:37 CEST 2016
rpi-stable:/usr/local/src/munin-freebox (master) $ ./date.py
2016-10-07 14:10:00
2016-10-07 14:07:00
About your temperature screenshot: that's really weird indeed. Maybe repeated API calls break data storage on the Freebox side. Then our script isn't able to read these values correctly (or the API returns a stable value while there is none).
Could you try to disable our script for now, and check whether Freebox OS's stats go back to normal?
About your script output: the dates seem to be OK
Edit: these dates are not actually OK. You should have this:
2016-10-07 14:10:00
2016-10-07 14:05:00
All Freebox plugins have been removed, and munin-node restarted. I'll wait a few minutes/hours to check whether the Freebox graphs come back.
Alright, I'm fixing the dates issue - which isn't really one as long as the script is run when the minutes component of the current time is a multiple of 5
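For reference, here is a minimal sketch of what the corrected window computation could look like. It is only an illustration based on the test script above, not the actual code in main.py; the function name rrd_window and the step_minutes parameter are made up for this example. The key change is that date_start is derived from date_end rather than from now, so both bounds stay aligned on a 5-minute boundary whatever minute the script runs at.

```python
import datetime


def rrd_window(now, step_minutes=5):
    """Return (date_start, date_end) aligned on a step_minutes boundary.

    Hypothetical helper for illustration; not taken from main.py.
    """
    now = now.replace(second=0, microsecond=0)
    # Round down to the previous multiple of step_minutes
    date_end = now - datetime.timedelta(minutes=now.minute % step_minutes)
    # Subtract from date_end (not from now) so both bounds stay aligned
    date_start = date_end - datetime.timedelta(minutes=step_minutes)
    return date_start, date_end


if __name__ == "__main__":
    # Same wall-clock time as in the terminal output above
    start, end = rrd_window(datetime.datetime(2016, 10, 7, 14, 12, 37))
    print(end)    # 2016-10-07 14:10:00
    print(start)  # 2016-10-07 14:05:00
```

With the original script, the same input gave 14:07:00 as the start, because the 5 minutes were subtracted from the unrounded time.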
How many plugins are enabled on your munin server? Is the Freebox a classic one or an optical one? Latest firmware?
44 plugins, Freebox Revolution, latest firmware
(tip: I'm using Material-Freebox-OS to spare my eyes when browsing Freebox OS)
Freebox Server (r2) ?
So far the graphs are still dead on the Freebox. I'll reboot it later. I guess I'll have to add each plugin back one by one to find which one crashes the tracking on the box.
About the rrd queries: is it possible that Munin makes too many concurrent queries to the rrd API? Would it be possible to process them sequentially rather than in parallel? What would be your approach to identify the origin of this problem?
So the stats on the Freebox are still crashed. We may predict that everything will be OK after a reboot.
I've never heard of too many concurrent queries being a problem with the rrd API, even with Freebox Stats... I'm not really sure whether munin calls each plugin sequentially or in parallel. In any case, since munin is responsible for this logic and we cannot override it, we have no room for maneuver here.
We don't have access to any Freebox OS log either, so our only solution seems to be opening an issue on Freebox OS's bug tracker.
It appears that a bug is still present in freebox statistics tracking. This time I don't have any clue on the issue (no exception nor error messages).
Here is the bug behavior: after a certain period of time (from 4 days up to 1 week), some statistics get stuck. Munin always gets the same value, while if I connect to the Freebox server the values are different. Restarting the Freebox appears to fix the problem, but it is not really linked to the box itself (as the values reported on its internal web server are updated).
Here are some graphs where you can see areas where the values are stuck:
Values are updated again once the box has been restarted. This issue only concerns temperature, xdsl, traffic and switch.
Manually running the plugins gives results, but not the correct ones.