danielfdickinson closed this issue 7 years ago
@hnyman @jow- Oh and the issue I'm having with luci-app-statistics is that I get /usr/lib/lua/luci/dispatcher.lua:460: Failed to execute call dispatcher target for entry '/admin/statistics/graph/cpu'. The called action terminated with an exception:
/usr/lib/lua/luci/util.lua:196: attempt to get length of local 'str' (a nil value)
stack traceback:
[C]: in function 'assert'
/usr/lib/lua/luci/dispatcher.lua:460: in function 'dispatch'
/usr/lib/lua/luci/dispatcher.lua:141: in function </usr/lib/lua/luci/dispatcher.lua:140>
This happens when selecting any actual graph. Tracing back reveals that uci:get and uci:get_all are returning nil.
Ok, it doesn't look like the merges are pulling in bad trees, just some of them were based on fairly old history. I'm not sure what's up.
@cshoredaniel
Which commits are you referring to, exactly? Just saying that the "merge graph via gitk looks rather abnormal" does not provide any useful details.
To my knowledge there has been no history rewriting via rebase / git force push in LuCI.
Jow has actually used GitHub settings to protect all LuCI branches against deletion and force push (and thus also rebase).
(To me the current network graph in Github looks strange/complicated mostly because you have so many rapid merge actions in cshore-firmware repo that it overflows everything else: https://github.com/openwrt/luci/network )
Ok, it doesn't look like the merges are pulling in bad trees, just some of them were based on fairly old history. I'm not sure what's up.
People do submit PRs from old sources, and if the PR handling gets prolonged, the originally "fresh" branch may get old.
The line that causes the traceback is actually applications/luci-app-statistics/controller/luci_statistics/luci_statistics.lua line 145:
local spans = luci.util.split( uci:get( "luci_statistics", "collectd_rrdtool", "RRATimespans" ), "%s+", nil, true )
where luci.util.split is getting a nil value (hence the error message). The config file exists, and a command-line uci get returns the expected '1hour 1day 1week ...'
[EDIT: Aside] Hmmm... I didn't realize actions in my fork affected the graphs for the main repo. That's counter-productive. I'll have to alter my workflow, because that's not a good side effect.
This is why I'd probably make a good QA tester; I regularly do the unexpected which means I find code paths that have problems because the action wasn't considered.
[EDIT]: It seems the whole concept of keeping separate patch branches (for easier PRs, and to ensure testing starts from upstream and adds only the current version of the changes) that I auto-merge into a master working branch for testing doesn't play well with GitHub.
I've built from scratch, so it's not an issue of needing a distclean, and it also occurs with a snapshot.
I am, however, embedding the config file in my image, and I wonder if that is the issue. [EDIT] Actually, I used to embed the config; now the one in /rom is the default, which gets overwritten.
Perhaps some kind of overlayfs issue?
So far I've confirmed the issue appears between Jan 9 2017 and last week. I suspect the update to kernel 4.4.42 (overlayfs breakage?) and am testing that theory.
Oh, and yes, I've reviewed the diff of all merges I did; nothing looks relevant. [EDIT: Tried with all patches I committed reverted; no luck]
If it's not the kernel then I'll backtrack LuCI and see if it's LuCI or core.
With the LEDE 17.01 release everything seems OK (tested on ath9k and x86-64).
Which collectd-mods are installed on your system?
I'm using:
root@vpnf:~# opkg list-installed | grep collectd
collectd - 5.5.3-1
collectd-mod-conntrack - 5.5.3-1
collectd-mod-cpu - 5.5.3-1
collectd-mod-exec - 5.5.3-1
collectd-mod-interface - 5.5.3-1
collectd-mod-iwinfo - 5.5.3-1
collectd-mod-load - 5.5.3-1
collectd-mod-memory - 5.5.3-1
collectd-mod-network - 5.5.3-1
collectd-mod-olsrd - 5.5.3-1
collectd-mod-ping - 5.5.3-1
collectd-mod-rrdtool - 5.5.3-1
root@vpnf:~# uname -a
Linux vpnf 4.4.50 #0 SMP Mon Feb 20 17:13:44 2017 x86_64 GNU/Linux
root@vpnf:~# cat /etc/open*
cat: read error: Is a directory
DISTRIB_ID='LEDE'
DISTRIB_RELEASE='17.01.0'
DISTRIB_REVISION='r3205-59508e3'
DISTRIB_CODENAME='reboot'
DISTRIB_TARGET='x86/64'
DISTRIB_ARCH='x86_64'
DISTRIB_DESCRIPTION='LEDE Reboot 17.01.0 r3205-59508e3'
DISTRIB_TAINTS='no-all'
r3205-59508e3
root@vpnf:~# cat /tmp/sysinfo/*
qemu-standard-pc-i440fx-piix-1996
QEMU Standard PC (i440FX + PIIX, 1996)
I get the same error message. It seems to depend on the permissions of /etc/config/luci_statistics. If the permission is 0600 (owner RW), the error occurs. If it is 0640 (owner RW, group R), the error doesn't occur. I did check my configuration backups: the permission has been 0600 since January 2016. It did work in the past with this permission.
@e9hack Thanks for figuring out the permissions thing; I couldn't figure out what was triggering the error. Now to figure out why root user rw is not enough...
@e9hack @hnyman @jow- Actually, if /etc/config is 0750 instead of 0755, this also triggers the bug. It seems luci_statistics must drop privileges somewhere.
Actually, if /etc/config is 0750 instead of 0755, this also triggers the bug. It seems luci_statistics must drop privileges somewhere.
@jow- mentioned already in January, in the other discussion about similar issues ( https://github.com/openwrt/luci/issues/677#issuecomment-274430189 ), that the data directory needs to be world-readable because the graph renderer runs as the user "nobody". I guess that the same requirement extends to the config directory, as it contains the rrd and graph config.
So far I've confirmed issue appears between Jan 9 2017 and last week. (quoting from Jan 23)
But I do not think that anything has changed in LuCI statistics regarding this, or even in the main LuCI. This requirement might always have been there and you just never tried restricting permissions earlier; or, alternatively, something has changed elsewhere in LEDE(?) that makes the impact more visible now.
I did check my configuration backups again. Something did occur between 5th and 6th January this year: the permissions of all files in /etc/config changed from 0644 to 0600. The date in my message about permissions was wrong. It is possible that I didn't check luci statistics between 6th January 2017 and last week.
For me, I changed the permissions recently but didn't make the connection, and I missed the fact that part of the graph rendering runs as nobody instead of root. I don't like that I can't restrict all of config, but it's not the end of the world, and I'll chalk this up as 'missing documentation'.
Thanks @hnyman and @e9hack
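The permission behaviour discussed above is easy to demonstrate outside the router. A minimal sketch using a scratch directory (paths and the file contents here are illustrative, not the real /etc/config):

```shell
# Reproduce the readability difference on a scratch copy of the config.
# (Illustrative paths; on the router the files live in /etc/config.)
d=$(mktemp -d)
cfg="$d/luci_statistics"
printf 'config statistics collectd_rrdtool\n' > "$cfg"

chmod 0600 "$cfg"        # owner-only: a renderer running as "nobody" cannot read it
stat -c '%a' "$cfg"      # prints 600

chmod 0644 "$cfg"        # world-readable: the renderer can read it again
stat -c '%a' "$cfg"      # prints 644

chmod 0755 "$d"          # the directory must also allow traversal (cf. 0750 vs 0755 above)
```

On an actual device, the fix that follows from this thread would be along the lines of `chmod 0755 /etc/config` and `chmod 0644 /etc/config/luci_statistics`, so that the graph renderer running as "nobody" can read the config.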
For personal reasons no longer participating in OpenWrt/LEDE.
I was having these same problems with a fresh install of OpenWrt 17.01.4, following the luci-app-statistics install instructions. I was able to fix the problem by explicitly specifying the "Storage directory" and "Stored timespans" fields in the RRDTool plugin configuration. I used "/tmp/rrd" as the storage directory, and the default stored timespans of "600 86400 604800 2678400 31622400".
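For reference, those two fields can also be set directly in /etc/config/luci_statistics. This is a sketch only: the option names RRDBasePath and RRATimespans are my reading of the collectd_rrdtool section quoted earlier in this thread, so verify them against your own config file before relying on them:

```
config statistics 'collectd_rrdtool'
	option RRDBasePath '/tmp/rrd'
	option RRATimespans '600 86400 604800 2678400 31622400'
```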
@hnyman @jow- There is an issue (below) I'm having with luci-app-statistics in my tree that I have reproduced in the latest LEDE snapshot. I've looked at the merge graph via gitk and it looks rather abnormal to me for some of the commits. I had similar strange behaviour with lede-source at one point, and it seems to me that some kind of rebase or something has happened to the trees that confuses git merge.
I didn't detect this when I was testing the commits I pushed because I cherry-picked them into my test tree rather than merging branches. (I.e., the diff that is supposed to come with each commit is correct, but there is some funkiness in the differences between other people's trees (for a while including my own, including an archived copy I kept because I'd like to dig deeper) and the GitHub trees that doesn't show up in changed files but nonetheless changes things.)