Closed VivaldiKF closed 4 years ago
Hi,
I have the same problem here as well.
pkg install pkgconf
pkg install bash
pkg install e2fsprogs-libuuid
pkg install libuv
pkg add http://pkg.freebsd.org/FreeBSD:11:amd64/latest/All/Judy-1.0.5_2.txz
pkg add http://pkg.freebsd.org/FreeBSD:11:amd64/latest/All/python36-3.6.9_3.txz
ln -s /usr/local/lib/libjson-c.so /usr/local/lib/libjson-c.so.4
pkg add http://pkg.freebsd.org/FreeBSD:11:amd64/latest/All/netdata-1.19.0.txz
I set bind to = 0.0.0.0 in netdata.conf.
service netdata onestart
outputs:
Starting netdata.
Bad -c option
/usr/local/etc/rc.d/netdata: WARNING: failed to start netdata
I'll have a look into this...
Also discovered one of the instructions is either wrong or the upstream FreeBSD package has moved or changed:
[2.4.4-RELEASE][root@pfSense.localdomain]/root: pkg add http://pkg.freebsd.org/FreeBSD:11:amd64/latest/All/python36-3.6.9.txz
pkg: http://pkg.freebsd.org/FreeBSD:11:amd64/latest/All/python36-3.6.9.txz: Not Found
Hi @VivaldiKF
I am unable to repro your exact issue:
Bad -c option
See:
[2.4.4-RELEASE][root@pfSense.localdomain]/root: service netdata onestart
Starting netdata.
[2.4.4-RELEASE][root@pfSense.localdomain]/root: ps aux | grep netdata
netdata 97253 3.0 2.8 31540 28144 - SN 03:52 0:00.59 /usr/local/bin/python3.6 /usr/local/libexec/netdata/plugins.d/python.d.plugin 1
netdata 95608 0.0 1.6 22372 15904 - SN 03:52 0:00.09 /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid
netdata 97891 0.0 0.7 12728 7448 - SN 03:52 0:00.00 /usr/local/libexec/netdata/plugins.d/apps.plugin 1
root 34574 0.0 0.0 408 324 0 R+ 03:52 0:00.00 grep netdata
I followed the same steps as you. Are you able to give this another go and let me know if the problem still persists? (It's possible it was fixed and a hotfixed package was pushed to the FreeBSD package repo.)
FWIW I have the same command_args that contains the -c option:
[2.4.4-RELEASE][root@pfSense.localdomain]/root: grep '\-c' < /usr/local/etc/rc.d/netdata
command_args="-c -f ${procname} -u ${netdata_user} -P ${netdata_pid} ${netdata_args}"
-c is supposed to be an option to the daemon tool:
-c Change the current working directory to the root ("/").
It's unclear to me why this isn't working on your pfSense environment for either of you @VivaldiKF @rdnsx. Can either of you show me the output of daemon --help on your systems?
FYI - a long discussion on the subject is in #3469.
Hi @prologic,
thank you for your effort, I really appreciate it. I gave it another go, but have the same issue. My first attempt was 10 hours ago and the second attempt 1 hour ago, so I'm sure I was using the latest netdata package. (Last push was 2020-Jan-01 04:50)
Here is the output of daemon --help:
daemon --help
daemon: illegal option -- -
usage: daemon [-cfrS] [-p child_pidfile] [-P supervisor_pidfile]
[-u user] [-o output_file] [-t title]
[-l syslog_facility] [-s syslog_priority]
[-T syslog_tag] [-m output_mask] [-R restart_delay_secs]
command arguments ...
This doesn't seem to be helpful ;)
Can either/both of you confirm the output of uname -a on your pfSense instances?
[2.4.4-RELEASE][root@pfSense.localdomain]/root: uname -a
FreeBSD pfSense.localdomain 11.2-RELEASE-p10 FreeBSD 11.2-RELEASE-p10 #9 4a2bfdce133(RELENG_2_4_4): Wed May 15 18:54:42 EDT 2019 root@buildbot1-nyi.netgate.com:/build/ce-crossbuild-244/obj/amd64/ZfGpH5cd/build/ce-crossbuild-244/pfSense/tmp/FreeBSD-src/sys/pfSense amd64
Unless I'm running something different to you I can't explain the Bad -c option, and you have also confirmed that you have the -c option for the daemon tool.
Here's my uname -a output:
[2.4.4-RELEASE][root@pfsense.localdomain]/root: uname -a
FreeBSD pfsense.localdomain 11.2-RELEASE-p10 FreeBSD 11.2-RELEASE-p10 #9 4a2bfdce133(RELENG_2_4_4): Wed May 15 18:54:42 EDT 2019 root@buildbot1-nyi.netgate.com:/build/ce-crossbuild-244/obj/amd64/ZfGpH5cd/build/ce-crossbuild-244/pfSense/tmp/FreeBSD-src/sys/pfSense amd64
Yeah, so it's pretty much the same version as what I have here. Do you mind doing a fresh install at all? (A fresh re-install of pfSense, that is.)
It's running in a company environment without a failover unit at the moment. Once a failover unit is there, maybe.
But I can't just reinstall pfSense, install netdata and then push the backup to the pfSense machine.
I'll have a think about this some more... Meanwhile can either of you try something for me? Please edit the startup config /usr/local/etc/rc.d/netdata and remove the -c option from command_args. From my understanding of the FreeBSD daemon tool this option just changes the CWD of the process to /, which isn't super important (I don't think) for a correctly functioning Netdata instance.
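For anyone following along, a minimal sketch of that edit. It works on a copy of the command_args line as quoted in this thread rather than on the live file, and the sed expression is only an illustration, not something taken from the FreeBSD port:

```shell
#!/bin/sh
# Sample line copied from /usr/local/etc/rc.d/netdata as quoted in this thread.
# Single quotes keep the ${...} references literal.
line='command_args="-c -f ${procname} -u ${netdata_user} -P ${netdata_pid} ${netdata_args}"'

# Drop only the leading "-c " that is meant for daemon(8).
stripped=$(printf '%s\n' "$line" | sed 's/^command_args="-c /command_args="/')

printf '%s\n' "$stripped"
```

Taking a backup of the rc script before changing it makes the edit easy to undo.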
I removed the -c option from command_args= in the start config /usr/local/etc/rc.d/netdata
Now it looks like:
command_args="-f ${procname} -u ${netdata_user} -P ${netdata_pid} ${netdata_args}"
service netdata onestart
Starting netdata.
Bad -c option
/usr/local/etc/rc.d/netdata: WARNING: failed to start netdata
I double-checked that the -c option in the start config is gone.
Oooh :) Wait just a minute... Can you paste me the contents of your /usr/local/etc/rc.d/netdata after undoing your change (re-add the -c option back to command_args)?
#!/bin/sh
#
# $FreeBSD: head/net-mgmt/netdata/files/netdata.in 496470 2019-03-21 15:05:27Z mmokhi $
#
# PROVIDE: netdata
# REQUIRE: LOGIN
# KEYWORD: shutdown
#
# Add the following line to /etc/rc.conf to enable netdata:
# netdata_enable (bool): Set to "NO" by default.
# Set it to "YES" to enable netdata.
# netdata_args (str): Custom additional arguments to be passed
# to netdata (default empty).
#
. /etc/rc.subr
name="netdata"
rcvar=netdata_enable
load_rc_config $name
: ${netdata_enable="NO"}
: ${netdata_user="netdata"}
: ${netdata_pid="/var/db/netdata/${name}.pid"}
procname="/usr/local/sbin/${name}"
command="/usr/sbin/daemon"
command_args="-c -f ${procname} -u ${netdata_user} -P ${netdata_pid} ${netdata_args}"
required_files="/usr/local/etc/netdata/${name}.conf"
Can you confirm that /etc/rc.conf.d is empty on your system? It is on mine too, so I'm a bit puzzled as to where/how netdata_args is coming from and what it is on your system vs. mine (where mine works okay).
Can you confirm you can start netdata manually with:
/usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid
And confirm with ps aux | grep netdata and try to hit the Web UI on port :19999 (the default).
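As a complement to the ps check, here is a rough sketch of testing the pid file written by the manual start command. The helper name pid_alive is hypothetical (it is not part of netdata or rc.subr), and the sketch assumes it is run as root so that signalling the netdata user's process is permitted:

```shell
#!/bin/sh
# pid_alive is a hypothetical helper, not part of netdata or rc.subr.
# It returns 0 when the pid file is readable and names a process we can signal.
pid_alive() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] || return 1
    # Signal 0 only checks existence and permission; nothing is delivered.
    kill -0 "$pid" 2>/dev/null
}

# Path taken from the manual start command above.
if pid_alive /var/db/netdata/netdata.pid; then
    echo "netdata appears to be running"
else
    echo "netdata does not appear to be running"
fi
```

A stale pid file left over from a crashed process can still name a live, unrelated process, so this is a quick check rather than proof.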
/etc/rc.conf.d is empty.
/usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid
2020-01-04 01:05:27: netdata INFO : MAIN : SIGNAL: Not enabling reaper
ps aux | grep netdata
netdata 86487 0.1 0.2 36836 17860 - IN 01:05 0:01.46 /usr/local/sbi
netdata 1748 0.0 0.3 38108 25564 - SN 01:05 0:01.23 /usr/local/bin
netdata 88025 0.0 0.1 12728 7260 - SN 01:05 0:00.08 /usr/local/lib
root 16778 0.0 0.0 6564 2384 0 S+ 01:07 0:00.00 grep netdata
Also available under :19999
Okay good! That works. Now I just have to continue root causing two problems (I found another):
* [ ] Figure out where/how the `Bad -c option` is coming from and fix it
* [ ] Figure out how to auto-start netdata on boot; `service netdata start` complains
Thanks so far, sir. If I can help, just let me know.
prologic: On PFSense boxes it's 'service netdata onestart'
Doesn't this only do a "one shot" run of the service? Wouldn't you want to automatically start on boot too?
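For context: as far as I understand rc.subr(8), the one prefix (as in onestart) runs the start method even when the rcvar is not set to YES, so it starts the service once and does nothing about boot. On stock FreeBSD, the boot-time switch is the one described in the rc script's own header comments, shown here as an rc.conf fragment. Whether pfSense keeps manual edits to /etc/rc.conf across reboots and upgrades is an assumption that would need checking; the instructions referenced in this thread use the Shellcmd package instead.

```shell
# /etc/rc.conf fragment (stock FreeBSD; persistence on pfSense is not confirmed here)
netdata_enable="YES"
```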
Having this problem as well on a fresh install of pfSense 2.4.4-RELEASE-p3.
Can confirm that removing the -c from /usr/local/etc/rc.d/netdata does not resolve this issue. There is something wrong with how this command is interfacing with /etc/rc.subr. I'm not 100% certain what's going wrong, but I think it has to do with the run_rc_command function at lines 1061-1069:
if [ -n "$_user" ]; then
    _doit="su -m $_user -c 'sh -c \"$_doit\"'"
fi
if [ -n "$_nice" ]; then
    if [ -z "$_user" ]; then
        _doit="sh -c \"$_doit\""
    fi
    _doit="nice -n $_nice $_doit"
fi
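To make the effect of that fragment concrete, here is a small sketch that builds the same string outside of rc.subr. The values assigned to _user and _doit are assumptions based on the netdata rc script quoted earlier in this thread (rc.subr(8) documents ${name}_user as the user to run the command as, which would be netdata_user here); the real run_rc_command does considerably more than this:

```shell
#!/bin/sh
# Assumed inputs: roughly what run_rc_command would see for the netdata script.
_user="netdata"
_doit="/usr/sbin/daemon -c -f /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid"

# Same wrapping as the rc.subr fragment above.
if [ -n "$_user" ]; then
    _doit="su -m $_user -c 'sh -c \"$_doit\"'"
fi

# Print the command line that rc.subr would go on to run.
printf '%s\n' "$_doit"
```

The daemon command line ends up nested inside two levels of quoting, so it only reaches the inner sh intact if both levels survive.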
In fact, when I run
truss -dae service netdata onestart
The portion that fails reads:
0.072540115 eaccess("/usr/local/etc/netdata/netdata.conf",R_OK) = 0 (0x0)
Starting netdata.
0.072696904 write(1,"Starting netdata.\n",18) = 18 (0x12)
0.072853834 stat("/sbin/limits",0x7fffffffdf08) ERR#2 'No such file or directory'
0.072933153 stat("/bin/limits",0x7fffffffdf08) ERR#2 'No such file or directory'
0.073008333 stat("/usr/sbin/limits",0x7fffffffdf08) ERR#2 'No such file or directory'
0.073083373 stat("/usr/bin/limits",{ mode=-r-xr-xr-x ,inode=12923332,size=20880,blksize=32768 }) = 0 (0x0)
0.073185053 vfork() = 38855 (0x97c7)
Bad -c option
0.083937599 wait4(-1,{ EXITED,val=2 },0x0,0x0) = 38855 (0x97c7)
When I run:
truss -fdae service netdata onestart
I get:
24980: 0.473577151 execve("/bin/sh",[ "/bin/sh", "-c", "sh", "-c", ""/usr/sbin/daemon", "-c", "-f", "/usr/local/sbin/netdata", "-u", "netdata", "-P", "/var/db/netdata/netdata.pid", """ ],[ "PATH=/sbin:/bin:/usr/sbin:/usr/bin", "PWD=/", "HOME=/", "RC_PID=23287" ]) = 0 (0x0)
The command that is failing comes from a combination of /etc/rc.subr and /usr/local/etc/rc.d/netdata and is:
/bin/sh -c /usr/sbin/daemon -c -f /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid
But, if I run:
/usr/sbin/daemon -c -f /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid
The netdata process starts successfully.
It looks to me to be the segment:
/bin/sh -c
that is failing. The man page for this command reads that:
-c string Read commands from string.
Not sure how to fix it, but I think this is the problem.
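A small, portable illustration of that -c behaviour, using echo in place of daemon so it can be run anywhere. With sh -c, only the first operand is the command string; later words become $0, $1 and so on rather than part of the command. The assumption here is that the Bad -c option message is sh itself objecting to the second -c it finds when the daemon command line is not passed as one quoted string; the sketch does not reproduce that message, only the argument splitting:

```shell
#!/bin/sh
# Unquoted: the command string is just "echo"; "hello" becomes $0 of the inner shell.
unquoted=$(sh -c echo hello)

# Quoted: the whole string "echo hello" is the command.
quoted=$(sh -c "echo hello")

printf 'unquoted: [%s]\n' "$unquoted"
printf 'quoted:   [%s]\n' "$quoted"
```

Applied to this issue, the quoted form would be /bin/sh -c "/usr/sbin/daemon -c -f /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid".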
A temporary fix for this issue so that users of 2.4.4-RELEASE-p3 can use Netdata is to ignore the Shellcmd step from the instructions. Instead, paste the following command:
/usr/sbin/daemon -c -f /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid
into the Shellcmd service instead of:
service netdata onestart
You'll need to run that command as root from a shell. I logged in through SSH, but it will work just fine if you connect with a VGA cable and use a keyboard.
See the attached screenshot.
The comment provided by @mmangione is the most insightful at this point, and I will look into this further tomorrow. The trouble I've had so far is my inability to reproduce this Bad -c option on any of my test pfSense virtual environments :/ I have to admit it's got me a bit stumped, but I will continue digging into this tomorrow.
This seems to be related to #4265 and #3469.
This seems to be the answer:
https://github.com/netdata/netdata/issues/3469#issuecomment-457759807
Can confirm @mmangione 's findings:
[2.4.4-RELEASE][root@pfSense.localdomain]/root: /bin/sh -c /usr/sbin/daemon -c -f /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid
Bad -c option
[2.4.4-RELEASE][root@pfSense.localdomain]/root:
[2.4.4-RELEASE][root@pfSense.localdomain]/root: /usr/sbin/daemon -c -f /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid
[2.4.4-RELEASE][root@pfSense.localdomain]/root: ps aux | grep netdata
netdata 10705 11.9 2.7 31552 27148 - SN 03:26 0:00.56 /usr/local/bin/python3.6 /usr/local/libexec/netdata/plugins.d/python.d.plugin 1
netdata 9356 0.0 1.7 24420 16664 - SN 03:26 0:00.09 /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid
netdata 11222 0.0 0.7 12728 7328 - SN 03:26 0:00.00 /usr/local/libexec/netdata/plugins.d/apps.plugin 1
netdata 60693 0.0 1.7 24932 17408 - IN 03:19 0:01.28 /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid
netdata 62477 0.0 3.6 40416 35572 - SN 03:19 0:01.47 /usr/local/bin/python3.6 /usr/local/libexec/netdata/plugins.d/python.d.plugin 1
netdata 62753 0.0 0.8 12728 7608 - SN 03:19 0:00.20 /usr/local/libexec/netdata/plugins.d/apps.plugin 1
root 45005 0.0 0.0 408 324 0 R+ 03:26 0:00.00 grep netdata
[2.4.4-RELEASE][root@pfSense.localdomain]/root: service netdata stop
Stopping netdata.
Waiting for PIDS: 9356 60693.
[2.4.4-RELEASE][root@pfSense.localdomain]/root: ps aux | grep netdata
root 50620 0.0 0.0 408 324 0 R+ 03:26 0:00.00 grep netdata
[2.4.4-RELEASE][root@pfSense.localdomain]/root: /bin/sh -c "/usr/sbin/daemon -c -f /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid"
[2.4.4-RELEASE][root@pfSense.localdomain]/root: ps aux | grep netdata
netdata 71261 15.8 2.7 31552 27136 - SN 03:26 0:00.56 /usr/local/bin/python3.6 /usr/local/libexec/netdata/plugins.d/python.d.plugin 1
netdata 68876 0.0 1.7 24420 16844 - SN 03:26 0:00.10 /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid
netdata 71413 0.0 0.7 12728 7340 - SN 03:26 0:00.00 /usr/local/libexec/netdata/plugins.d/apps.plugin 1
root 8425 0.0 0.0 408 324 0 R+ 03:26 0:00.00 grep netdata
[2.4.4-RELEASE][root@pfSense.localdomain]/root:
@mmangione Would we be able to schedule some time with you over a call or Slack/IRC/Messenger/Signal/whatever to help fix/resolve this? I'm still having a hard time reproducing this on any pfSense environment I can produce, but I'm convinced there are some subtle differences.
@prologic Yep. I can hop on any of the above. Whichever is most convenient. My evenings are usually pretty good. What does your schedule look like?
Pretty clear. I'll keep it open to tee up with you. I want to get this resolved! I'm @prologic on FreeNode and I'm hanging around on the #freebsd channel. Feel free to PRIVMSG me. Just give me a time window :)
@prologic does this mean that we can remove the 'Cannot reproduce' label? Please keep this ticket updated if there's anything new.
@mmangione Sorry, it looks like we missed each other; I'm always around on FreeNode as prologic. As I still cannot repro this issue exactly on any BSD/pfSense environment I can create here, I'm going to close this as "cannot reproduce". If this is still an issue for anyone, please either re-open or file a new issue. Please feel free to reach out to me on FreeNode any time or email me at james at netdata dot cloud and we can get to the bottom of this.
Funnily enough, I landed on this issue report before I experienced the issue. Now I am back to report that I have reproduced it. I would be happy to help try to get to the bottom of this.
[2.4.4-RELEASE][admin@pfSense.***]/root: uname -a
FreeBSD pfSense.*** 11.2-RELEASE-p10 FreeBSD 11.2-RELEASE-p10 #9 4a2bfdce133(RELENG_2_4_4): Wed May 15 18:54:42 EDT 2019 root@buildbot1-nyi.netgate.com:/build/ce-crossbuild-244/obj/amd64/ZfGpH5cd/build/ce-crossbuild-244/pfSense/tmp/FreeBSD-src/sys/pfSense amd64
[2.4.4-RELEASE][admin@pfSense.***]/root: pkg info | egrep 'pkgconf|bash|e2fsprogs-libuuid|libuv|Judy|python3|netdata'
Judy-1.0.5_2 General purpose dynamic array
bash-4.4.23 GNU Project's Bourne Again SHell
e2fsprogs-libuuid-1.44.4 UUID library from e2fsprogs package
libuv-1.21.0 Multi-platform support library with a focus on asynchronous I/O
netdata-1.13.0 Scalable distributed realtime performance and health monitoring
pkgconf-1.4.2,1 Utility to help to configure compiler and linker flags
python36-3.6.8_1 Interpreted object-oriented programming language
[2.4.4-RELEASE][admin@pfSense.***]/root: ls /etc/rc.conf.d/
[2.4.4-RELEASE][admin@pfSense.***]/root:
[2.4.4-RELEASE][admin@pfSense.***]/root: cat /usr/local/etc/rc.d/netdata
#!/bin/sh
#
# $FreeBSD: branches/2019Q2/net-mgmt/netdata/files/netdata.in 496470 2019-03-21 15:05:27Z mmokhi $
#
# PROVIDE: netdata
# REQUIRE: LOGIN
# KEYWORD: shutdown
#
# Add the following line to /etc/rc.conf to enable netdata:
# netdata_enable (bool): Set to "NO" by default.
# Set it to "YES" to enable netdata.
# netdata_args (str): Custom additional arguments to be passed
# to netdata (default empty).
#
. /etc/rc.subr
name="netdata"
rcvar=netdata_enable
load_rc_config $name
: ${netdata_enable="NO"}
: ${netdata_user="netdata"}
: ${netdata_pid="/var/db/netdata/${name}.pid"}
procname="/usr/local/sbin/${name}"
command="/usr/sbin/daemon"
command_args="-c -f ${procname} -u ${netdata_user} -P ${netdata_pid} ${netdata_args}"
required_files="/usr/local/etc/netdata/${name}.conf"
run_rc_command "$1"
Re-opening! Thanks! Let's work to get to the bottom of this!
As I still cannot reproduce and haven't heard from @eliezerlp in some days I'm re-closing this. There is on-going work to properly support FreeBSD in #8304 (and other issues) so we'll hopefully have better support for FreeBSD there and eventually pfSense (I hope); but I cannot repro this issue with any pfSense I can manage to install.
I'll just put this here... hope it helps with resolving this further, as I'm also experiencing the same issue:
root: /usr/local/sbin/netdata
/usr/local/lib/libjson-c.so.5: version JSONC_0.14 required by /usr/local/sbin/netdata not defined
and a temporary solution:
pkg delete -f json-c-0.14
pkg add http://pkg.freebsd.org/FreeBSD:11:amd64/latest/All/json-c-0.15_1.txz
then shellcmd:
/usr/sbin/daemon -c -f /usr/local/sbin/netdata -u netdata -P /var/db/netdata/netdata.pid
Hey @sniffski,
Thanks for the update and welcome to our community! I think Netdata should work with pfSense, pinging @Ferroin in case we need to add this solution to our docs.
Cheers
Bug report summary
Unable to install netdata version 1.19.0 on pfSense 2.4.4-RELEASE-p3.
I used the following commands, inspired by the recommended documentation:
I received the following mismatch warnings:
Because the documentation did not provide instructions on how to deal with the mismatch condition, I selected Y each time the warning appeared.
After updating the bind to setting in /usr/local/etc/netdata/netdata.conf, I tried to start the service:
OS / Environment
pfSense 2.4.4-RELEASE-p3
Expected behavior
The netdata service successfully starts, without warnings or errors, and listens on port 19999.