Closed cakrit closed 1 year ago
The initial todo is to improve the README to explain how users should start/stop/restart it.
After a system restart Netdata did run, though I'm not sure how. Is it just because the Netdata image has it by default? It should be documented in the README.
As the netdata.tar image that becomes the Netdata WSL distro is based on the Docker image, the MSI adds to the Windows registry a startup entry for start-netdata.cmd, which contains the command wsl -d netdata netdata. That command starts the distro and runs the netdata binary found in the path.
After the installation, I saw that directory C:\Program Files (x86)\Netdata includes some nice commands. I don't know if they're all correct, @Ferroin please check.
The scripts are used during installation, along with other commands embedded in the MSI installer. End users could run them, but that wasn't intended. I'll look into a better way to start/stop/restart than restarting Windows.
Also, I don't see the parent child relationship in the process tree between netdata and its children (spawn server and go.d.plugin): Why is that? I did see the process tree when I did the manual installation:
DESKTOP-C7OKV71:/usr/libexec/netdata/plugins.d# ps faux
PID   USER     TIME  COMMAND
    1 root      0:00 /init
    7 root      0:00 /init
   10 netdata   0:00 netdata
   12 netdata   0:00 /usr/sbin/netdata --special-spawn-server
  251 netdata   0:00 /usr/libexec/netdata/plugins.d/go.d.plugin 1
  338 root      0:00 /init
  339 root      0:00 /init
  340 root      0:00 -ash
  379 root      0:00 ps faux
Indeed, it doesn't show the parent/child relationship here either. I'm not sure why; maybe it's because the Netdata WSL distro is generated from the Docker image. Since start/stop/restart would be better handled for the distro as a whole, this shouldn't be a problem.
I'll improve the README, especially for the stop/restart procedures.
The initial todo is to improve the README to explain how users should start/stop/restart it. After a system restart Netdata did run, though I'm not sure how. Is it just because the Netdata image has it by default? It should be documented in the README.
As the netdata.tar image that becomes the Netdata WSL distro is based on the Docker image, the MSI adds to the Windows registry a startup entry for start-netdata.cmd, which contains the command wsl -d netdata netdata. That command starts the distro and runs the netdata binary found in the path.
Is there any way we could integrate that as a Windows Service instead of a startup entry? Not sure how feasible that is, but if possible that would let users manage it the way they are probably already used to managing such things.
If not, then this is probably the cleanest option for auto-starting the agent.
After the installation, I saw that directory C:\Program Files (x86)\Netdata includes some nice commands. I don't know if they're all correct, @Ferroin please check.
The scripts are used during installation, along with other commands embedded in the MSI installer. End users could run them, but that wasn't intended. I'll look into a better way to start/stop/restart than restarting Windows.
Unless we can get some way to integrate with Windows’ native service management as I suggested above, I would argue that just having scripts in here to handle starting/stopping/restarting the agent is fine, though we probably want to add a PATH entry for the directory if we’re doing that.
Also, I don't see the parent child relationship in the process tree between netdata and its children (spawn server and go.d.plugin): Why is that? I did see the process tree when I did the manual installation:
DESKTOP-C7OKV71:/usr/libexec/netdata/plugins.d# ps faux
PID   USER     TIME  COMMAND
    1 root      0:00 /init
    7 root      0:00 /init
   10 netdata   0:00 netdata
   12 netdata   0:00 /usr/sbin/netdata --special-spawn-server
  251 netdata   0:00 /usr/libexec/netdata/plugins.d/go.d.plugin 1
  338 root      0:00 /init
  339 root      0:00 /init
  340 root      0:00 -ash
  379 root      0:00 ps faux
Indeed, it doesn't show the parent/child relationship here either. I'm not sure why; maybe it's because the Netdata WSL distro is generated from the Docker image.
Correct, it’s because it’s based on the Docker image. BusyBox ps doesn’t display threads by default, and every child process of the main process other than the Go plugin should be a thread with the configuration we’re using here.
The README includes the restart command now. I'll see if the agent can be started as a Windows service.
This command causes Netdata to lose data, because it's not terminated properly. Upon receiving the kill signal, netdata stores the in-memory pages to the db and then exits; wsl -t prevents it from doing that.
We can run wsl -d netdata killall netdata and, after that completes, wsl -t netdata & wsl -d netdata netdata
I tried putting all three on the same line, but that doesn't work; it loses data again.
You can compare the behaviors when you have the UI open at localhost:19999.
What about the windows service? Is that possible? Would that help? @Ferroin I thought we were talking about some init.d commands, aren't those available?
This command causes Netdata to lose data, because it's not terminated properly. Upon receiving the kill signal, netdata stores the in-memory pages to the db and then exits; wsl -t prevents it from doing that.
We can run wsl -d netdata killall netdata and, after that completes, wsl -t netdata & wsl -d netdata netdata
I tried putting all three on the same line, but that doesn't work; it loses data again. You can compare the behaviors when you have the UI open at localhost:19999.
This doesn’t work because killall just sends the signals. Nothing in that case is actually waiting for the agent to exit.
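To make the point about signals concrete, here is a small sketch that runs in any POSIX shell. It does not touch netdata: the "agent" is a hypothetical stand-in that takes about two seconds to finish after SIGTERM, and the variable names are made up for illustration. It shows that sending the signal returns right away, while the process only exits later, so something has to wait explicitly.

```shell
#!/bin/sh
# Stand-in "agent" (hypothetical, not netdata): on SIGTERM it spends
# about 2 seconds "flushing" before it exits.
sh -c 'trap "sleep 2; exit 0" TERM; while :; do sleep 1; done' &
agent_pid=$!

sleep 1                      # give the stand-in time to install its trap
start=$(date +%s)
kill -TERM "$agent_pid"      # like killall: returns as soon as the signal is sent
signal_sent=$(date +%s)

wait "$agent_pid"            # the step that is missing when commands are just chained
finished=$(date +%s)

echo "kill returned after $((signal_sent - start))s"
echo "agent exited after $((finished - start))s"
```

Tearing down the environment between the kill line and the wait line is the equivalent of running wsl -t before the agent has finished writing.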
What about the windows service? Is that possible? Would that help?
We would still need a command to cleanly shut down the agent and wait for it to shut down.
@Ferroin I thought we were talking about some init.d commands, aren't those available?
Those wouldn’t be available if we’re using the Docker images as a base, though they would in theory solve the issue.
We just need something to wait (up to some configurable timeout, probably with a 60 second default timeout) for the agent to exit after telling it to exit.
I'll check how to restart without losing data, maybe mimicking the init.d daemon stop command that uses killproc.
For the Windows service, having WSL available before a user logs in seems to be quite complicated, as distros are installed per-user (see https://github.com/microsoft/WSL/issues/2979).
There are tools online that claim to help with this (https://github.com/peppy0510/wsl-service), which I could look at.
Creating a service account, installing WSL there, and having Task Scheduler run the start command for that user is something I'll try this weekend. So far it didn't work for the SYSTEM account.
Ok, let's see where you'll get with that.
I'll check how to restart without losing data, maybe mimicking the init.d daemon stop command that uses killproc.
The key thing here is largely just ensuring that the main netdata process has exited before tearing down the WSL environment. If all else fails, repeatedly calling pgrep netdata in the WSL environment until that returns no output (or some timeout is reached) should work, though it’s not exactly the ideal solution here.
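A minimal sketch of that polling fallback, meant to run inside the WSL distro. The function name wait_for_exit and its arguments are hypothetical, and the 60-second default is only the value floated earlier in this thread. It assumes pgrep is available in the image (BusyBox provides one) and has not been tested against the actual Netdata distro.

```shell
#!/bin/sh
# wait_for_exit NAME [TIMEOUT_SECONDS]
# Polls once per second until no process named NAME is left.
# Returns 0 once the process is gone, 1 if TIMEOUT_SECONDS elapses first.
# Assumption: pgrep exists; if it does not, the loop ends immediately.
wait_for_exit() {
    name=$1
    timeout=${2:-60}         # default taken from the discussion above
    waited=0
    while pgrep -x "$name" >/dev/null 2>&1; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1         # still running after the timeout
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Possible use before tearing down the distro (left commented out):
# killall netdata; wait_for_exit netdata 60
```

Only after this returns 0 would it be safe to run wsl -t netdata from the Windows side. What to do on a timeout (retry, force kill, or abort the restart) is left open here.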
The README has been updated with improved start/stop/restart commands. @cakrit, please test whether data is lost when restarting.
netdatacli doesn't work and spawns something that eats a lot of CPU. I just entered killall netdata in the README.
FYI we have an issue with WSL1, see https://github.com/netdata/netdata/issues/13933 if you're interested.