arrogantrabbit / freebsd_storj_installer

Installer script for Storj on FreeBSD

service status, etc. incorrect when running in jail ? #3

Closed bschwand closed 5 months ago

bschwand commented 5 months ago

Hi,

I just configured a jail running Storj with your scripts; they are great. The Storj docs should really mention them, and at least mention that they already build native FreeBSD binaries... The issue I have is that at the end of the script the daemons are started and seem to run fine (I can see the processes on the host). However,

~]$ sudo service -j storj  storagenode status
 storagenode is not running.

In the jail, I get the same result. Any idea what is going on?

arrogantrabbit commented 5 months ago

Hmm. I tried to start a new node today and it didn't work either. It seems like service start does nothing, while executing the rc scripts manually works.

The only difference is that it was a 13.3 jail, while the host system is 13.1.

I’ll debug and get back to you. Thank you for letting me know!

bschwand commented 5 months ago

ok, great that you can reproduce the issue. BTW I am running on 14.0, host and jail

arrogantrabbit commented 5 months ago

It seems the daemon utility has been broken from 13.3 onwards. No issues on 13.2 and before.

If you pass any argument to daemon that requires it to spawn a child process -- e.g. -r or -P, it silently fails to start the process, while still returning 0.

Demo:

This works on 13.2 and 13.3: daemon /bin/sh -c 'while true; do echo Test; sleep 10; done': daemon runs in the foreground.

This works on 13.2 but is broken on 13.3: daemon -r /bin/sh -c 'while true; do echo Test; sleep 10; done': daemon shall spawn the child and remain running to monitor it.

This is also broken on 13.3: daemon -P /var/run/test.pid /bin/sh -c 'while true; do echo Test; sleep 10; done': daemon shall spawn a child here too, and write its own pid to the file.

I'm not sure how to fix it -- it looks very much like a bug, but if it is some new behavior I will need to figure out how to launch a daemon properly in this new reality. sigh.

As a workaround, can you try doing it in a 13.2 jail? If you mount the same folders, the script will just set up the daemons; it's not going to create a new identity. Or you can just downgrade the existing jail with iocage upgrade <jailname> -r 13.2-RELEASE -- that's what I did -- and bam! it suddenly works.

arrogantrabbit commented 5 months ago

I've tried this in FreeBSD 14.0 and 13.3 VMs and the test above works. It seems to fail only when the jail and host OS versions are mismatched. Can you confirm that both your host and jail are the same version?

I'm going to try to test the whole node install on a 14.0 VM. It might take some time to generate the identity, etc.

bschwand commented 5 months ago

I can confirm I use 14.0 in both host and jail:

jail: FreeBSD storj 14.0-RELEASE-p5 FreeBSD 14.0-RELEASE-p5 #0: Tue Feb 13 23:37:36 UTC 2024
host: FreeBSD proliant21.bschwand.net 14.0-RELEASE-p5 FreeBSD 14.0-RELEASE-p5 #0: Tue Feb 13 23:37:36 UTC 2024

arrogantrabbit commented 5 months ago

Thank you. I'll install these versions specifically in the VM tonight and test.

bschwand commented 5 months ago

> As a workaround, can you try doing it in a 13.2 jail? If you mount the same folders, the script will just set up the daemons; it's not going to create a new identity. Or you can just downgrade the existing jail with iocage upgrade <jailname> -r 13.2-RELEASE -- that's what I did -- and bam! it suddenly works.

OK, I'll look into that. I am using appjail, not iocage, so I need to figure out how to do this without messing up my existing setup :-) But still, that is not a real bug fix ;-)

It seems to be more of an issue with jails and FreeBSD somehow, not the scripts?

arrogantrabbit commented 5 months ago

Seems to be working on 14.0-RELEASE-p6

root@:~ # git clone https://github.com/arrogantrabbit/freebsd_storj_installer.git
...
root@:~ # cp -rv freebsd_storj_installer/overlay/ /
...
root@:~ # service storagenode_updater onestart
Starting storagenode_updater.

root@:~ # service storagenode_updater status
storagenode_updater is running as pid 1188.

root@:~ # freebsd-version
14.0-RELEASE-p6

Maybe you installed a FreeBSD update and did not reboot, so your kernel and userland are out of sync?
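One way to check for the out-of-sync condition suggested above is to compare what freebsd-version reports for the installed kernel (-k), the running kernel (-r), and the userland (-u). The sketch below is an illustration, not something from the installer: the helper names release_of and check_versions are made up here, and the assumption is that a pending reboot shows up as -k differing from -r, while a host/jail mismatch shows up as -r and -u belonging to different releases. Patch levels of kernel and userland can legitimately differ, so only the release part is compared in the second check. Inside a jail, -k may not be meaningful if no kernel is installed there.

```sh
#!/bin/sh
# Hypothetical helpers for illustration; not part of freebsd_storj_installer.

# Strip a trailing patch level, e.g. 14.0-RELEASE-p6 -> 14.0-RELEASE.
release_of() {
    printf '%s\n' "${1%-p[0-9]*}"
}

# $1 = installed kernel, $2 = running kernel, $3 = userland.
# Prints a note for each inconsistency and returns non-zero if any was found.
check_versions() {
    cv_installed=$1
    cv_running=$2
    cv_userland=$3
    cv_status=0
    if [ "$cv_installed" != "$cv_running" ]; then
        echo "installed kernel $cv_installed differs from running kernel $cv_running (reboot pending?)"
        cv_status=1
    fi
    if [ "$(release_of "$cv_running")" != "$(release_of "$cv_userland")" ]; then
        echo "running kernel $cv_running and userland $cv_userland are different releases"
        cv_status=1
    fi
    return $cv_status
}

# Only query the system where freebsd-version exists.
if command -v freebsd-version >/dev/null 2>&1; then
    if check_versions "$(freebsd-version -k)" "$(freebsd-version -r)" "$(freebsd-version -u)"; then
        echo "kernel and userland versions look consistent"
    fi
fi
```

Running this once on the host and once inside the jail would show both kinds of mismatch discussed in this thread.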

bschwand commented 5 months ago

I rebooted many times. Let me do an update then; maybe something changed from p5 to p6.

arrogantrabbit commented 5 months ago

Can you try this in the command line?

daemon -P /var/run/test.pid  /bin/sh -c 'echo Stared; sleep 5;  echo Finished'

Does this work?

bschwand commented 5 months ago

root@storj:~/freebsd_storj_installer # daemon -P /var/run/test.pid  /bin/sh -c 'echo Stared; sleep 5;  echo Finished'
root@storj:~/freebsd_storj_installer # Stared
Finished

root@storj:~/freebsd_storj_installer #

arrogantrabbit commented 5 months ago

Great. So this is not a daemon issue that I was seeing. Let's go back to square one.

Do you have anything in the daemon logs? /var/log/storagenode.log and /var/log/storagenode_updater.log?

bschwand commented 5 months ago

Yes, the logs are growing; the daemons are running and filling the storage location. Like I said, it has been running since the end of the install.sh script, but I cannot stop it or get its status using the service command.

bschwand commented 5 months ago

ok so I updated host and rebooted and I am at 14.0 p6

root@storj:~ # service storagenode_updater onestart
Starting storagenode_updater.
root@storj:~ # service storagenode_updater status
storagenode_updater is not running.
root@storj:~ # ls /var/run/
/var/run/bhyve/ /var/run/ld-elf.so.hints /var/run/ppp/ /var/run/utx.active
/var/run/dhclient/ /var/run/ld-elf32.so.hints /var/run/storagenode_updater.pid /var/run/wpa_supplicant/
root@storj:~ # ls /var/run/storagenode_updater.pid
/var/run/storagenode_updater.pid
root@storj:~ # cat /var/run/storagenode_updater.pid
29639root@storj:~ #

I get the same results whether I run on the host or within the jail. At the beginning of this session, storagenode_updater is running:

[bruno@proliant21 ~]$ sudo service -j storj storagenode_updater status
storagenode_updater is not running.
[bruno@proliant21 ~]$ sudo service -j storj storagenode_updater stop
storagenode_updater not running? (check /var/run/storagenode_updater.pid).
[bruno@proliant21 ~]$ sudo service -j storj storagenode_updater start
Starting storagenode_updater.
daemon: process already running, pid: 29639
/usr/local/etc/rc.d/storagenode_updater: WARNING: failed to start storagenode_updater
[bruno@proliant21 ~]$ sudo kill -9 29639
[bruno@proliant21 ~]$ sudo service -j storj storagenode_updater start
Starting storagenode_updater.
[bruno@proliant21 ~]$ sudo service -j storj storagenode_updater status
storagenode_updater is not running.
[bruno@proliant21 ~]$ sudo service -j storj storagenode_updater stop
storagenode_updater not running? (check /var/run/storagenode_updater.pid).

Somehow stop and status fail to recognize that it is running.
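The manual steps in the session above (read the pidfile, then see whether that pid is alive) can be sketched as a small helper. This is an illustration only: pidfile_state is a made-up name, and it is not how the service command decides status, which to my understanding is more involved than a bare liveness test. It uses kill -0, which does not depend on ps, so it should be run as root to avoid permission errors being mistaken for a dead process.

```sh
#!/bin/sh
# Hypothetical helper for illustration; not part of freebsd_storj_installer.
# Prints "missing", "stale", or "running" for the given pidfile.
pidfile_state() {
    pf_path=$1
    if [ ! -r "$pf_path" ]; then
        echo missing
        return 0
    fi
    # read returns non-zero when the file has no trailing newline
    # (as the pidfile shown in this thread does), but still fills pf_pid.
    pf_pid=
    read -r pf_pid pf_rest < "$pf_path" || true
    case $pf_pid in
        ''|*[!0-9]*) echo stale; return 0 ;;
    esac
    # kill -0 sends no signal; it only tests whether the pid can be signalled.
    if kill -0 "$pf_pid" 2>/dev/null; then
        echo running
    else
        echo stale
    fi
}

# Example: report on the updater's pidfile from the thread.
echo "storagenode_updater: $(pidfile_state /var/run/storagenode_updater.pid)"
```

If this prints "running" while service ... status says "not running", the process and pidfile are fine and the problem lies in how the status check inspects processes in that environment.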

arrogantrabbit commented 5 months ago

Can you try from inside the jail? The "service -j .. " never worked for me

You can try enabling debugging with sysrc rc_debug=YES, and then try stopping the service.

Can you stop other services the same way? Is it just the node that does not get found and/or stopped?

bschwand commented 5 months ago

Inside or outside the jail, it gives me the same results: it always thinks nothing is running, and I can't get status or stop it.

bschwand commented 5 months ago

Argh... solved. It was all a jail config issue. Appjail does not enable this by default:

        exec.start: "/bin/sh /etc/rc"
        exec.stop: "/bin/sh /etc/rc.shutdown jail"
        mount.devfs

So ps did not work, and services did not start at boot either. With that in place it all works: status, stop, and run at jail start.

It seems there is some other issue left, as you mentioned in https://github.com/arrogantrabbit/freebsd_storj_installer/issues/3#issuecomment-2057481878, but I think it has nothing to do with your scripts.
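A quick sanity check for this kind of misconfiguration, meant to be run inside the jail, might look like the sketch below. It is only an illustration under two assumptions: that a missing devfs mount shows up as /dev/null not being a character device, and that a broken ps will fail to list even the current shell. The function name jail_sanity and its optional arguments (device path, ps command) are made up here; the arguments exist only so the check can be exercised with substitutes.

```sh
#!/bin/sh
# Hypothetical helper for illustration; not part of freebsd_storj_installer.
# $1 = device node to test (default /dev/null)
# $2 = ps command to use (default ps)
# Returns 0 when both checks pass, non-zero otherwise.
jail_sanity() {
    js_dev=${1:-/dev/null}
    js_ps=${2:-ps}
    js_rc=0
    if [ ! -c "$js_dev" ]; then
        echo "$js_dev is not a character device (devfs not mounted?)"
        js_rc=1
    fi
    # The current shell certainly exists, so if ps cannot list it,
    # ps itself is not working in this environment.
    if ! "$js_ps" -p "$$" >/dev/null 2>&1; then
        echo "ps cannot see the current shell (pid $$)"
        js_rc=1
    fi
    return $js_rc
}

if jail_sanity; then
    echo "devfs and ps look usable"
fi
```

If either message appears, service status and stop are unlikely to work for any service in that jail, not just the storagenode ones.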

Thanks a lot for the help and back and forth !

arrogantrabbit commented 5 months ago

Awesome!

Can you actually get the storagenode_updater to stop? It seems I can't -- the shell that is running my replacement updater script is ignoring SIGTERM (by design, actually), so I would need to add a _stop() function to the rc script and send SIGINT instead... But I guess it's irrelevant, as you can stop the whole jail instead, and everyone dies :)

bschwand commented 5 months ago

You are correct, the updater hangs on stop:

[bruno@proliant21 ~]$ sudo service -j storj storagenode_updater stop
Stopping storagenode_updater.
Waiting for PIDS: 74546

Well, it would be cleaner, especially if the jail runs more than just that service. Why use bash in that script? Would sh behave the same way? But... I see storagenode-updater uses /bin/sh??

arrogantrabbit commented 5 months ago

All shells behave that way, it's by design.

The updater uses sh because it is a hack I put together: the official Storj-provided one did not (still does not?) restart the service correctly after updating the binary.

I have the fix and will check it in later today.

bschwand commented 5 months ago

Yeah, I just researched that a bit. I did not know; interesting.

arrogantrabbit commented 5 months ago

Commits 7b8748a and 984c423 contain the fix for stopping the updater on FreeBSD 13.3+ and 14+.

bschwand commented 5 months ago

Updated and tested all combinations of start, stop, status, and restart: all good!