openhab / openhab-linuxpkg

Repo for Linux packages
Eclipse Public License 2.0
18 stars 33 forks source link

Systemd service should use /bin/karaf instead of start.sh #72

Closed BClark09 closed 6 years ago

BClark09 commented 7 years ago

Closes #70

Signed-off-by: Ben Clark ben@benjyc.uk

BClark09 commented 7 years ago

@ThomDietrich if it helps for testing, the above changes are available as a full pacakge in the linuxpkg-testing repo:

echo 'deb https://openhab.jfrog.io/openhab/linuxpkg-testing unstable main' | sudo tee /etc/apt/sources.list.d/openhab2.list
BClark09 commented 7 years ago
OPENHAB_USERDATA=WHEREAMI

Before PR: image

After PR: image

Seems to have gone well!

ThomDietrich commented 7 years ago

Excellent! I'm sadly not able to do any tests today. I'm curious how the error would look like if there is no java runtime available

BClark09 commented 7 years ago

There's a weird issue with this:

When a program managed by systemd sources a script, the stout buffer from that script seems delayed. In this case, when karaf calls our setenv script the echo 'Java version higher than 1.8...' does not get the chance to print when exit 1; follows.

To see this error message in systemd, we would have to edit 'setenv' to output with:

echo "Error message" && /bin/true

Which is otherwise completely redundant.

ThomDietrich commented 6 years ago

@BClark09 If it is like you suggested and the buffered stdout is cut off too early at the exit, did you find a discussion on the issue? This one looks alright: http://systemd-devel.freedesktop.narkive.com/suwZf7s7/unbuffered-stderr-for-my-systemd-service

Your proposed solution does work because the otherwise useless && /bin/true will delay following commands? Is there a cleaner and more intuitive way to solve this? sd_notify would be an option for systemd. Another option might be sleep but that's hack'y...

BClark09 commented 6 years ago

There are a few scattered around (this for example), I think this is the systemd bug to watch for this issue though.

BClark09 commented 6 years ago

I think this is okay to merge now. I'll wait a bit for feedback in the above thread to see if the ulimit restriction has been adjusted.

BClark09 commented 6 years ago

@kaikreuzer, @theoweiss, just an FYI: I noticed there was some discussion at https://github.com/openhab/openhab-distro/pull/138 about how to start using systemd. I have changed this to use runtime/bin/karaf daemon so that systemd recognises java as the main process to follow. It is also the recommended way using the karaf service template by the looks of things.

Is there any reason, or can you remember any reason for this being a bad idea? Tests on Debian and Fedora have worked well, so will merge this PR soon.

kaikreuzer commented 6 years ago

I am not aware of reason against it, but @theoweiss is indeed probably the much better person to ask about it :-)

ThomDietrich commented 6 years ago

@BClark09 Instead of the && /bin/true trick we could use systemd-notify/sd_notify. Did you look into them?

BClark09 commented 6 years ago

I couldn't find a way of incorporating it reliably, it seems to suffer from the same race condition.

Also, should we really add platform specific Linux commands to openHAB's start scripts?

ThomDietrich commented 6 years ago

I thought about it and couldn't see a reason why not. Of course one would need to properly check for systemd beforehand.

it seems to suffer from the same race condition

Well nice...

BClark09 commented 6 years ago

Actually, I may have been wrong in that, was looking in the wrong place. The systemd service needed a bit of tweaking to be of "notify" type.

image

Unfortunately, this still isn't a solution, openHAB would need an implementation of sd_notify to pass down that the service has started successfully. Otherwise, the service start command would keep waiting for one and timeout.

ThomDietrich commented 6 years ago

You mean this would need to happen inside openHAB or Karaf, not inside the bash script!?

BClark09 commented 6 years ago

If you're using sd_notify, then you'll need to pass down a message that the service has started successfully, which should only happen inside the java process started by Karaf if you want to certain that everything is working correctly.

BClark09 commented 6 years ago

To summarise:

I'd suggest we merge now and I can make a PR in the distro if using && /bin/true is an acceptable workaround.

theoweiss commented 6 years ago

Sorry guys for not participating in this discussion but remodelling our new house and working for my paid job is still eating my whole time. May be I'm back in late autumn. @BClark09 is doing a great job here!!

BClark09 commented 6 years ago

No problem @theoweiss! Hope it's all going well!