openhab / openhab-linuxpkg

Repo for Linux packages
Eclipse Public License 2.0

Fix sysVinit #26

Closed theoweiss closed 7 years ago

theoweiss commented 7 years ago

Hopefully fixes #25. @BClark09, could you do some testing? I've tested it on Debian 7 and it doesn't look too bad so far. There are some ugly workarounds, and we have to fix previous installations, which means we have to rely on postinst.

BClark09 commented 7 years ago

Thanks for finding the problem @theoweiss. openHAB runs on upgrade, but if openHAB's java process is killed then the Karaf server gets stuck. I wasn't able to log on to the openHAB console until I looked up the Karaf PID with:

karaf_pid=$(ps aux -u "$OH_USER" --sort=start_time 2>/dev/null | grep 'openhab.*karaf' | grep -v grep | awk '{print $2}' | tail -1)

Then I killed the process (-9 was necessary):

sudo kill -9 $karaf_pid

BClark09 commented 7 years ago

I've just tested adding the following function:

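findkarafpid() {
  # Default to the openhab user unless USER_AND_GROUP ("user:group") is set.
  OH_USER=openhab
  if [ -n "${USER_AND_GROUP}" ]; then
    OH_USER=$(echo "${USER_AND_GROUP}" | cut -d ":" -f 1)
  fi
  # Print the PID of the most recently started process matching openhab.*karaf.
  ps aux -u "$OH_USER" --sort=start_time 2>/dev/null | grep 'openhab.*karaf' | grep -v grep | awk '{print $2}' | tail -1
}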

Then changing the kill loop to:

karafpid=`findkarafpid`
if [ $timeout -eq 20 ]; then
   # finally kill the process if timeout is reached
   echo "killing the openHAB service with pid $pid"
   kill -9 $pid
   kill -9 $karafpid
   break
fi

theoweiss commented 7 years ago

It's not obvious to me why we need to kill two PIDs. Make sure to run gradlew clean for each build, or remove build/distribution/*.deb. I've noticed that the build dependencies are incomplete: changes to control files do not trigger a new build of the deb.

BClark09 commented 7 years ago

systemd works well because it registers two PIDs with the service and shuts them down appropriately: one for java and one for the Karaf server, both owned by openHAB.

Main PID: 26235 (karaf)
   CGroup: /system.slice/openhab2.service
           26235 /bin/bash /usr/share/openhab2/runtime/bin/karaf server
           26374 /usr/bin/java -Dopenhab.home=/usr/share/openhab2 -Dopenhab [...]

The init.d scripts aren't aware of this second process, since it has essentially forked after it has assumed a successful start. If you force kill the java app, there's a chance that the Karaf server gets stuck. So when you kill one, you should also kill the other.

I'll check again when I get home, making sure I clean the previous builds first.
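
To make the "kill one, kill the other" idea concrete, here is a minimal sketch of how a stop routine could treat both PIDs the same way: ask each to exit, poll until they are gone, and force kill only what is left after a timeout. This is an illustration under assumptions, not the code in this PR; `stop_pids`, `limit` and `waited` are hypothetical names.

```shell
#!/bin/sh
# Sketch only: stop_pids is a hypothetical helper, not part of openhab-linuxpkg.
# Usage: stop_pids <timeout-seconds> <pid> [<pid> ...]
stop_pids() {
  limit=$1
  shift
  # Ask every process to exit cleanly first.
  for p in "$@"; do
    kill -TERM "$p" 2>/dev/null
  done
  waited=0
  while [ "$waited" -lt "$limit" ]; do
    alive=0
    for p in "$@"; do
      # kill -0 sends no signal; it only checks that the PID still exists.
      kill -0 "$p" 2>/dev/null && alive=1
    done
    # Everything exited within the timeout.
    [ "$alive" -eq 0 ] && return 0
    sleep 1
    waited=$((waited + 1))
  done
  # Timeout reached: force kill whatever is left.
  for p in "$@"; do
    kill -KILL "$p" 2>/dev/null
  done
  return 1
}
```

In the init script this would be called with something like `stop_pids 20 "$pid" "$karafpid"`, where the two variables are the ones from the kill loop earlier in this thread. A non-zero return means the timeout was hit and a force kill was attempted.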

BClark09 commented 7 years ago

Also, I realise that both processes contain the openhab.*karaf pattern, so it should be sufficient to:

for pid in $(ps aux -u "$OH_USER" 2>/dev/null | grep openhab.*karaf | grep -v grep | awk '{print $2}'); do kill -9 $pid; done

BClark09 commented 7 years ago

Hi @theoweiss, my apologies. I built a new Ubuntu 14 virtual machine, installed RC1 and then 2.0.0, and no longer experienced the problem above in 5 attempts. The problem must have been local, and I think we're good to merge. Sorry for the delay!

theoweiss commented 7 years ago

Very good. I'm back from travelling. I will merge this PR and then do the bintray release.