openhab / openhab-linuxpkg

Repo for Linux packages
Eclipse Public License 2.0

Reinstall fails due to existing user #108

Closed ThomDietrich closed 4 years ago

ThomDietrich commented 6 years ago

The above is just my first theory. I've executed sudo apt purge openhab2. An error was shown:

Purging configuration files for openhab2 (2.2.0-1) ...
userdel: user openhab is currently used by process 597
/usr/sbin/deluser: `/usr/sbin/userdel openhab' returned error code 8. Exiting.
/usr/sbin/delgroup: `openhab' still has `openhab' as their primary group!

Next I reinstalled openhab2 but the service was not able to start:

Jan 16 11:19:01 openHABianPiFuchsbau systemd[1]: Started openHAB 2 - empowering the smart home.
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: !SESSION 2018-01-16 11:19:04.204 -----------------------------------------------
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: eclipse.buildId=unknown
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: java.version=1.8.0_152
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: java.vendor=Azul Systems, Inc.
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: BootLoader constants: OS=linux, ARCH=arm, WS=gtk, NL=en_US
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: !ENTRY org.eclipse.osgi 4 0 2018-01-16 11:19:04.205
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: !MESSAGE Error reading configuration: Unable to create lock manager.
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: !STACK 0
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: java.io.IOException: Unable to create lock manager.
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: #011at org.eclipse.osgi.storagemanager.StorageManager.open(StorageManager.java:698)
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: #011at org.eclipse.osgi.storage.Storage.getChildStorageManager(Storage.java:1792)
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: #011at org.eclipse.osgi.storage.Storage.getInfoInputStream(Storage.java:1809)
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: #011at org.eclipse.osgi.storage.Storage.<init>(Storage.java:129)
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: #011at org.eclipse.osgi.storage.Storage.createStorage(Storage.java:88)
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: #011at org.eclipse.osgi.internal.framework.EquinoxContainer.<init>(EquinoxContainer.java:66)
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: #011at org.eclipse.osgi.launch.Equinox.<init>(Equinox.java:31)
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: #011at org.eclipse.osgi.launch.EquinoxFactory.newFramework(EquinoxFactory.java:24)
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: #011at org.apache.karaf.main.Main.launch(Main.java:256)
Jan 16 11:19:04 openHABianPiFuchsbau karaf[23673]: #011at org.apache.karaf.main.Main.main(Main.java:179)
Jan 16 11:19:04 openHABianPiFuchsbau systemd[1]: openhab2.service: Main process exited, code=exited, status=255/n/a
Jan 16 11:19:06 openHABianPiFuchsbau karaf[23806]: /var/lib/openhab2/tmp/port shutdown port file doesn't exist. The container is not running.
Jan 16 11:19:06 openHABianPiFuchsbau systemd[1]: openhab2.service: Control process exited, code=exited status=3
Jan 16 11:19:06 openHABianPiFuchsbau systemd[1]: Stopped openHAB 2 - empowering the smart home.
Jan 16 11:19:06 openHABianPiFuchsbau systemd[1]: openhab2.service: Unit entered failed state.
Jan 16 11:19:06 openHABianPiFuchsbau systemd[1]: openhab2.service: Failed with result 'exit-code'.

The problem was solved by another purge followed by a manual removal of the user (after stopping the other service under the user), followed by a successful reinstall.

Related to https://github.com/openhab/openhabian/issues/285

BClark09 commented 6 years ago

Interesting, do you know which process 597 was? I'm assuming it wasn't openHAB, as that process is killed if it can't shut down safely during uninstall.

Might it make sense to stop any processes owned by openhab during a purge?

ThomDietrich commented 6 years ago

It's the frontail log viewer. In openHABian this service uses the openhab user. I can now see that this decision was a mistake... Still the error shouldn't occur. No, I don't think we can expect the user to only be used for openhab itself. It might be better to deal with it. I.e.:

As for step two: Why did it fail actually? I wasn't able to figure that out.

BClark09 commented 6 years ago

Yeah, definitely shouldn't fail on reinstall in this case. Will have a play to replicate and try to fix. Thanks!

BClark09 commented 6 years ago

Hey @ThomDietrich, do you think it would be a good idea to kill all processes belonging to openhab?

pkill -U openhab
userdel openhab

will return no errors.

ThomDietrich commented 6 years ago

Hey Ben, I'm not sure where you are headed. Didn't we agree to not force-delete the user?

BClark09 commented 6 years ago

Sorry, I may have forgotten what we had discussed, and came at it from the same (incorrect) angle. 😖

"No, I don't think we can expect the user to only be used for openhab itself"

Yes, I agree (like I did last time...) 😄

BClark09 commented 6 years ago

I'm currently trying to work out why the reinstall only partially completed. The user is created in the preinst stage, and each of these scripts has the -e flag set. It looks like they already skip adding the user if it already exists.
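For context on the -e flag mentioned above, here is a minimal sketch of how errexit behaves in a POSIX shell. The helper name run_strict is hypothetical and used only for illustration; it is not taken from the package scripts.

```shell
#!/bin/sh
# Sketch only: run_strict is a hypothetical helper, not part of the package.
# It runs a snippet in a child shell with -e (errexit) enabled.
run_strict() {
  sh -e -c "$1"
}

# A failing command ends the child script immediately,
# so "reached" is never printed and the status is non-zero.
run_strict 'false; echo reached' || echo "aborted with status $?"

# Appending "|| true" masks the failure, so the script carries on.
run_strict 'false || true; echo reached'
```

This is presumably why a command that may legitimately fail in a script running under -e is usually followed by `|| true`.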

ThomDietrich commented 6 years ago

Maybe you could check if the user exists prior to creating it? Also, I'd suggest adding warning messages on both sides:

User openhab already exists on the system.

User openhab couldn't be deleted. You may want to do that manually...
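A rough sketch of what such checks and warnings could look like on the install and purge sides. The function names are hypothetical, the adduser/deluser options are assumptions about how the account would be managed on a Debian-style system, and `id -u` is used here as a portable existence check. None of this is the package's actual code.

```shell
#!/bin/sh
# Sketch only: OH_USER's default and all helper names are assumptions.
OH_USER="${OH_USER:-openhab}"

# Succeeds if the account is known to the system.
user_exists() {
  id -u "$1" > /dev/null 2>&1
}

# Install side: create the account only if it is missing, otherwise warn.
ensure_user() {
  if user_exists "$1"; then
    echo "User $1 already exists on the system."
  else
    adduser --system --quiet --group --no-create-home "$1"
  fi
}

# Purge side: try to delete the account; warn instead of failing
# if deletion is not possible (e.g. a process still runs as that user).
remove_user() {
  if user_exists "$1" && ! deluser --system --quiet "$1"; then
    echo "User $1 couldn't be deleted. You may want to do that manually..."
  fi
  return 0
}
```

A maintainer script would then call something like `ensure_user "$OH_USER"` during install and `remove_user "$OH_USER"` during purge. Both paths leave the decision about a shared account to the administrator rather than forcing it.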

ThomDietrich commented 6 years ago

Hey @BClark09, I'm currently going through all my openHABian issues and found the one related to this issue. What would you suggest we do here?

Best! Hope you are doing fine ;)

BClark09 commented 6 years ago

Hey @ThomDietrich! Hope you are well too!

I've not had enough time to look at this properly yet. I'm still a bit confused as to why openHAB fails to reinstall properly. I couldn't replicate the exact error above, but I did run into problems. The preinst script already checks for an existing user.

My suggestion so far would be to make a simple check (and as you suggested, warn) for PIDs belonging to the user/group to see if that has any effect on the reinstall. The postrm script will become something like:

  purge)
    removeOpenHABInit
    # Only delete the account if no process is still running under it.
    if [ -z "$(ps -o pid= -u "$OH_USER")" ]; then
      if getent passwd "$OH_USER" > /dev/null 2>&1; then
        deluser --system --quiet "$OH_USER" || true
      fi
      if getent group "$OH_GROUP" > /dev/null 2>&1; then
        delgroup --system --quiet "$OH_GROUP" || true
      fi
    else
      echo "The '$OH_USER' user has other active processes and cannot be deleted."
      echo "  You may want to delete the user and group manually."
    fi
    # Remove the package's data, logs and configuration regardless.
    rm -rf /var/log/openhab2
    rm -rf /var/lib/openhab2
    rm -rf /usr/share/openhab2
    rm -rf /etc/openhab2
    exit 0
    ;;

mstormi commented 4 years ago

Is this still an issue after all?

It's also linked to https://github.com/openhab/openhabian/issues/285

BClark09 commented 4 years ago

Finally did something about this. So sorry for the delay 😞!