Closed mstormi closed 4 years ago
It won't help, as the image Raspberry Pi provides is still based on the old 4.19 kernel, so it will still upgrade to 5.4 when installing.
Essentially, we need a way to tell the system to reboot on kernel upgrade and then continue. A potential short-term patch until Raspberry Pi releases a new image is to always reboot after the following lines in first-boot.bash:
echo -n "$(timestamp) [openHABian] Updating repositories and upgrading installed packages... "
apt-get install --fix-broken --yes &> /dev/null
if [[ $(eval "$(apt-get --yes upgrade &> /dev/null)") -eq 100 ]]; then
  echo -n "CONTINUING... "
  dpkg --configure --pending &> /dev/null
  apt-get install --fix-broken --yes &> /dev/null
  if apt-get upgrade --yes &> /dev/null; then echo "OK"; else echo "FAILED"; fi
else
  echo "OK"
fi
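Side note on the `-eq 100` test in the snippet above: `$(...)` captures a command's standard output, not its exit status, and with `&> /dev/null` inside there is no output left to compare. A minimal sketch of one way to get at the exit status instead; the helper name `run_quiet_status` is made up for illustration and is not part of openHABian:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of openHABian): run a command with all
# output discarded and print its exit status so the caller can compare it.
run_quiet_status() {
  local rc=0
  "$@" &> /dev/null || rc=$?
  echo "$rc"
}

# The status is what gets compared, not the (empty) output of the command.
status="$(run_quiet_status false)"
if [[ "$status" -ne 0 ]]; then
  echo "command failed with status ${status}"
fi
```

In first-boot.bash the same idea would apply to the `apt-get --yes upgrade` call, assuming the intent is to branch on its exit status.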
OK, could you put that into master? It's broken anyway ATM. I need to leave now but will be back in ~2 hrs. I've just prepared a pre-release we can direct people to when they have problems: https://github.com/openhab/openhabian/releases/tag/v1.6-alpha
ok could you put that into master?
Done.
Also we should probably sync master and stable for the image release when we finish testing, that way it is the same across the board.
Yes. I'd think stable is currently affected as well.
And that way they are both in sync for the image release, and people won't be confused about why the git hash is different when we trigger the build from stable.
Trouble is, the first-boot.bash that has to reboot for the kernel upgrade to work is run off /boot, which is the image-contained version, not the current one. So we need to build another image for this to work, right?
Yes, but as I was saying that is why the image should be built off of a synced stable and master branch
I'm right now syncing stable to master
strange, it's building an oldish stable.... I'll never get to fully understand git. Always good for an adrenaline kick, that is. Deleted, re-created stable from master and pushed back, now it's the up to date version (hopefully :))
Ouch we have a problem. The box keeps rebooting over and over. Any spontaneous idea ? Just wanted to go to bed ;(
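One way such a loop could be avoided, as a rough sketch only: leave a marker file behind before rebooting and skip the reboot when the marker already exists. The function name `reboot_once` and the marker path in the usage comment are assumptions for illustration, not what first-boot.bash actually does.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: reboot at most once per marker file.
# Usage: reboot_once <marker-file> <reboot command...>
reboot_once() {
  local marker="$1"
  shift
  if [[ -e "$marker" ]]; then
    # A previous run already rebooted, so carry on with the installation.
    echo "marker present, continuing without reboot"
    return 0
  fi
  # Create the marker first so the next boot takes the branch above.
  touch "$marker" || return 1
  "$@"
}

# On a real box this might look like (marker path is an assumption):
#   reboot_once /boot/first-boot.rebooted reboot
```

The marker has to live somewhere that survives the reboot, which is why the usage comment points at /boot rather than a tmpfs.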
I've quickly put up #1066/#1067 but haven't been able to fully test them. I really need to get some sleep now.
... and I wonder what a reboot is doing to CI ...
No idea. CI appears to be handling it without errors, though. Your solution seems fine to me; however, I don't have time to test today.
Damn it, there are reports of the image failing to complete installation / start the dashboard, and it seems to be true: oh2 fails to run. No idea why so far. On my test box, journalctl -xu openhab2 says:
Jul 30 02:53:02 openhab karaf[11000]: org.osgi.framework.BundleException: Unable to acquire the state change lock for the module: osgi.identity; osgi.identity="org.eclipse.osgi"; ty
Jul 30 02:53:02 openhab karaf[11000]: at org.eclipse.osgi.container.Module.lockStateChange(Module.java:337)
Jul 30 02:53:02 openhab karaf[11000]: at org.eclipse.osgi.internal.framework.EquinoxBundle$SystemBundle$EquinoxSystemModule.asyncStop(EquinoxBundle.java:156)
Jul 30 02:53:02 openhab karaf[11000]: at org.eclipse.osgi.internal.framework.EquinoxBundle$SystemBundle.stop(EquinoxBundle.java:262)
Jul 30 02:53:02 openhab karaf[11000]: at org.eclipse.osgi.internal.framework.EquinoxBundle$SystemBundle.stop(EquinoxBundle.java:267)
Jul 30 02:53:02 openhab karaf[11000]: at org.eclipse.osgi.launch.Equinox.stop(Equinox.java:123)
Jul 30 02:53:02 openhab karaf[11000]: at org.apache.karaf.main.Main$2.run(Main.java:354)
Jul 30 02:53:02 openhab karaf[11000]: Caused by: java.util.concurrent.TimeoutException: Timeout after waiting 5 seconds to acquire the lock.
Jul 30 02:53:02 openhab karaf[11000]: at org.eclipse.osgi.container.Module.lockStateChange(Module.java:334)
Jul 30 02:53:02 openhab karaf[11000]: ... 5 more
Jul 30 02:53:10 openhab systemd[1]: openhab2.service: Succeeded.
ZRAM down, too. But both just needed a systemctl start ...
I'll check if the just-merged additional reboot will fix that
is #1069 a duplicate? will therefore #1070 fix this?
IDK, I think we should merge #1039, #1070, and #1072 ASAP and finalize the image as I believe those contain all of the remaining breaking changes to the final image.
After testbuild install: why is the new ZRAM part missing?
[14:34:19] root@openhab:/opt/openhabian/functions# cat /etc/systemd/system/openhab2.service.d/override.conf
[Service]
ExecStartPre=-/bin/bash -c '/usr/bin/find ${OPENHAB_CONF} -name "*.rules" -exec /usr/bin/rename.ul .rules .x {} \\;'
ExecStartPost=-/bin/sleep 120
ExecStartPost=-/bin/bash -c '/usr/bin/find ${OPENHAB_CONF} -name "*.x" -exec /usr/bin/rename.ul .x .rules {} \\;'
TimeoutStartSec=240
Here's the install log. I just don't spot the reason right away, probably cause I can't wait to get hold of a cold beer (having my birthday party this evening). So if you do, please fix.
+ delayed_rules yes
+ openhab_is_installed
+ dpkg -s openhab2
+ return 0
+ local targetDir
+ targetDir=/etc/systemd/system/openhab2.service.d
+ [[ yes == \y\e\s ]]
++ timestamp
++ date +%F_%T_%Z
+ echo -n '2020-08-02_15:03:06_CEST [openHABian] Adding delay on loading openHAB rules... '
2020-08-02_15:03:06_CEST [openHABian] Adding delay on loading openHAB rules... + cond_redirect mkdir -p /etc/systemd/system/openhab2.service.d
+ [[ -n '' ]]
+ echo -e '\n\033[90;01m$ mkdir -p /etc/systemd/system/openhab2.service.d \033[39;49;00m'
$ mkdir -p /etc/systemd/system/openhab2.service.d
+ mkdir -p /etc/systemd/system/openhab2.service.d
+ return 0
+ cond_redirect rm -f /etc/systemd/system/openhab2.service.d/override.conf
+ [[ -n '' ]]
+ echo -e '\n\033[90;01m$ rm -f /etc/systemd/system/openhab2.service.d/override.conf \033[39;49;00m'
$ rm -f /etc/systemd/system/openhab2.service.d/override.conf
+ rm -f /etc/systemd/system/openhab2.service.d/override.conf
+ return 0
+ cond_redirect cp /opt/openhabian/includes/systemd-override.conf /etc/systemd/system/openhab2.service.d/override.conf
+ [[ -n '' ]]
+ echo -e '\n\033[90;01m$ cp /opt/openhabian/includes/systemd-override.conf /etc/systemd/system/openhab2.service.d/override.conf \033[39;49;00m'
$ cp /opt/openhabian/includes/systemd-override.conf /etc/systemd/system/openhab2.service.d/override.conf
+ cp /opt/openhabian/includes/systemd-override.conf /etc/systemd/system/openhab2.service.d/override.conf
+ return 0
+ echo OK
OK
+ cond_redirect systemctl -q daemon-reload
+ cond_redirect systemctl restart openhab2.service
+ [[ -n '' ]]
+ echo -e '\n\033[90;01m$ systemctl restart openhab2.service \033[39;49;00m'
$ systemctl restart openhab2.service
+ systemctl restart openhab2.service
+ return 0
+ dashboard_add_tile openhabiandocs
Yes, that is because I only included the FIND3 branch changes not the others, sorry for the confusion.
Tested build #207 (to include the ZRAM patches). The service files look fine but ZRAM fails to start. Or, to be more precise, according to journalctl -xu zram-config it is started and then stopped again. If manually started again it's fine, but why is it stopped during unattended installation?
Not sure. Is this the result after the final reboot during install? It shouldn't be getting stopped: it is one of the last things installed and nothing after it calls for a reinstall.
test with latest build was ok
given there were no more unexplained problems on latest build (except #1078), I'm gonna close this now. Reopen in case of new or recurring issues.
At first I thought it was a problem with mosquitto not running after a reboot. While debugging, I came to think it is ZRAM related, because that fails to install.
But I believe the problem is larger: Apparently modprobe zram fails because the kernel modules (5.4.51) do not match the kernel (4.19.57). The kernel that's in the old image is too old!
That's why installing zram fails unless we reboot. But that results in a need to install zram manually.
I believe we should hurry up with the new image as it probably will not happen there.
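To illustrate the mismatch, here is a minimal sketch of a check that could run before `modprobe zram`: it only tests whether a modules directory exists for a given kernel release under /lib/modules. The function name `kernel_has_modules` is made up for illustration and is not part of openHABian.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a modules directory exists for the given
# kernel release. The second argument overrides the modules root (handy for
# testing); it defaults to /lib/modules.
kernel_has_modules() {
  local release="$1"
  local modules_root="${2:-/lib/modules}"
  [[ -d "${modules_root}/${release}" ]]
}

if kernel_has_modules "$(uname -r)"; then
  echo "modules for running kernel $(uname -r) are present"
else
  echo "no modules for running kernel $(uname -r); a reboot into the upgraded kernel is likely needed before modprobe"
fi
```

In the situation described above, the running kernel would be 4.19.57 while only the 5.4.51 modules are installed, so the check would take the second branch until the box has rebooted.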