MichaIng / DietPi

Lightweight justice for your single-board computer!
https://dietpi.com/
GNU General Public License v2.0

Updated an Odroid C4 to V8.11.2 - Snap package failed and now won't reinstall #5937

Closed MDAR closed 1 year ago

MDAR commented 1 year ago

Creating a bug report/issue

Required Information

Additional Information (if applicable)

This is my next stage in testing, when I have access to a spare Odroid C4.

Steps to reproduce

apt install snapd
snap install core18
snap install core
snap install velbus-tcp

Expected behaviour

Actual behaviour

documented here - https://github.com/velbus/velbus-tcp-snap/issues/11

consistently, I see this error

snap install velbus-tcp
error: cannot perform the following tasks:
Run install hook of "velbus-tcp" snap if present (run hook "install": cannot perform operation: mount --bind /snap/core18/current/etc/nsswitch.conf /tmp/snap.rootfs_h60Z9u/etc/nsswitch.conf: Permission denied)

Extra details

At one point I thought AppArmor was to blame. The best I can find on that is a 2017 bug report saying it has something to do with the /root folder: the report suggests that the install fails if /root is a symlink. (That is not the case here.) A later comment on the same page, from around 2021, states that the issue still exists.

However as everything was perfectly okay until I updated to DietPi 8.11.2 I have to assume I am seeing something new.

MichaIng commented 1 year ago

Many thanks for your report.

/tmp has mode 1777 on nearly every OS, and also on DietPi, i.e. every user has write access, but only the owner of a file/dir has permission to delete it.
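The mode can be confirmed quickly on the affected system. A minimal sketch, assuming a `stat` that supports `-c` (GNU coreutils or BusyBox); the function name `check_sticky_dir` is made up for illustration:

```shell
#!/bin/sh
# Report whether a directory (default: /tmp) has the usual mode 1777,
# i.e. world-writable with the sticky bit set.
# check_sticky_dir is a hypothetical helper name, not part of DietPi or snapd.
check_sticky_dir() {
    dir="${1:-/tmp}"
    # stat -c '%a' prints the permission bits in octal, e.g. 1777
    mode="$(stat -c '%a' "$dir")" || return 2
    if [ "$mode" = "1777" ]; then
        echo "$dir: mode $mode (ok)"
    else
        echo "$dir: mode $mode (expected 1777)"
        return 1
    fi
}
```

Calling `check_sticky_dir /tmp` prints the mode and returns non-zero if it differs from 1777.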

Since /tmp/snap.rootfs_* is created by the snap daemon, it is outside of our control and not affected by how DietPi or /tmp is set up. However, it's probably not the directory permissions.

Did you try it with AppArmor disabled/removed, to rule it out?
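Before removing anything, it may help to see whether the AppArmor kernel module is active at all. A rough sketch, assuming the kernel exposes the usual `/sys/module/apparmor/parameters/enabled` file (it reads `Y` when AppArmor is enabled); `apparmor_state` is a hypothetical helper name:

```shell
#!/bin/sh
# Print "enabled" if the AppArmor kernel module reports itself as enabled,
# otherwise "disabled". The path can be overridden for testing.
# apparmor_state is a hypothetical helper name.
apparmor_state() {
    param="${1:-/sys/module/apparmor/parameters/enabled}"
    if [ -r "$param" ] && [ "$(cat "$param")" = "Y" ]; then
        echo enabled
    else
        echo disabled
    fi
}
```

This only reports the kernel side. Whether the userland tools and profiles are installed is a separate question.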

Since you didn't receive feedback on your issue at Velbus, where/when did they "suggest that it is an OS issue"? What would help is some insight into how these directories are set up, which parts belong to Velbus or core18 and which parts snap is entirely responsible for, to get an idea of why it could fail. I have no experience with any of them.

You run an old Odroid C4 image with a legacy vendor kernel. This alone might already be the reason, since at least with Docker and Kubernetes we know that pre-v5.x kernel versions fail without applying certain workarounds, at least since Bullseye. As general steps for container host systems, try the following:

MDAR commented 1 year ago

<You run an old Odroid C4 image with a legacy vendor kernel. This alone might already be the reason,>

So what I read there was...

"Maybe you can try updating to the latest Bullseye and see if that resolves it."

This buster machine is my own and I haven't touched it for a long time. (If it ain't bust, don't fix it....)

I'll update it and see what happens.

Worst case, I'll put a fresh image on there and restore the configuration files.

Thanks a million

MichaIng commented 1 year ago

This is a good idea in general. systemd.unified_cgroup_hierarchy=0 however will then be required for sure. Follow our guide, to avoid pitfalls: https://dietpi.com/blog/?p=811
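To see whether that argument is already on the kernel command line, and which cgroup hierarchy is actually mounted, something like the following could be used. This is only a sketch: the function names are made up, and it assumes the usual `/proc/cmdline` and `/sys/fs/cgroup` locations.

```shell
#!/bin/sh
# Return success if systemd.unified_cgroup_hierarchy=0 is present on the
# kernel command line. The file path can be overridden for testing.
# has_legacy_cgroup_arg is a hypothetical helper name.
has_legacy_cgroup_arg() {
    cmdline="${1:-/proc/cmdline}"
    # Split the command line into one argument per line and match exactly
    tr ' ' '\n' < "$cmdline" | grep -qx 'systemd.unified_cgroup_hierarchy=0'
}

# Print the filesystem type mounted at /sys/fs/cgroup:
# "cgroup2fs" means the unified (v2) hierarchy, "tmpfs" the legacy/hybrid one.
cgroup_fs_type() {
    stat -fc %T "${1:-/sys/fs/cgroup}"
}
```

For example, `has_legacy_cgroup_arg && echo "legacy cgroups requested"` followed by `cgroup_fs_type` shows both the requested and the effective setup.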

MDAR commented 1 year ago

<This is a good idea in general. systemd.unified_cgroup_hierarchy=0 however will then be required for sure. Follow our guide, to avoid pitfalls: https://dietpi.com/blog/?p=811>

Unfortunately upgrading to Bullseye hasn't solved it.

I'll dig deeper.

On a side note, I had QLCplus running with a dummy HDMI plug, with the TigerVNC scraper serving it out for remote access.

After the upgrade, every VNC client was rejected because the VNC port wasn't open.

I found a comment in a forum saying that the newer version of the TigerVNC scraper now defaults to localhost only:

https://askubuntu.com/questions/1353178/x0vncserver-unable-to-open-display-0

Adding

-localhost no

resolved this issue.

I now have everything working again... other than the velbus-tcp snap not loading.

MichaIng commented 1 year ago

Note that the kernel hasn't changed. If you have a spare SD card, you could try with mainline kernel image: https://dietpi.com/downloads/images/testing/

This can sadly not be upgraded in place, since the new bootloader requires a different partitioning.
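Whether a given system is still on a pre-v5.x kernel can be checked from the release string. A minimal sketch; `kernel_is_pre_v5` is a hypothetical helper, and it only looks at the major version number:

```shell
#!/bin/sh
# Return success if the kernel release (default: the running kernel)
# has a major version below 5.
# kernel_is_pre_v5 is a hypothetical helper name.
kernel_is_pre_v5() {
    release="${1:-$(uname -r)}"
    # Strip everything from the first dot onwards, leaving the major version
    major="${release%%.*}"
    [ "$major" -lt 5 ]
}

if kernel_is_pre_v5; then
    echo "Kernel $(uname -r) is pre-v5.x"
else
    echo "Kernel $(uname -r) is v5.x or newer"
fi
```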

dukeofphilberg commented 1 year ago

Hi,

Member of the Velbus team here. Just wanted to add that we thought it could be an issue between the OS and snapd. We know that snaps are very reliant on some features which may not be present on non-Ubuntu systems, or not configured the way they are on Ubuntu systems (i.e. systemd/AppArmor).

MDAR commented 1 year ago

<This can sadly not be upgraded in place, since the new bootloader requires a different partitioning.>

Right then... that may well be the issue.

Looks like I'll be recompiling QLCplus on a fresh image then.

Everything else is very easy to restore.

My only query is why it worked before the upgrade from V6 to V8, but to be honest, I don't have time to worry about it. It'll be quicker to start again.

MichaIng commented 1 year ago

@pvanloo Thanks for chiming in.

Generally snapd works well on DietPi; we use it for MicroK8s. DietPi also does not do any deeper system(d) configuration which would affect this. The kernel cmdline arguments mentioned above, which are required in some cases, are common: whether they are needed depends on kernel defaults, and they are typically needed on legacy kernel versions when the container engine uses cgroups, as we know from Docker and Kubernetes. The AppArmor and SELinux userland parts are not installed by default on DietPi. If users install them, then of course they must take care that proper rules or exceptions are added for the services that require them. But that would be exactly the same on plain Ubuntu and Debian.

However, to investigate further, I'd need more insight into how/where /tmp/snap.rootfs_* is supposed to be created and how it actually is, and the same for /snap/core18/current/etc/nsswitch.conf. Aside from MicroK8s (where it was trivial, working OOTB), I simply have no experience with snapd.
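A rough sketch of how that information could be gathered on the failing system follows. The helper names are made up; `snap version` and `snap debug confinement` are existing snap subcommands, but whether their output explains the failed bind mount is only an assumption. Since /tmp/snap.rootfs_* may only exist briefly, it can well show up as missing.

```shell
#!/bin/sh
# Print ownership and permissions for each given path, or note that it is missing.
# describe_path and collect_snap_info are hypothetical helper names.
describe_path() {
    for p in "$@"; do
        if [ -e "$p" ] || [ -L "$p" ]; then
            ls -ld "$p"
        else
            echo "missing: $p"
        fi
    done
}

# Gather the paths from the error message plus basic snapd details.
collect_snap_info() {
    # An unmatched glob is passed through literally and reported as missing
    describe_path /tmp /snap/core18/current/etc/nsswitch.conf /tmp/snap.rootfs_*
    if command -v snap >/dev/null 2>&1; then
        snap version            # snapd, series and kernel versions
        snap debug confinement  # confinement level the system supports
    fi
}
```

Running `collect_snap_info` as root right after a failed `snap install` gives output that can be pasted into the issue.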

MichaIng commented 1 year ago

It actually works fine here, tested on a DietPi x86_64 VM:

root@VM-Bullseye:~# snap install core18
Warning: /snap/bin was not found in your $PATH. If you've not restarted your session since you
         installed snapd, try doing that. Please see https://forum.snapcraft.io/t/9469 for more
         details.

core18 20221103 from Canonical✓ installed
root@VM-Bullseye:~# snap install core
2022-12-05T20:49:32+01:00 INFO Waiting for automatic snapd restart...
Warning: /snap/bin was not found in your $PATH. If you've not restarted your session since you
         installed snapd, try doing that. Please see https://forum.snapcraft.io/t/9469 for more
         details.

core 16-2.57.6 from Canonical✓ installed
root@VM-Bullseye:~# snap install velbus-tcp
Warning: /snap/bin was not found in your $PATH. If you've not restarted your session since you
         installed snapd, try doing that. Please see https://forum.snapcraft.io/t/9469 for more
         details.

velbus-tcp 1.5.1 from Velbus installed

I also ran the other commands from your script, and the service is up and listening:

root@VM-Bullseye:~# snap connect velbus-tcp:raw-usb :raw-usb
root@VM-Bullseye:~# snap set velbus-tcp serial.port=/dev/serial/by-id/usb-Velleman_Projects_VMB1USB_Velbus_USB_interface-if00 serial.autodiscover=false ntp.enabled=true tcp.host=0.0.0.0,127.0.0.1 tcp.port=27015,6000 tcp.relay=true,true tcp.ssl=true,false tcp.auth=true,false tcp.authkey=velbus,
2022-12-05T20:53:07+01:00 INFO task ignored
root@VM-Bullseye:~# snap enable velbus-tcp
error: cannot enable "velbus-tcp": snap "velbus-tcp" already enabled
root@VM-Bullseye:~# snap get velbus-tcp -d
{
        "logging": {
                "output": "stream",
                "type": "info"
        },
        "ntp": {
                "enabled": true,
                "synctime": ""
        },
        "serial": {
                "autodiscover": false,
                "port": "/dev/serial/by-id/usb-Velleman_Projects_VMB1USB_Velbus_USB_interface-if00"
        },
        "tcp": {
                "auth": "true,false",
                "authkey": "velbus,",
                "cert": "/var/snap/velbus-tcp/common/certificate.pem",
                "host": "0.0.0.0,127.0.0.1",
                "pk": "/var/snap/velbus-tcp/common/privkey.pem",
                "port": "27015,6000",
                "relay": "true,true",
                "ssl": "true,false"
        }
}
root@VM-Bullseye:~# ss -tlpn
State          Recv-Q         Send-Q                 Local Address:Port                  Peer Address:Port        Process
LISTEN         0              0                          127.0.0.1:6000                       0.0.0.0:*            users:(("python3",pid=2342,fd=7))
LISTEN         0              1000                         0.0.0.0:22                         0.0.0.0:*            users:(("dropbear",pid=160,fd=3))
LISTEN         0              0                            0.0.0.0:27015                      0.0.0.0:*            users:(("python3",pid=2342,fd=3))
LISTEN         0              1000                            [::]:22                            [::]:*            users:(("dropbear",pid=160,fd=4))

The only related directory in /tmp is /tmp/snap-private-tmp/snap.velbus-tcp/tmp/, and it is empty. So /tmp/snap.rootfs_* is probably only created temporarily on snap startup, to allow mounting things inside?

MDAR commented 1 year ago

I've built a few machines from scratch now and none of them have this issue.

I'll put it down to a freak setup that came from updating DietPi in place rather than starting fresh with Bullseye.

Let's close this and pretend it never happened.

MichaIng commented 1 year ago

Probably some kernel feature was missing that recent snapd versions rely on to set up and grant access to this tmp dir.