canonical / ubuntu-frame

The foundation for many embedded graphical display implementations
GNU General Public License v3.0

aa_getpeercon() failed for process: Protocol not available #167

Closed Ziris85 closed 4 months ago

Ziris85 commented 7 months ago

Hello! To begin with, I think this issue might be closely related to #77, though in my case I'm running the latest version of Debian with AppArmor in full enforcing mode, and I'm seeing a very similar, though slightly different, error:

root@dashboard:~# snap logs -f ubuntu-frame
2024-02-08T19:27:01-05:00 ubuntu-frame.daemon[85340]: [2024-02-08 19:27:01.047277] <information> frame: aa_getpeercon() failed for process 101766: Protocol not available
2024-02-08T19:27:01-05:00 ubuntu-frame.daemon[85340]: [2024-02-08 19:27:01.047444] <information> frame: aa_getpeercon() failed for process 101766: Protocol not available
2024-02-08T19:27:01-05:00 ubuntu-frame.daemon[85340]: [2024-02-08 19:27:01.047518] <information> frame: aa_getpeercon() failed for process 101766: Protocol not available
2024-02-08T19:27:01-05:00 ubuntu-frame.daemon[85340]: [2024-02-08 19:27:01.047599] <information> frame: aa_getpeercon() failed for process 101766: Protocol not available
2024-02-08T19:27:01-05:00 ubuntu-frame.daemon[85340]: [2024-02-08 19:27:01.047665] <information> frame: aa_getpeercon() failed for process 101766: Protocol not available

root@dashboard:~# aa-enabled 
Yes

root@dashboard:~# aa-status |grep -E ^[0-9]
57 profiles are loaded.
55 profiles are in enforce mode.
2 profiles are in complain mode.
0 profiles are in kill mode.
0 profiles are in unconfined mode.
10 processes have profiles defined.
10 processes are in enforce mode.
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.
0 processes are in mixed mode.
0 processes are in kill mode.

Issue is the same as the other one, however, in that the osk snap isn't working properly. The frame (I'm using wpe-webkit-mir-kiosk here too) is otherwise working fine, as far as I can tell. I also installed using the pretty simple steps for ubuntu-frame and ubuntu-frame-osk. Versions at play on my system:

root@dashboard:~# snap list|grep frame
ubuntu-frame          121-mir2.15.0           7252   22/stable      canonical**  -
ubuntu-frame-osk      49-squeekboard-v1.17.1  372    22/stable      canonical**  -
root@dashboard:~# cat /etc/os-release 
PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Any more info I can provide to help debug this, let me know! Thanks!

Saviq commented 6 months ago

@Ziris85 hey, the Ubuntu AppArmor is quite heavily patched to support snaps. I expect there's something incompatible between libapparmor in the snap and the Debian kernel AppArmor implementation.

Looking into dmesg might shed some light.

Either way, we have the authorise-without-apparmor option for this case:

https://mir-server.io/docs/ubuntu-frame-configuration-options#heading--snap-configuration-config

AlanGriffiths commented 6 months ago

> Issue is the same as the other one, however, in that the osk snap isn't working properly

This does sound like a difference in AppArmor between Ubuntu and Debian. We would need to understand what has changed and likely change the Frame snap to fix it.

There are, however, a couple of things you can try as a workaround before that happens:

  1. Add authorise-without-apparmor=true to the Frame configuration. I'm not sure this will work with "Protocol not available", but worth a try. Or, if this doesn't work:
  2. Add add-wayland-extensions=zwlr_layer_shell_v1:zwp_virtual_keyboard_manager_v1:zwp_text_input_manager_v3 to the Frame configuration.

More details on configuring Frame here, or the TL;DR:

snap set ubuntu-frame config "
authorise-without-apparmor=true
"

Ziris85 commented 6 months ago

Hey there @Saviq @AlanGriffiths, thanks for getting back to me! I tried authorise-without-apparmor=true, and that had no effect on either the log complaints or the behavior. I next tried the add-wayland-extensions=zwlr_layer_shell_v1:zwp_virtual_keyboard_manager_v1:zwp_text_input_manager_v3 config, and this seems like it's on the right track? The OSK service starts now, but it's not working as intended - all that happens is a persistent, empty gradient box on the display (no keyboard), and clicking into text fields doesn't produce a keyboard. If I snap restart ubuntu-frame-osk, it toggles between the empty gradient box and nothing displayed at all heh. I am, however, getting a new warning that wasn't there before each time I restart the OSK, in addition to the flurry of AppArmor complaints:

2024-02-09T19:41:37-05:00 ubuntu-frame.daemon[11048]: [2024-02-09 19:41:37.920820] <information> frame: aa_getpeercon() failed for process 14206: Protocol not available
2024-02-09T19:41:38-05:00 ubuntu-frame.daemon[11048]: [2024-02-09 19:41:38.133358] < -warning- > mirserver: Ignoring layer shell protocol violation: wl_surface destroyed before associated zwlr_layer_surface_v1@26
2024-02-09T19:41:38-05:00 ubuntu-frame.daemon[11048]: [2024-02-09 19:41:38.257292] <information> frame: aa_getpeercon() failed for process 11258: Protocol not available

I'm not sure if this is related to the OSK still not showing up or not?

AlanGriffiths commented 6 months ago

@Ziris85 thanks for the update.

Part of the problem you describe now ("all that happens is a persistent, empty gradient box on the display (no keyboard)") sounds similar to https://github.com/canonical/ubuntu-frame-osk/issues/58. Could you try the version of Ubuntu Frame that's now in testing?

snap refresh ubuntu-frame --candidate

(I think the fix for https://github.com/canonical/ubuntu-frame-osk/issues/58 is already in 2.15, but want to be sure.)

The other part of the problem "clicking into text fields doesn't produce a keyboard" could be a problem with the client app (that needs to request the keyboard when you click on the text field). What app are you testing with?

zyga commented 5 months ago

I've reproduced this and am looking at it from the snapd perspective.

zyga commented 5 months ago

With the help of strace I got this:

9500  socketpair(AF_UNIX, SOCK_STREAM, 0, [25, 26]) = 0
9500  getsockopt(25, SOL_SOCKET, SO_PEERSEC, 0x55d07b029af0, [128]) = -1 ENOPROT

I had a look at what the kernel in Debian bookworm (12) says. At the time of writing this is 6.1.0-17.

This toy C program:

#include <stdio.h>
#include <stdlib.h>

#include <sys/socket.h>

int main(void) {
  /* Create a connected pair of UNIX stream sockets. */
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
    perror("socketpair");
    return EXIT_FAILURE;
  }

  /* Ask for the peer's security label, the same getsockopt() call
   * that fails in the strace output above. */
  char sec_buf[128];
  socklen_t sec_buf_len = sizeof(sec_buf);

  if (getsockopt(sv[0], SOL_SOCKET, SO_PEERSEC, sec_buf, &sec_buf_len) < 0) {
    perror("getsockopt SOL_SOCKET SO_PEERSEC");
  }

  return 0;
}

Fails as follows:

$ ./a.out 
getsockopt SOL_SOCKET SO_PEERSEC: Protocol not available

Similarly strace shows:

socketpair(AF_UNIX, SOCK_STREAM, 0, [3, 4]) = 0
getsockopt(3, SOL_SOCKET, SO_PEERSEC, 0x7ffe8a2c7b60, [128]) = -1 ENOPROTOOPT (Protocol not available)

As such I don't believe this particular failure is related to the snap sandbox. It seems that it is just not implemented in that particular kernel yet. As a quick sanity check, the program works fine on Fedora 39 with kernel 6.7.7.

zyga commented 5 months ago

Looking at the kernel some more, in 6.1 this is the implementation:

int security_socket_getpeersec_stream(struct socket *sock, char __user *optval,
              int __user *optlen, unsigned len)
{
  return call_int_hook(socket_getpeersec_stream, -ENOPROTOOPT, sock,
        optval, optlen, len);
}

So I suspect that unless there's an LSM hook present, the error is really ENOPROTOOPT. Given the extensive patches for AF_UNIX security features that are in the Ubuntu kernel, I think this is really the cause.

EDIT: correction, I think there's something I'm missing here, the hook is present in 6.1:

static int apparmor_socket_getpeersec_stream(struct socket *sock,
               char __user *optval,
               int __user *optlen,
               unsigned int len)
{
  char *name;
  int slen, error = 0;
  struct aa_label *label;
  struct aa_label *peer;

  label = begin_current_label_crit_section();
  peer = sk_peer_label(sock->sk);
  if (IS_ERR(peer)) {
    error = PTR_ERR(peer);
    goto done;
  }
  slen = aa_label_asxprint(&name, labels_ns(label), peer,
         FLAG_SHOW_MODE | FLAG_VIEW_SUBNS |
         FLAG_HIDDEN_UNCONFINED, GFP_KERNEL);
  /* don't include terminating \0 in slen, it breaks some apps */
  if (slen < 0) {
    error = -ENOMEM;
  } else {
    if (slen > len) {
      error = -ERANGE;
    } else if (copy_to_user(optval, name, slen)) {
      error = -EFAULT;
      goto out;
    }
    if (put_user(slen, optlen))
      error = -EFAULT;
out:
    kfree(name);

  }

done:
  end_current_label_crit_section(label);

  return error;
}

The helper is:

static struct aa_label *sk_peer_label(struct sock *sk)
{
  struct aa_sk_ctx *ctx = SK_CTX(sk);

  if (ctx->peer)
    return ctx->peer;

  return ERR_PTR(-ENOPROTOOPT);
}

zyga commented 5 months ago

Still digging some more, I switched the snap.ubuntu-frame.ubuntu-frame profile to complain mode and ran some more tests to avoid the case of unconfined processes observing something else:

$ aa-exec -p snap.ubuntu-frame.ubuntu-frame cat /proc/self/attr/apparmor/current 
snap.ubuntu-frame.ubuntu-frame (complain)

$ aa-exec -p snap.ubuntu-frame.ubuntu-frame  ./a.out 
getsockopt SOL_SOCKET SO_PEERSEC: Protocol not available

zyga commented 5 months ago

I've suspended debugging as there's something more important I'm looking at. I cannot yet say if this is a kernel bug in Debian or just something that ought to be handled in the app.

AlanGriffiths commented 5 months ago

From a practical perspective, we can encounter this ENOPROTOOPT error on non-Ubuntu distros. We should probably treat it in the same way as EINVAL and enable use of the authorise-without-apparmor option.
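
A minimal sketch of that idea, assuming the peer-label lookup reports its failure as an errno value. The function and enum names are hypothetical and this is not the code from the Frame snap; it only illustrates grouping ENOPROTOOPT with EINVAL as "AppArmor peer labels unavailable".

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical classification of the result of a peer-label lookup. */
enum peer_lookup {
  PEER_LOOKUP_OK,          /* a label was returned and can be checked */
  PEER_LOOKUP_UNAVAILABLE, /* kernel cannot provide AppArmor peer labels */
  PEER_LOOKUP_ERROR        /* any other failure */
};

static enum peer_lookup classify_peer_errno(int err) {
  switch (err) {
  case 0:
    return PEER_LOOKUP_OK;
  case EINVAL:      /* already treated as "no AppArmor" per the comment above */
  case ENOPROTOOPT: /* what the Debian kernel in this issue reports */
    return PEER_LOOKUP_UNAVAILABLE;
  default:
    return PEER_LOOKUP_ERROR;
  }
}

/* The label check may only be skipped when labels are unavailable AND
 * the user opted in with authorise-without-apparmor=true. */
static bool may_skip_label_check(int err, bool authorise_without_apparmor) {
  return authorise_without_apparmor &&
         classify_peer_errno(err) == PEER_LOOKUP_UNAVAILABLE;
}
```

With this shape, an unexpected error such as EACCES never bypasses the check, even when the option is set.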

AlanGriffiths commented 5 months ago

@Ziris85, there's a build with a potential workaround that you could try with:

snap install --channel=edge/pr175 ubuntu-frame
snap set ubuntu-frame config "
authorise-without-apparmor=true
"

(Or refresh if already installed)

It will be sometime next week before I have time to test for myself.

AlanGriffiths commented 4 months ago

Hmm, what's the easiest way to test this? I have an LXC instance of debian/buster, but that doesn't replicate the problem (as I might have expected if I thought about it).

Saviq commented 4 months ago

> Hmm, what's the easiest way to test this? I have an LXC instance of debian/buster, but that doesn't replicate the problem (as I might have expected if I thought about it).

--vm to lxc launch should do?

AlanGriffiths commented 4 months ago

> --vm to lxc launch should do?

Sounds plausible, but that doesn't reproduce either.

AlanGriffiths commented 4 months ago

OK, reproduced on bare metal. But the fix doesn't seem to work as expected. More tomorrow!

AlanGriffiths commented 4 months ago

> OK, reproduced on bare metal. But the fix doesn't seem to work as expected. More tomorrow!

Testing the right thing does work though. :tada:

tbanetwork commented 4 months ago

Hi!

@AlanGriffiths I'm not 100% sure what the fix for the issue is now. Did you find one, and could you guide me to the correct settings?

Thanks, tbanetwork

AlanGriffiths commented 4 months ago

@tbanetwork sorry if it wasn't clear: adding authorise-without-apparmor=true to the config option (as described above) now works with Debian's version of AppArmor.

tbanetwork commented 4 months ago

@AlanGriffiths thanks for getting back to me and pointing me in the right direction.