TritonDataCenter / illumos-joyent

Community developed and maintained version of the OS/Net consolidation
http://www.illumos.org/projects/illumos-gate
265 stars 111 forks source link

OS-8543 Want support for OFD locks in lx-brand #467

Closed smokris closed 1 month ago

smokris commented 2 months ago

Problem

On Debian 12 with backports enabled (or Debian Trixie or Debian Sid) running in SmartOS LX, systemd-sysusers fails after updating systemd to v254+, leading to failures in apt upgrade.

Steps to reproduce

Analysis

In systemd v253 and earlier, systemd's file-locking code would try to obtain an Open File Description (OFD) lock via fcntl(…F_OFD_SETLKW…), and when that failed it would fall back to flock(…F_SETLKW…). However, systemd v254 dropped the fallback, so it now requires support for OFD locks.

SmartOS LX does not support OFD locks, so systemd-sysusers v254 fails lock /etc/passwd, which in turn causes Debian apt upgrade to fail.

Proposed change

Update SmartOS LX's fcntl wrapper to use SmartOS's native support for OFD locks.

I have locally built SmartOS with this change, and confirmed that this change enables systemd v254+'s systemd-sysusers to run successfully, and thus enables apt upgrade to complete.

Before:

# systemd-sysusers
Failed to take /etc/passwd lock: Invalid argument
# echo $?
1
# strace systemd-sysusers
…
fcntl(3, F_OFD_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = -1 EINVAL (Invalid argument)
…

After:

# systemd-sysusers
# echo $?
0
# strace systemd-sysusers
…
fcntl(3, F_OFD_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = 0
…
bahamat commented 2 months ago

Filed internally as OS-8543.

danmcd commented 1 month ago

This could be one of those cases where we lie and just work, but given the description in glibc, I wonder if we need to keep some brand state about this or something.

Apologies for not looking more deeply. We actually have some of this already in the base OS (aka OS-2868 upstreamed to -gate as illumos#3252), with some caveats. Here's text from the illumos bug:

As part of this Keith Wesolowski originally did a survey of the various differences
between POSIX locks set through fcntl, lockf, flock, and the newer GNU OFD
(open file descriptor) locks which are trying to tackle and more broadly standardize
the notion of locks being set upon an open file descriptor. Based on this, he
determined that a flock is basically a specialized type of OFD, hence implementation
of similar APIs here.

So maybe we're Good Enough in the illumos#3252 support?

danmcd commented 1 month ago

Additionally @smokris ==> Please inspect $UTS/common/syscall/fcntl.c to make sure the differences in LX's fcntl() vs. native's aren't further apart than Linux apps expect them to be.

danmcd commented 1 month ago

Per an offline conversation with @smokris , I'm going to take over guiding this PR in. Depending on how cooperative github is, I may need to close this and open a new one. We'll see.

danmcd commented 1 month ago

Closing PR in lieu of #471