SUPERCILEX / clipboard-history

Ringboard—the clipboard manager for Linux
Apache License 2.0

ringboard-server fails to start: "Failed to open pressure file" #28

Closed: vi closed this issue 2 months ago

vi commented 2 months ago
$ target/myrelease/ringboard-server --help
Error: an I/O error occurred
│
╰─▶ No such file or directory (os error 2)
    ╰╴Failed to open pressure file: "/sys/fs/cgroup/\n13:rdma:/\n12:pids:/\n11:hugetlb:/\n10:net_prio:/\n9:perf_event:/\n8:net_cls:/\n7:freezer:/\n6:devices:/\n5:memory:/\n4:blkio:/\n3:cpuacct:/\n2:cpu:/\n1:cpuset:/memory.pressure"

Does it expect cgroup v2 instead of v1?

Is there a keep-it-simple mode without memory pressure monitoring? What if somebody is running a Linux kernel without cgroup support at all?

SUPERCILEX commented 2 months ago

I don't see how you'd be able to disable cgroups. What's the output of cat /proc/self/cgroup?

SUPERCILEX commented 2 months ago

Oh actually I think I have a fix.

vi commented 2 months ago
$ cat /proc/self/cgroup
14:misc:/
13:rdma:/
12:pids:/
11:hugetlb:/
10:net_prio:/
9:perf_event:/
8:net_cls:/
7:freezer:/
6:devices:/
5:memory:/
4:blkio:/
3:cpuacct:/
2:cpu:/
1:cpuset:/

SUPERCILEX commented 2 months ago

You're on cgroup v1. We could support that eventually but eh. For now let's just not crash.
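For illustration, here is a minimal Rust sketch of telling the two layouts apart from the contents of /proc/self/cgroup. It is not Ringboard's actual code: the function name `pressure_file_from_proc_cgroup` is hypothetical, and it assumes cgroup v2 is mounted at /sys/fs/cgroup (so it does not cover hybrid setups). A cgroup v2 entry is the line with hierarchy ID 0 and an empty controller list (`0::/path`); a v1-only listing like the one above has no such line, so the sketch returns `None` instead of building a path out of the whole file.

```rust
use std::path::PathBuf;

/// Returns the memory.pressure path for the cgroup v2 entry found in the
/// given `/proc/self/cgroup` contents, or `None` when only v1 hierarchies
/// are listed. Hypothetical helper; assumes cgroup2 is mounted at
/// /sys/fs/cgroup.
pub fn pressure_file_from_proc_cgroup(contents: &str) -> Option<PathBuf> {
    // A cgroup v2 entry looks like `0::/some/cgroup/path`.
    let rel = contents
        .lines()
        .find_map(|line| line.strip_prefix("0::"))?;

    let mut path = PathBuf::from("/sys/fs/cgroup");
    // Drop the leading slash so `push` appends instead of replacing the base.
    path.push(rel.trim_start_matches('/'));
    path.push("memory.pressure");
    Some(path)
}
```

Returning an `Option` lets the caller decide what "no v2 entry" means (skip the low-memory listener, log, or fail) rather than crashing while building the path.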

AguirreIF commented 2 months ago

Hi, sorry for posting to a closed issue, but I'm having the same problem:

$ ringboard-server --help
Error: an I/O error occurred
│
╰─▶ Permission denied (os error 13)
    ╰╴Failed to open pressure file: /sys/fs/cgroup/user.slice/user-1000.slice/session-1.scope/memory.pressure

Contents of /proc/self/cgroup:

$ cat /proc/self/cgroup
0::/user.slice/user-1000.slice/session-1.scope

I followed the installation instructions.

Thanks!

vi commented 2 months ago

Indeed, I expected it to only warn on any errors regarding memory pressure monitoring, but the commit only changed how the contents of the file are parsed.

SUPERCILEX commented 2 months ago

I wonder if you're in hybrid mode. Can you run grep cgroup2 /proc/mounts and stat -f /sys/fs/cgroup/unified and see what those output?
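The same check as `grep cgroup2 /proc/mounts` can be sketched in Rust. This is only an illustration with a hypothetical function name (`cgroup2_mount_point`), not something taken from the Ringboard source. It scans /proc/mounts-style text for the first filesystem of type `cgroup2` and returns its mount point: in a pure v2 setup that would typically be /sys/fs/cgroup, and in hybrid mode /sys/fs/cgroup/unified.

```rust
/// Returns the mount point of the first cgroup2 filesystem listed in
/// `/proc/mounts`-style contents, if any. Hypothetical helper.
pub fn cgroup2_mount_point(mounts: &str) -> Option<&str> {
    mounts.lines().find_map(|line| {
        // Fields: device, mount point, filesystem type, options, dump, pass.
        let mut fields = line.split_whitespace();
        let _device = fields.next()?;
        let mount_point = fields.next()?;
        let fs_type = fields.next()?;
        // Lines of another type (or with too few fields) are skipped.
        (fs_type == "cgroup2").then_some(mount_point)
    })
}
```

A caller would pass in the result of `std::fs::read_to_string("/proc/mounts")` and treat `None` as "no cgroup v2 hierarchy mounted".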

SUPERCILEX commented 2 months ago

Also can you try installing dbus-user-session with your package manager and then rebooting to see if it's fixed? Based on this thread it seems like you have one global cgroup instead of one per process.

SUPERCILEX commented 2 months ago

Actually never mind, I think I need to make a slice for myself.

SUPERCILEX commented 2 months ago

Ok, can you run these commands (assuming you use systemd?):

curl -s https://raw.githubusercontent.com/SUPERCILEX/clipboard-history/master/ringboard.slice --create-dirs -O --output-dir ~/.config/systemd/user/
curl -s https://raw.githubusercontent.com/SUPERCILEX/clipboard-history/master/server/ringboard-server.service --create-dirs -O --output-dir ~/.config/systemd/user/
sed -i "s|ExecStart=ringboard-server|ExecStart=$(which ringboard-server)|g" ~/.config/systemd/user/ringboard-server.service
curl -s https://raw.githubusercontent.com/SUPERCILEX/clipboard-history/master/$XDG_SESSION_TYPE/ringboard-$XDG_SESSION_TYPE.service -O --output-dir ~/.config/systemd/user/
sed -i "s|ExecStart=ringboard-$XDG_SESSION_TYPE|ExecStart=$(which ringboard-$XDG_SESSION_TYPE)|g" ~/.config/systemd/user/ringboard-$XDG_SESSION_TYPE.service
systemctl --user daemon-reload
systemctl --user restart ringboard-server

If that doesn't work, try modifying ~/.config/systemd/user/ringboard.slice to include

[Slice]
Delegate=true

and then

systemctl --user daemon-reload
systemctl --user restart ringboard-server

to see if it works.

AguirreIF commented 2 months ago

I wonder if you're in hybrid mode. Can you run grep cgroup2 /proc/mounts and stat -f /sys/fs/cgroup/unified and see what those output?

Yes, here they are:

$ grep cgroup2 /proc/mounts
cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot 0 0

There is no file /sys/fs/cgroup/unified, this is the content of the directory /sys/fs/cgroup/:

$ ls -l /sys/fs/cgroup/
total 0
-r--r--r--  1 root root 0 21 ago/24 11:48 cgroup.controllers
-rw-r--r--  1 root root 0 21 ago/24 15:28 cgroup.max.depth
-rw-r--r--  1 root root 0 21 ago/24 15:28 cgroup.max.descendants
-rw-r--r--  1 root root 0 21 ago/24 15:28 cgroup.pressure
-rw-r--r--  1 root root 0 21 ago/24 11:48 cgroup.procs
-r--r--r--  1 root root 0 21 ago/24 15:28 cgroup.stat
-rw-r--r--  1 root root 0 21 ago/24 11:48 cgroup.subtree_control
-rw-r--r--  1 root root 0 21 ago/24 15:28 cgroup.threads
-rw-r--r--  1 root root 0 21 ago/24 15:28 cpu.pressure
-r--r--r--  1 root root 0 21 ago/24 15:28 cpuset.cpus.effective
-r--r--r--  1 root root 0 21 ago/24 15:28 cpuset.mems.effective
-r--r--r--  1 root root 0 21 ago/24 15:28 cpu.stat
-r--r--r--  1 root root 0 21 ago/24 15:28 cpu.stat.local
drwxr-xr-x  2 root root 0 21 ago/24 11:48 dev-hugepages.mount
drwxr-xr-x  2 root root 0 21 ago/24 11:48 dev-mqueue.mount
drwxr-xr-x  2 root root 0 21 ago/24 11:48 init.scope
-rw-r--r--  1 root root 0 21 ago/24 15:28 io.cost.model
-rw-r--r--  1 root root 0 21 ago/24 15:28 io.cost.qos
-rw-r--r--  1 root root 0 21 ago/24 15:28 io.pressure
-r--r--r--  1 root root 0 21 ago/24 15:28 io.stat
-r--r--r--  1 root root 0 21 ago/24 15:28 memory.numa_stat
-rw-r--r--  1 root root 0 21 ago/24 15:28 memory.pressure
--w-------  1 root root 0 21 ago/24 15:28 memory.reclaim
-r--r--r--  1 root root 0 21 ago/24 15:28 memory.stat
-r--r--r--  1 root root 0 21 ago/24 15:28 misc.capacity
-r--r--r--  1 root root 0 21 ago/24 15:28 misc.current
drwxr-xr-x  2 root root 0 21 ago/24 11:48 proc-sys-fs-binfmt_misc.mount
drwxr-xr-x  2 root root 0 21 ago/24 11:48 sys-fs-fuse-connections.mount
drwxr-xr-x  2 root root 0 21 ago/24 11:48 sys-kernel-config.mount
drwxr-xr-x  2 root root 0 21 ago/24 11:48 sys-kernel-debug.mount
drwxr-xr-x  2 root root 0 21 ago/24 11:48 sys-kernel-tracing.mount
drwxr-xr-x 32 root root 0 21 ago/24 15:27 system.slice
drwxr-xr-x  3 root root 0 21 ago/24 11:48 user.slice

Thanks!

SUPERCILEX commented 2 months ago

Ok, you have the same config as me so I think the slice thing should fix it.

Oh and to see if the server started properly run systemctl --user status ringboard-server and check logs with

sudo journalctl `which ringboard-server`

vi commented 2 months ago

Even with the recent fix, I think that cgroups should not be a dependency of Ringboard and any failure to open the pressure file should be handled gracefully.

AguirreIF commented 2 months ago

Ok, you have the same config as me so I think the slice thing should fix it.

Oh and to see if the server started properly run systemctl --user status ringboard-server and check logs with

sudo journalctl `which ringboard-server`

Thanks for the quick reply, but it didn't work. Here is the output of systemctl --user status ringboard-server:

$ systemctl --user status ringboard-server
× ringboard-server.service - Ringboard server
     Loaded: loaded (/home/user/.config/systemd/user/ringboard-server.service; static)
     Active: failed (Result: exit-code) since Wed 2024-08-21 15:39:48 -03; 3min 32s ago
 Invocation: 37c6930a5b0a4ceeb75e04bc6e8fd0f6
       Docs: https://github.com/SUPERCILEX/clipboard-history
    Process: 13539 ExecStart=ringboard-server (code=exited, status=203/EXEC)
   Main PID: 13539 (code=exited, status=203/EXEC)

ago 21 15:39:48 who systemd[1560]: ringboard-server.service: Scheduled restart job, restart counter is at 5.
ago 21 15:39:48 who systemd[1560]: ringboard-server.service: Start request repeated too quickly.
ago 21 15:39:48 who systemd[1560]: ringboard-server.service: Failed with result 'exit-code'.
ago 21 15:39:48 who systemd[1560]: Failed to start ringboard-server.service - Ringboard server.

And the output of journalctl --user -xeu ringboard-server.service:

$ journalctl --user -xeu ringboard-server.service
The error number returned by this process is ERRNO.
ago 21 15:39:48 who (d-server)[13539]: ringboard-server.service: Failed at step EXEC spawning ringboard-server: No such file or directory
Subject: Process ringboard-server could not be executed
Defined-By: systemd
Support: https://www.debian.org/support

The process ringboard-server could not be executed and failed.

The error number returned by this process is ERRNO.
ago 21 15:39:48 who systemd[1560]: ringboard-server.service: Main process exited, code=exited, status=203/EXEC
Subject: Unit process exited
Defined-By: systemd
Support: https://www.debian.org/support

An ExecStart= process belonging to unit UNIT has exited.

The process' exit code is 'exited' and its exit status is 203.
ago 21 15:39:48 who systemd[1560]: ringboard-server.service: Failed with result 'exit-code'.
Subject: Unit failed
Defined-By: systemd
Support: https://www.debian.org/support

The unit UNIT has entered the 'failed' state with result 'exit-code'.
ago 21 15:39:48 who systemd[1560]: Failed to start ringboard-server.service - Ringboard server.
Subject: A start job for unit UNIT has failed
Defined-By: systemd
Support: https://www.debian.org/support

A start job for unit UNIT has finished with a failure.

The job identifier is 519 and the job result is failed.
ago 21 15:39:48 who systemd[1560]: ringboard-server.service: Scheduled restart job, restart counter is at 5.
Subject: Automatic restarting of a unit has been scheduled
Defined-By: systemd
Support: https://www.debian.org/support

Automatic restarting of the unit UNIT has been scheduled, as the result for
the configured Restart= setting for the unit.
ago 21 15:39:48 who systemd[1560]: ringboard-server.service: Start request repeated too quickly.
ago 21 15:39:48 who systemd[1560]: ringboard-server.service: Failed with result 'exit-code'.
Subject: Unit failed
Defined-By: systemd
Support: https://www.debian.org/support

The unit UNIT has entered the 'failed' state with result 'exit-code'.
ago 21 15:39:48 who systemd[1560]: Failed to start ringboard-server.service - Ringboard server.
Subject: A start job for unit UNIT has failed
Defined-By: systemd
Support: https://www.debian.org/support

A start job for unit UNIT has finished with a failure.

The job identifier is 538 and the job result is failed.

BUT I got it working by changing the ExecStart line back to:

/bin/sh -c 'PATH=~/.cargo/bin:$PATH exec ringboard-server'

I have no idea why that works :(

SUPERCILEX commented 2 months ago

@vi yes, handling cgroups gracefully means using them correctly. You're suggesting blindly ignoring the failures which is bad engineering. If I'm going to ignore an error, I want to know why the error occurs and I'll only ignore it if it's something outside of my control. Otherwise, it means I caused the error and it's my bug.

SUPERCILEX commented 2 months ago

@AguirreIF did you run the sed commands? The error there is because the binary ringboard-server wasn't found, which is normal. The sed commands are supposed to inline the absolute path. Anyway, does the service start up normally with the /bin/sh path stuff (or an absolute path)? If so, then the error is fixed, right? I assume your original error was that the service refused to start with the cgroup perms error?

AguirreIF commented 2 months ago

@AguirreIF did you run the sed commands? The error there is because the binary ringboard-server wasn't found, which is normal. The sed commands are supposed to inline the absolute path. Anyway, does the service start up normally with the /bin/sh path stuff (or an absolute path)? If so, then the error is fixed, right? I assume your original error was that the service refused to start with the cgroup perms error?

Yes, now it works fine, I was missing an absolute path, sorry :(

Thanks very much for such an amazing clipboard manager (I was using clipster until yesterday).

SUPERCILEX commented 2 months ago

No worries and thanks for testing this!

I think this will still be broken when used outside of systemd if the current cgroup is owned by root, but I kinda want to say that's user error. Will wait for someone to run into that case so I can understand it better, but that's where @vi's approach of ignoring the failure might make sense as creating a user cgroup just to run the server is going to be too much of a hassle. Seems like basically everything uses systemd though so eh: https://en.wikipedia.org/wiki/Systemd#Adoption

vi commented 2 months ago

I currently use Ringboard without systemd and (mostly) without dbus.

SUPERCILEX commented 2 months ago

But you're on cgroup v1 right? So there shouldn't be any issues—I'd want to know why somebody uses cgroup v2 without systemd.

SUPERCILEX commented 2 months ago

Though realistically I'll have to ignore perm denied errors eventually. But I wouldn't have discovered the systemd slice fix if I had just ignored errors from the start! So let's let it bake for a while longer and if somebody comes along with permission errors and they're not using systemd, then I think it's reasonable to ignore the error as we'll have caught most of the bugs in my cgroup code by then.

vi commented 2 months ago

But you're on cgroup v1 right?

Yes, that's probably the default on current Debian Stable.

Previously, I remember mounting cgroups at a nonstandard location (when I was using a self-built kernel and customizing the system more drastically).

I expect things to work without cgroups at all if needed, unless it's a container or security software.

bake for a while longer

Maybe just log them at a higher severity that would be visible by default (including in journalctl logs when using systemd integration).

why somebody uses cgroup v2 without systemd

Maybe somebody would be building their own distro (Linux From Scratch-style) and choosing v2 instead of v1 because of 2 > 1.

Maybe somebody is running an embedded distro with a small GUI (using only ringboard-server, with other custom components) that for some reason includes cgroups, but not the rest of the stack. (Those users can typically just patch the code when needed, but io_uring peculiarities may slow the patching down.)

It is hard to know how users would use your program, even outside Linux Desktop. Within Linux Desktop, custom configurations may be a notable percentage of users, especially when speaking about a program that shows some new approach.

SUPERCILEX commented 2 months ago

I expect things to work without cgroups at all if needed

I sympathize with wanting to customize things (this is a custom clipboard manager after all), but I also feel like cgroups being mounted at /sys/fs/cgroup is a fairly basic requirement.

just log them

Nobody looks at logs when things are working. :)

custom configurations may be a notable percentage of users

Yes ish, but I'd rather deal with that in real issues than hypotheticals. Also like I said I'll eventually add a check for EACCES when trying to open the cgroup file which would then mean all that's required is mounting cgroups at /sys/fs/cgroup.

msirringhaus commented 2 months ago

Just a note: Some distros like openSUSE have the whole memory-module deactivated by default for performance reasons. So even though all other cgroup-files are there, the pressure-file might still be absent.

SUPERCILEX commented 2 months ago

Hmmm, ok if that comes up then I guess we'll catch not found errors.

webstrand commented 1 month ago

Yep, I'm on openSUSE, and this application does not work, even with the slice fix above. I'm on cgroup2, not hybrid, and using the latest install script.

Error: an I/O error occurred
│
╰─▶ No such file or directory (os error 2)
    ╰╴Failed to open pressure file: /sys/fs/cgroup/user.slice/user-1000.slice/session-3.scope/memory.pressure

SUPERCILEX commented 1 month ago

Alright, I think we've got most of the bugs ironed out so we can just make it fully optional. Sadly not going to have access to a computer for the next few weeks so I can't actually fix this.

@vi if you wanted to submit a PR which adds a match statement with a condition on file not found (look around the repo to see where else I've done this) and break out of the scope, I'd accept it.

vi commented 1 month ago

I assume denied access to the file should also cause the fallback, not just a file-not-found error.

SUPERCILEX commented 1 month ago

Yeah I guess we can do EACCES and ENOENT together. I'd say just throw in a warn statement for both: Unable to prepare low memory listener.
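A rough Rust sketch of the fallback being discussed, for illustration only: it is not the code from the eventual fix, the function name `open_pressure_file` is hypothetical, and `eprintln!` stands in for whatever logging the project uses. "Not found" and "permission denied" are downgraded to a warning and reported as "no listener", while any other I/O error is still returned to the caller. The file is opened read-only here to keep the sketch simple.

```rust
use std::fs::File;
use std::io::{self, ErrorKind};
use std::path::Path;

/// Opens the pressure file, treating "not found" and "permission denied" as
/// "no low-memory listener available" rather than as fatal errors.
/// Hypothetical helper; opens read-only for simplicity.
pub fn open_pressure_file(path: &Path) -> io::Result<Option<File>> {
    match File::open(path) {
        Ok(file) => Ok(Some(file)),
        // Missing file (e.g. memory controller disabled) or a cgroup the
        // user cannot access: warn and carry on without the listener.
        Err(e) if matches!(e.kind(), ErrorKind::NotFound | ErrorKind::PermissionDenied) => {
            eprintln!("Unable to prepare low memory listener: {e}");
            Ok(None)
        }
        // Anything else is unexpected and still propagates.
        Err(e) => Err(e),
    }
}
```

Matching on the two specific error kinds, instead of ignoring every failure, keeps unexpected errors visible, which is the concern raised earlier in the thread.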

SUPERCILEX commented 2 weeks ago

Fixed in 36ea5ae.