Closed seqizz closed 4 months ago
Yeah, there is something amiss in the way sd-switch works. That is the reason why we haven't been able to make it the default. I plan to more or less rewrite it relatively soon and hopefully that should make it more robust. At least it should make it give more useful error messages.
At the moment I'm focusing on getting https://github.com/nix-community/home-manager/pull/5024 and https://github.com/nix-community/home-manager/pull/4976 in. After that I'll get on sd-switch.
@seqizz Would you mind trying out the switch-to-zbus branch of sd-switch and see if it gives a more helpful error message?
If you are using a Nix Flake based setup then you can override the existing sd-switch using an overlay, similar to how I do it here: https://git.sr.ht/~rycee/configurations/commit/34b13ff0054a8a3a26b5b74b83fd703fbf467de7#flake.nix
Sadly building it failed with:
last 10 log lines:
> Finished cargoSetupPostPatchHook
> Running phase: updateAutotoolsGnuConfigScriptsPhase
> Running phase: configurePhase
> Running phase: buildPhase
> Executing cargoBuildHook
> ++ env CC_X86_64_UNKNOWN_LINUX_GNU=/nix/store/i6zjqpawh725z1lyg3alglzlabnzbjx7-gcc-wrapper-12.3.0/bin/cc CXX_X86_64_UNKNOWN_LINUX_GNU=/nix/store/i6zjqpawh725z1lyg3alglzlabnzbjx7-gcc-wrapper-12.3.0/bin/c++ CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER=/nix/store/i6zjqpawh725z1lyg3alglzlabnzbjx7-gcc-wrapper-12.3.0/bin/cc CC_X86_64_UNKNOWN_LINUX_GNU=/nix/store/i6zjqpawh725z1lyg3alglzlabnzbjx7-gcc-wrapper-12.3.0/bin/cc CXX_X86_64_UNKNOWN_LINUX_GNU=/nix/store/i6zjqpawh725z1lyg3alglzlabnzbjx7-gcc-wrapper-12.3.0/bin/c++ CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER=/nix/store/i6zjqpawh725z1lyg3alglzlabnzbjx7-gcc-wrapper-12.3.0/bin/cc CARGO_BUILD_TARGET=x86_64-unknown-linux-gnu HOST_CC=/nix/store/i6zjqpawh725z1lyg3alglzlabnzbjx7-gcc-wrapper-12.3.0/bin/cc HOST_CXX=/nix/store/i6zjqpawh725z1lyg3alglzlabnzbjx7-gcc-wrapper-12.3.0/bin/c++ cargo build -j 8 --target x86_64-unknown-linux-gnu --frozen --profile release
> error: package `zvariant_derive v4.0.0` cannot be built because it requires rustc 1.75 or newer, while the currently active rustc version is 1.73.0
> Either upgrade to rustc 1.75 or newer, or use
> cargo update -p zvariant_derive@4.0.0 --precise ver
> where `ver` is the latest version of `zvariant_derive` supporting rustc 1.73.0
I also pointed it at the unstable repo, but maybe I should also override Rust with an even newer version?
@seqizz Hmm, it should build OK with a recent nixpkgs-unstable, I'm using that in my setup:
nixpkgs-unstable.url = "github:NixOS/nixpkgs/nixpkgs-unstable";
sd-switch = {
url = "sourcehut:~rycee/sd-switch/switch-to-zbus";
inputs.nixpkgs.follows = "nixpkgs-unstable";
};
And running inside a nixpkgs-unstable checkout:
$ git log -1
commit f33dd27a47ebdf11dc8a5eb05e7c8fbdaf89e73f (HEAD, origin/nixpkgs-unstable)
Merge: fa15b53dbea5 47abf0334033
Author: Bobby Rong <rjl931189261@126.com>
Date: Tue Feb 20 13:36:14 2024 +0800
Merge pull request #288704 from Aleksanaa/cinnamon.cinnamon-control-center
cinnamon.cinnamon-control-center: fix tls support in online accounts
$ nix run .#rustc -- --version
rustc 1.75.0 (82e1608df 2023-12-21) (built from a source tarball)
You can also let it use its own nixpkgs (i.e., not including the follows line)…
I am clearly doing something wrong in my flake setup, I even had to fix the cargoHash.
I removed the follows line. It has to work, since the whole idea of flakes is to avoid this kind of issue: https://git.gurkan.in/gurkan/nixos-system-flake/commit/ac0cbb38055e6376f9cedc7e17fabdad5088fdb6
(btw thanks for taking a stab at this)
Hmm, looks a bit too complicated. I think replacing the whole sd-switch = prev.sd-switch… thing with something like
sd-switch = inputs.sd-switch-src.packages.${final.system}.default
may work better.
Yep, it did the trick, thanks!
Now I had:
Error: Error switching
Caused by:
0: Failed to create systemd manager proxy
1: org.freedesktop.DBus.Error.Spawn.ChildExited: Process org.freedesktop.systemd1 exited with status 1
Thanks! That's quite helpful!
I wonder if the systemd user session is actually listening at all on /tmp/dbus-pCtHZePHSo. Typically I think it prefers /run/user/1000/bus. Could you check if there is a socket file at the /run/user/1000/bus path?
Also, could you check if you have dbus-run-session running? Something like ps ax | grep dbus-run-session.
~> ps aux | grep run-session
gurkan 2123 0.0 0.0 3968 1920 ? S Feb20 0:00 /nix/store/qxvy6vc2x65f1lj49pxvdsnc2y4d6772-dbus-1.14.10/bin/dbus-run-session /nix/store/rz4n14d75fghwdf1l4jn5viri6k4yl4h-myAwesome-master/bin/awesome
Also
~> ls -la /run/user/1000/bus
srw-rw-rw- 1 gurkan gurkan 0 Feb 20 20:27 /run/user/1000/bus=
But at the same time:
~> sudo find /tmp -type s -name "dbus*" -exec ls -la {} \;
srwxrwxrwx 1 gurkan gurkan 0 Feb 20 20:27 /tmp/dbus-H9aaJRBiM5
Not sure why this one exists.
I have the same error in my Ubuntu 22.04 WSL, but there the socket file doesn't exist:
ls: cannot access '/run/user/1000/bus': No such file or directory
And there is no dbus-run-session process:
zweili@co-ws-con4:~$ ps aux | grep run-session
zweili 32704 0.0 0.0 4028 2128 pts/2 S+ 17:33 0:00 grep run-session
Thanks for the feedback! Could you (both) try to see where systemd thinks the session bus is? For example, on my system:
$ systemctl --user show-environment | grep DBUS_SESSION_BUS_ADDRESS
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
Another alternative that directly shows the socket that systemd actually has opened:
$ sudo lsof -a -U -p $(pgrep -U $UID 'systemd')
…
systemd 2269 rycee 10u unix 0xffffa263cce7dd80 0t0 7662 /run/user/1000/bus type=STREAM (LISTEN)
…
@Nebucatnetzer Could you double check if your zweili user has user ID 1000?
$ echo $UID
1000
$ echo $XDG_RUNTIME_DIR
/run/user/1000
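For anyone who wants to script the first check, here is a rough Python sketch that pulls DBUS_SESSION_BUS_ADDRESS out of the text printed by systemctl --user show-environment. The function names and the sample HOME values are made up for illustration, and the parsing assumes plain KEY=VALUE lines without shell quoting.

```python
from typing import Dict, Optional


def parse_show_environment(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines such as those printed by
    `systemctl --user show-environment` (assumes no shell quoting)."""
    env: Dict[str, str] = {}
    for line in text.splitlines():
        # Split on the first '=' only; the value itself may contain '='.
        key, sep, value = line.partition("=")
        if sep:
            env[key] = value
    return env


def systemd_bus_address(text: str) -> Optional[str]:
    """Return the session bus address systemd reports, or None if unset."""
    return parse_show_environment(text).get("DBUS_SESSION_BUS_ADDRESS")


# Hypothetical sample input; only the bus address line is from the thread.
sample = "HOME=/home/rycee\nDBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus\n"
print(systemd_bus_address(sample))
# A manager environment without the variable yields None.
print(systemd_bus_address("HOME=/home/zweili\n"))
```

A None result means the user manager has no session bus address in its environment, which is worth comparing against what echo $DBUS_SESSION_BUS_ADDRESS shows in the shell.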
Sure, no problem. The first command didn't show anything, but echoing the variable did work.
❯ systemctl --user show-environment | grep DBUS_SESSION_BUS_ADDRESS
❯ echo $DBUS_SESSION_BUS_ADDRESS
unix:path=/run/user/1000/bus
❯ sudo lsof -a -U -p $(pgrep -U $UID 'systemd')
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 391 zweili 1u unix 0xffff898e0001bfc0 0t0 77567 type=STREAM
systemd 391 zweili 2u unix 0xffff898e0001bfc0 0t0 77567 type=STREAM
systemd 391 zweili 3u unix 0xffff898de0708cc0 0t0 77585 type=DGRAM
systemd 391 zweili 16u unix 0xffff898de064cc80 0t0 79009 /run/user/1000/systemd/notify type=DGRAM
systemd 391 zweili 17u unix 0xffff898de064d0c0 0t0 79010 type=DGRAM
systemd 391 zweili 18u unix 0xffff898dd300a200 0t0 79012 /run/user/1000/systemd/private type=STREAM
systemd 391 zweili 19u unix 0xffff898dd300a640 0t0 79014 type=STREAM
systemd 391 zweili 21u unix 0xffff898de064a200 0t0 79011 type=DGRAM
systemd 391 zweili 22u unix 0xffff898dd3008880 0t0 79024 /run/user/1000/gnupg/S.gpg-agent.ssh type=STREAM
systemd 391 zweili 25u unix 0xffff898dd300aec0 0t0 79028 /run/user/1000/pk-debconf-socket type=STREAM
systemd 391 zweili 26u unix 0xffff898dd3009dc0 0t0 79018 /run/user/1000/gnupg/S.dirmngr type=STREAM
systemd 391 zweili 27u unix 0xffff898dd300b740 0t0 79030 /run/user/1000/snapd-session-agent.socket type=STREAM
systemd 391 zweili 28u unix 0xffff898dd300aa80 0t0 79020 /run/user/1000/gnupg/S.gpg-agent.browser type=STREAM
systemd 391 zweili 29u unix 0xffff898dd300c400 0t0 79026 /run/user/1000/gnupg/S.gpg-agent type=STREAM
systemd 391 zweili 30u unix 0xffff898dd3008cc0 0t0 79022 /run/user/1000/gnupg/S.gpg-agent.extra type=STREAM
❯ echo $UID
1000
~
❯ echo $XDG_RUNTIME_DIR
/run/user/1000/
On my side:
~> systemctl --user show-environment | grep DBUS_SESSION_BUS_ADDRESS
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
And
~> sudo lsof -a -U -p $(pgrep -U $UID 'systemd')
...
systemd 1935 gurkan 28u unix 0xffff9682023c0880 0t0 6812 /run/user/1000/bus type=STREAM (LISTEN)
...
@seqizz Could you try updating the sd-switch flake and see if it makes a difference. I added a commit that makes it prefer /run/user/$UID/bus if it exists, otherwise it uses DBUS_SESSION_BUS_ADDRESS. A bit hacky but it might work.
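To make the selection rule concrete, here is a minimal Python sketch of the "prefer /run/user/$UID/bus if it exists" logic. This is not the actual sd-switch code (which is written in Rust); pick_bus_address and its runtime_root parameter are hypothetical names used only for illustration.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def pick_bus_address(
    uid: int,
    env: Mapping[str, str],
    runtime_root: Path = Path("/run/user"),
) -> Optional[str]:
    """Prefer <runtime_root>/<uid>/bus when it exists, otherwise fall back
    to DBUS_SESSION_BUS_ADDRESS from the given environment."""
    candidate = runtime_root / str(uid) / "bus"
    if candidate.exists():
        return f"unix:path={candidate}"
    # May be None when the variable is not set either.
    return env.get("DBUS_SESSION_BUS_ADDRESS")


# Usage for the current user and process environment.
print(pick_bus_address(os.getuid(), os.environ))
```

The runtime_root parameter is there only so the function can be exercised against a temporary directory instead of the real /run/user.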
@Nebucatnetzer Hmm, that is very interesting. You have
$ echo $DBUS_SESSION_BUS_ADDRESS
unix:path=/run/user/1000/bus
but /run/user/1000/bus does not exist? What does busctl --user --list | grep systemd1 say? Do you have the dbus-user-session package installed? Are you able to run regular systemctl commands, like systemctl --user status?
Yep, no crashes.
...
Creating home file links in /home/gurkan
Activating onFilesChange
Activating reloadSystemd
Starting units: hm-graphical-session.target, tray.target
So, the problem is that home-manager can't grab DBUS_SESSION_BUS_ADDRESS for some reason? We can dig into it if you'd like to (e.g. dump the environment from sd-switch itself) but since the current workaround works, that's fine for me too. Thanks!
@seqizz I think the issue is that you actually have two user D-Bus sessions. One is started at login and located at /run/user/1000/bus. This is the one that is used by systemd. My guess is that if you log in on the Linux console you will get that in DBUS_SESSION_BUS_ADDRESS.
But when your graphical session starts up, it will also start a new D-Bus session using dbus-run-session. This is the one that ends up at /tmp/dbus-pCtHZePHSo and overwrites the "correct" DBUS_SESSION_BUS_ADDRESS. You can see this in the paste you did earlier:
~> ps aux | grep run-session
gurkan 2123 0.0 0.0 3968 1920 ? S Feb20 0:00 /nix/store/qxvy6vc2x65f1lj49pxvdsnc2y4d6772-dbus-1.14.10/bin/dbus-run-session /nix/store/rz4n14d75fghwdf1l4jn5viri6k4yl4h-myAwesome-master/bin/awesome
I think the proper solution is to remove the use of dbus-run-session but for now perhaps the hack I added in sd-switch works. I imagine you are not the only one with this issue.
In principle I think all occurrences of dbus-run-session should be removed from Nixpkgs, except possibly in some test cases.
Edit: To summarize, sd-switch can grab DBUS_SESSION_BUS_ADDRESS just fine, the problem is that the variable lies: systemd is connected to a different D-Bus address.
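A quick way to spot this situation is to compare the socket path in DBUS_SESSION_BUS_ADDRESS with the per-user default. The Python sketch below is only an illustration: both function names are hypothetical, it handles only unix:path=... entries, and it ignores the percent-escaping that D-Bus addresses may use.

```python
from typing import Optional


def unix_socket_path(address: str) -> Optional[str]:
    """Extract the socket path from a D-Bus address such as
    'unix:path=/run/user/1000/bus'. Returns None for other transports."""
    # An address may list several alternatives separated by ';'.
    for entry in address.split(";"):
        transport, _, params = entry.partition(":")
        if transport != "unix":
            continue
        # Parameters are comma-separated key=value pairs, e.g. path=...,guid=...
        for param in params.split(","):
            key, _, value = param.partition("=")
            if key == "path":
                return value
    return None


def looks_like_second_session(address: str, uid: int) -> bool:
    """True when the address points at a unix socket other than
    /run/user/<uid>/bus, as happens with dbus-run-session."""
    path = unix_socket_path(address)
    return path is not None and path != f"/run/user/{uid}/bus"


print(looks_like_second_session("unix:path=/tmp/dbus-pCtHZePHSo", 1000))
print(looks_like_second_session("unix:path=/run/user/1000/bus", 1000))
```

With the two addresses from this thread, the first call reports a mismatch and the second does not.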
@seqizz Could you try updating the sd-switch flake and see if it makes a difference. I added a commit that makes it prefer /run/user/$UID/bus if it exists, otherwise it uses DBUS_SESSION_BUS_ADDRESS. A bit hacky but it might work.
@Nebucatnetzer Hmm, that is very interesting. You have
$ echo $DBUS_SESSION_BUS_ADDRESS
unix:path=/run/user/1000/bus
but /run/user/1000/bus does not exist? What does busctl --user --list | grep systemd1 say? Do you have the dbus-user-session package installed? Are you able to run regular systemctl commands, like systemctl --user status?
zweili@co-ws-con4:~$ busctl --user --list | grep systemd1
Failed to connect to bus: No such file or directory
dbus-user-session is not installed.
Systemd works fine as far as I can tell.
zweili@co-ws-con4:~$ systemctl --user status
● co-ws-con4
State: running
Jobs: 0 queued
Failed: 0 units
Since: Fri 2024-02-23 08:37:23 CET; 1min 28s ago
CGroup: /user.slice/user-1000.slice/user@1000.service
├─app.slice
│ ├─ssh-agent.service
│ │ └─430 /nix/store/9g3y8bvpp39z5f18v80znnbh49vc281a-openssh-9.6p1/bin/ssh-agent -D -a /run/user/1000/ssh-agent
│ └─emacs.service
└─init.scope
@Nebucatnetzer Ok, it seems systemctl and systemd use /run/user/1000/systemd/private to communicate when there is no user D-Bus session available. I think the only way for sd-switch to work on such a system would be to run systemctl commands and parse their output. If you are able to, could you try installing dbus-user-session and see if that helps?
I'm somewhat reluctant to move away from using D-Bus to communicate with systemd since it feels more robust. But maybe parsing systemctl output could serve as a fallback for systems without D-Bus?
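To give an idea of what such a fallback would involve, here is a rough Python sketch that parses the text printed by systemctl --user list-units --plain --no-legend --no-pager. It is not something sd-switch does; the Unit type, the function name, and the sample lines are all hypothetical, and the column layout (UNIT, LOAD, ACTIVE, SUB, DESCRIPTION) is assumed to be stable, which is exactly the fragility that makes D-Bus more attractive.

```python
from typing import List, NamedTuple


class Unit(NamedTuple):
    name: str
    load: str
    active: str
    sub: str
    description: str


def parse_list_units(text: str) -> List[Unit]:
    """Parse whitespace-separated `systemctl list-units` rows.
    Assumes --plain --no-legend output: UNIT LOAD ACTIVE SUB DESCRIPTION."""
    units: List[Unit] = []
    for line in text.splitlines():
        # Split into at most five fields; the description may contain spaces.
        fields = line.split(None, 4)
        if len(fields) < 4:
            continue  # skip blank or malformed rows
        fields += [""] * (5 - len(fields))  # description may be missing
        units.append(Unit(*fields))
    return units


# Hypothetical sample rows, not taken from a real system.
sample = (
    "ssh-agent.service loaded active running SSH agent\n"
    "init.scope loaded active running System and Service Manager\n"
)
for unit in parse_list_units(sample):
    print(unit.name, unit.active, unit.sub)
```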
After installing the package it works fine.
I think the proper solution is to remove the use of dbus-run-session but for now perhaps the hack I added in sd-switch works. I imagine you are not the only one with this issue.
I tested this properly and you're 100% right. I removed "dbus-run-session" from xsession.windowManager.command and everything still worked, with a single D-Bus socket. That was an old mistake of mine, I assume.
Anyway, thanks again for digging into this. And yes, this check will probably rescue other systems from similar multi-bus confusion.
Since this one is linked to other magically-resolved issues and the workaround will be coming with v0.4.0, feel free to close this issue.
Nixpkgs unstable now has sd-switch version 0.4.0, which hopefully resolves this issue. I'll close, please comment if the issue remains.
Issue description
Sorry for another clone of the periodic ghost issue, but after opting in to sd-switch, I see the following at the end of every home-manager switch operation:
It is the somewhat known D-Bus error that you can see in other issues, but there has been no resolution so far and it magically disappeared for those who reported it. On my system it is sadly reproducible.
The problem is that the system has an active D-Bus session (and no other weirdness as far as I can see). Some information from the system (collected from troubleshooting steps of similarly reported issues, e.g. https://github.com/nix-community/home-manager/issues/371):
systemctl --user daemon-reload and systemctl --user list-units
Thanks for any tips!