Open oati opened 2 years ago
Maybe this is as simple as adding dbus.service
as a dependency to the systemd service declaration in nixos/default.nix
.
Also, if dbus
is a dependency of home-manager
, shouldn't it be setting services.dbus.enable = true
in nixos?
Hi,
Does the error reproduce reliably?
As far as I've tested, it happens on every boot.
This might be a race condition between the (system) home manager service and the user dbus service.
I think I've experienced this issue in the past due to sloppy home.activation
hooks.
If memory serves, it was something like...
home.activation."autorandr" = lib.hm.dag.entryAfter [ "writeBoundary" ] ''
if ${pkgs.systemd}/bin/systemctl --user is-active -q graphical-session.target; then
$DRY_RUN_CMD ${hook} -c || :
fi
'';
but
home.activation."autorandr" = lib.hm.dag.entryAfter [ "writeBoundary" ] ''
if $DRY_RUN_CMD ${pkgs.systemd}/bin/systemctl --user is-active -q graphical-session.target; then
$DRY_RUN_CMD ${hook} -c || :
fi
would not.
This might be a race condition between the (system) home manager service and the user dbus service.
This is indeed the case, hm-set-env
https://github.com/nix-community/home-manager/blob/586ac1fd58d2de10b926ce3d544b3179891e58cb/nixos/default.nix#L66 imports dbus related environment variables from the system user session, thus the user dbus services has to be active before hm get activated.
I have a similar error on unstable where it complains about dbus-daemon
(executable or socket?) not found, instead of no medium during the dconfSettings activation script.
Nov 13 11:11:10 shun systemd[1]: Starting Home Manager environment for mats...
Nov 13 11:11:10 shun hm-activate-mats[703]: Starting Home Manager activation
Nov 13 11:11:10 shun hm-activate-mats[703]: Activating checkFilesChanged
Nov 13 11:11:10 shun hm-activate-mats[703]: Activating checkLinkTargets
Nov 13 11:11:11 shun hm-activate-mats[703]: Activating writeBoundary
Nov 13 11:11:11 shun hm-activate-mats[703]: Activating installPackages
Nov 13 11:11:11 shun hm-activate-mats[703]: Activating dconfSettings
Nov 13 11:11:11 shun hm-activate-mats[1104]: dbus-run-session: failed to execute message bus daemon 'dbus-daemon': No such file or directory
Nov 13 11:11:11 shun hm-activate-mats[1102]: dbus-run-session: EOF reading address from bus daemon
Nov 13 11:11:11 shun systemd[1]: home-manager-mats.service: Main process exited, code=exited, status=127/n/a
Nov 13 11:11:11 shun systemd[1]: home-manager-mats.service: Failed with result 'exit-code'.
Nov 13 11:11:11 shun systemd[1]: Failed to start Home Manager environment for mats.
I do have systemd in initrd enabled to allow TPM based LUKS unlocking which may be related to the issue.
I also get failed to execute message bus daemon 'dbus-daemon': No such file or directory
. It happens during a nixos-rebuild switch
for the other user on my laptop. I use encryption but it's not automatically unlocked via TPM.
I get the exact same error as @Myaats on boot, but the service can be started manually after. So it sounds like a race condition.
I think this PR might be related, and could solve the problem: https://github.com/nix-community/home-manager/pull/3405
The race is inherent as dbus is started by the respective systemd user session, while home-manager activation is not.
Thank you for your contribution! I marked this issue as stale due to inactivity. Please be considerate of people watching this issue and receiving notifications before commenting 'I have this issue too'. We welcome additional information that will help resolve this issue. Please read the relevant sections below before commenting.
* If this is resolved, please consider closing it so that the maintainers know not to focus on this. * If this might still be an issue, but you are not interested in promoting its resolution, please consider closing it while encouraging others to take over and reopen an issue if they care enough. * If you know how to solve the issue, please consider submitting a Pull Request that addresses this issue.
* If you are also experiencing this issue, please add details of your situation to help with the debugging process. * If you know how to solve the issue, please consider submitting a Pull Request that addresses this issue.
Don't be afraid to manually close an issue, even if it holds valuable information. Closed issues stay in the system for people to search, read, cross-reference, or even reopen – nothing is lost! Closing obsolete issues is an important way to help maintainers focus their time and effort.
I am getting the same error as the original issue on NixOS:
> journalctl --boot --no-pager --unit home-manager-mycoolusername.service
Mar 19 14:08:09 nixosmachine systemd[1]: Starting Home Manager environment for mycoolusername...
Mar 19 14:08:09 nixosmachine hm-activate-mycoolusername[1052]: Starting Home Manager activation
Mar 19 14:08:10 nixosmachine hm-activate-mycoolusername[1052]: Activating checkFilesChanged
Mar 19 14:08:10 nixosmachine hm-activate-mycoolusername[1052]: Activating checkLinkTargets
Mar 19 14:08:10 nixosmachine hm-activate-mycoolusername[1052]: Activating unmountPersistentStoragePaths
Mar 19 14:08:10 nixosmachine hm-activate-mycoolusername[1052]: Activating createAndMountPersistentStoragePaths
Mar 19 14:08:10 nixosmachine hm-activate-mycoolusername[1052]: Activating writeBoundary
Mar 19 14:08:10 nixosmachine hm-activate-mycoolusername[1052]: Activating createGpgHomedir
Mar 19 14:08:10 nixosmachine hm-activate-mycoolusername[1052]: Activating linkGeneration
Mar 19 14:08:10 nixosmachine hm-activate-mycoolusername[1052]: Creating profile generation 1
Mar 19 14:08:10 nixosmachine hm-activate-mycoolusername[1052]: Creating home file links in /home/mycoolusername
Mar 19 14:08:10 nixosmachine hm-activate-mycoolusername[1052]: Activating batCache
Mar 19 14:08:10 nixosmachine hm-activate-mycoolusername[1717]: No themes were found in '/home/mycoolusername/.config/bat/themes', using the default set
Mar 19 14:08:10 nixosmachine hm-activate-mycoolusername[1717]: No syntaxes were found in '/home/mycoolusername/.config/bat/syntaxes', using the default set.
Mar 19 14:08:11 nixosmachine hm-activate-mycoolusername[1717]: Writing theme set to /home/mycoolusername/.cache/bat/themes.bin ... okay
Mar 19 14:08:11 nixosmachine hm-activate-mycoolusername[1717]: Writing syntax set to /home/mycoolusername/.cache/bat/syntaxes.bin ... okay
Mar 19 14:08:11 nixosmachine hm-activate-mycoolusername[1717]: Writing metadata to folder /home/mycoolusername/.cache/bat ... okay
Mar 19 14:08:11 nixosmachine hm-activate-mycoolusername[1052]: Activating createXdgUserDirectories
Mar 19 14:08:11 nixosmachine hm-activate-mycoolusername[1052]: Activating installPackages
Mar 19 14:08:11 nixosmachine hm-activate-mycoolusername[1052]: Activating onFilesChange
Mar 19 14:08:11 nixosmachine systemctl[1779]: Failed to connect to bus: No medium found
Mar 19 14:08:11 nixosmachine systemd[1]: home-manager-mycoolusername.service: Main process exited, code=exited, status=1/FAILURE
Mar 19 14:08:11 nixosmachine systemd[1]: home-manager-mycoolusername.service: Failed with result 'exit-code'.
Mar 19 14:08:11 nixosmachine systemd[1]: Failed to start Home Manager environment for mycoolusername.
Mar 19 14:08:11 nixosmachine systemd[1]: home-manager-mycoolusername.service: Consumed 1.514s CPU time, no IP traffic.
My configuration is similar to that of @oati and the symptoms are the same:
nixpkgs-unstable
. NixOS and Home Manager is on stateVersion = 24.05
sudo systemctl restart home-manager-mycoolusername.service
) with no error./home
are on tmpfs.Other notes:
install-iso
format from nixos-generators to make a live environment (which also has root and home on tmpfs)It does look like a race of some sort, which becomes deterministic with root on tmpfs for some reason. The error still shows up when a very long sleep is inserted right after onFilesChange
in Home Manager's activation script:
# Same error appears with this in the user's Home Manger configuration
home.activation = {
sleepyTime = inputs.home-manager.lib.hm.dag.entryBetween ["onFilesChange"] ["installPackages"] ''
echo "Sleeping..."
sleep 100
'';
};
The race is inherent as dbus is started by the respective systemd user session, while home-manager activation is not.
If this is the case, the contents of the onFilesChange
section in the Home Manager activation script (a systemd system service) would need to be run after the user's DBus service is ready (a systemd user service). A straightforward solution would be to move onFilesChange
logic to its own systemd user service, and set it to run after the user DBus service, but is that possible or desired? I am unsure if that change might have side effects or break something else.
I'm able to reproduce this error using a normal installation (without tmpfs). I'm using dbus-broker as the dbus implementation. I also have apparmor enabled and I'm setting services.dbus.apparmor = "enabled"
.
Doing some more digging and testing for my specific issue, I found the lines in my Home Manager configuration that causes the Failed to connect to bus: No medium found
error:
xdg.configFile."fnott/fnott.ini" = {
onChange = "${inputs.nixpkgs.systemd}/bin/systemctl --user restart fnott";
};
xdg.configFile."path/to/home-manager-managed/file".onChange = "${nixpkgs.systemd}/bin/systemctl --user <any subcommand>"
causes the error in my config. The issue is in the Home Manager activation script and the effects of three sections inside:
checkFilesChanged
always marks $HOME/fnott/fnott.ini
as changed, because the machine has impertinent /home
and $HOME/fnott/fnott.ini
does not exist yet. This is important in a later step.Relevent code in the Home Manager activation script:
_iNote "Activating %s" "checkFilesChanged"
function _cmp() {
if [[ -d $1 && -d $2 ]]; then
diff -rq "$1" "$2" &> /dev/null
else
cmp --quiet "$1" "$2"
fi
}
declare -A changedFiles
_cmp '/nix/store/zgk3xkxn2pqqdwf1v18a7l89ys0fxi5r-fnott.ini' '/home/mycoolusername'/'.config/fnott/fnott.ini' \
&& changedFiles['.config/fnott/fnott.ini']=0 \
|| changedFiles['.config/fnott/fnott.ini']=1
unset -f _cmp
The important line is changedFiles['.config/fnott/fnott.ini']=1
which always runs.
$HOME/fnott/fnott.ini
is actually created later, in DAG entry linkGeneration
, but this is too late because the file is already marked as changed.
Later, in DAG entry onFilesChange
, the systemctl --user
command runs and fails because the systemd user session has not started yet. Note that the systemctl --user
command is inside an if condition ${changedFiles['.config/fnott/fnott.ini']} == 1
, but that condition is always true because of changedFiles['.config/fnott/fnott.ini']=1
from DAG entry checkFilesChanged
earlier.
_iNote "Activating %s" "onFilesChange"
if (( ${changedFiles['.config/fnott/fnott.ini']} == 1 )); then
if [[ -v DRY_RUN || -v VERBOSE ]]; then
echo "Running onChange hook for" '.config/fnott/fnott.ini'
fi
if [[ ! -v DRY_RUN ]]; then
/nix/store/4npvfi1zh3igsgglxqzwg0w7m2h7sr9b-systemd-255.4/bin/systemctl --user restart fnott
fi
fi
Two possible solutions:
systemctl --user
command only if the systemd user daemon is running. This does mean the command will never run at boot, even if the fnott config file changed for any reason, but this is perfectly fine because fnott has not even started yet, and fnott will read the config file when it does start later. Simple enough to implement, but feels like a workaround because the second solution is better: # The commands that check whether the systemd user daemon is running are inspired by https://github.com/nix-community/home-manager/blob/057117a401a34259c9615ce62218aea7afdee4d3/modules/systemd.nix#L343-L350
xdg.configFile."fnott/fnott.ini" = {
onChange = let
ensureRuntimeDir = "XDG_RUNTIME_DIR=\${XDG_RUNTIME_DIR:-/run/user/$(id -u)}";
systemctlUserCmd = "${ensureRuntimeDir} ${inputs.nixpkgs.systemd}/bin/systemctl --user";
realCmd = "${systemctlUserCmd} restart fnott";
in ''
systemdUserStatus=$(${systemctlUserCmd} is-system-running 2>&1 || true)
if [[ $systemdUserStatus == 'running' || $systemdUserStatus == 'degraded' ]]; then
${realCmd}
else
echo "systemd user session not running; skipping fnott config reload"
fi
'';
};
onFilesChange
logic to a systemd user service. The new user service will certainly activate after the user systemd daemon is running, and can be set to activate after all impermanence files/folders are linked. This is the better solution, attacking closer to the root cause, but is more complicated to implement. This issue has discussion about doing this, and this pull request looks promising.I'm also experience this issue but only use the xdg-utils as a package without extra configuration. Manual restarting the service works without a problem.
Version: NixOS 24.11 unstable
$ journalctl -eu home-manager-usyer.service
Jun 30 17:55:54 latopler hm-activate-usyer[1172]: Starting Home Manager activation
Jun 30 17:55:54 latopler hm-activate-usyer[1172]: Activating checkFilesChanged
Jun 30 17:55:54 latopler hm-activate-usyer[1172]: Activating checkLinkTargets
Jun 30 17:55:54 latopler hm-activate-usyer[1172]: Activating writeBoundary
Jun 30 17:55:54 latopler hm-activate-usyer[1172]: Activating installPackages
Jun 30 17:55:54 latopler hm-activate-usyer[1172]: Activating dconfSettings
Jun 30 17:55:55 latopler hm-activate-usyer[1633]: dbus-daemon[1633]: [session uid=1000 pid=1633] Activating service name='ca.desrt.dconf' requested by ':1.0' (uid=1000 pid=1637 comm="/nix/store/l93v1f908siryjwp0s5b>
Jun 30 17:55:55 latopler hm-activate-usyer[1633]: dbus-daemon[1633]: [session uid=1000 pid=1633] Successfully activated service 'ca.desrt.dconf'
Jun 30 17:55:55 latopler hm-activate-usyer[1172]: Activating linkGeneration
Jun 30 17:55:55 latopler hm-activate-usyer[1172]: Cleaning up orphan links from /home/usyer
Jun 30 17:55:55 latopler hm-activate-usyer[1172]: No change so reusing latest profile generation 276
Jun 30 17:55:55 latopler hm-activate-usyer[1172]: Creating home file links in /home/usyer
Jun 30 17:55:52 latopler systemd[1]: Starting Home Manager environment for usyer...
Jun 30 17:55:55 latopler hm-activate-usyer[1172]: Activating onFilesChange
Jun 30 17:55:55 latopler hm-activate-usyer[1172]: Activating reloadSystemd
Jun 30 17:55:55 latopler hm-activate-usyer[1172]: User systemd daemon not running. Skipping reload.
Jun 30 17:55:55 latopler hm-activate-usyer[1172]: Activating setupEtc
Jun 30 17:55:55 latopler systemctl[2275]: Failed to connect to bus: No medium found
Jun 30 17:55:55 latopler systemd[1]: home-manager-usyer.service: Main process exited, code=exited, status=1/FAILURE
Jun 30 17:55:55 latopler systemd[1]: home-manager-usyer.service: Failed with result 'exit-code'.
Jun 30 17:55:55 latopler systemd[1]: Failed to start Home Manager environment for usyer.
Jun 30 17:55:55 latopler systemd[1]: home-manager-usyer.service: Consumed 1.175s CPU time, no IP traffic.
Are you following the right branch?
Is there an existing issue for this?
Issue description
I started noticing that home manager was failing on my boot screen. Restarting the service does not result in any failures.
I'm running tmpfs on root, which might be a part of why it fails every time.
Maintainer CC
No response
System information