NixOS / nixpkgs

NixOS tests waitForUnit does not handle oneshots #62155

Open arianvp opened 5 years ago

arianvp commented 5 years ago

Issue description

$machine->waitForUnit(<oneshot>) fails when waiting for a unit that doesn't linger after activating. I need a way to test that a oneshot fired, but I currently can't do that.

Steps to reproduce

The following test https://github.com/NixOS/nixpkgs/commit/55fb87d2b871db9f4bcb0e79651acf0228c9de52

fails with:

machine# [    7.379766] hello[606]: Hello, world!
(sic)
machine# [    8.686200] systemd[1]: Reached target Multi-User System.
machine# [    8.690688] systemd[1]: Startup finished in 3.896s (kernel) + 4.760s (userspace) = 8.656s.
machine: running command: systemctl --no-pager show "hello.service"

machine: running command: systemctl --no-pager show "hello.service"
machine: exit status 0
(0.05 seconds)
machine: running command: systemctl list-jobs --full 2>&1
machine: exit status 0
(0.03 seconds)
machine: running command: systemctl --no-pager show "hello.service"
machine: exit status 0
(0.03 seconds)
(10.65 seconds)
error: unit ‘hello.service’ is inactive and there are no pending jobs
(10.65 seconds)
unit ‘hello.service’ is inactive and there are no pending jobs
cleaning up
killing machine (pid 597)
(0.00 seconds)
vde_switch: EOF on stdin, cleaning up and exiting
vde_switch: Could not remove ctl dir '/build/vde1.ctl': Directory not empty
builder for '/nix/store/x1vijpwc2pqdmx2rpssmviifafxjb65d-vm-test-run-oneshot.drv' failed with exit code 255
error: build of '/nix/store/x1vijpwc2pqdmx2rpssmviifafxjb65d-vm-test-run-oneshot.drv' failed

even though the oneshot did fire.

Technical details

Please run nix-shell -p nix-info --run "nix-info -m" and paste the results.

arianvp commented 5 years ago

This happens because oneshots go from inactive -> activating -> inactive and never reach the active state unless RemainAfterExit=true is set.
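To illustrate the transition described above, here is a simplified model of a polling wait. It is only a sketch, not the test driver's actual code: the function name `wait_for_unit` and the shape of the observations are assumptions, and the failure message is modelled on the error in the log above.

```python
from typing import Iterable, Tuple


def wait_for_unit(observations: Iterable[Tuple[str, bool]]) -> str:
    """Simplified, hypothetical model of a polling wait.

    Each observation stands for one poll: (ActiveState, has_pending_jobs).
    """
    for state, has_pending_jobs in observations:
        if state == "active":
            return "ok"
        if state == "failed":
            return "error: unit failed"
        if state == "inactive" and not has_pending_jobs:
            # The poller cannot tell "never started" from "already finished".
            return "error: unit is inactive and there are no pending jobs"
    return "error: timed out"


# A oneshot with RemainAfterExit=true stays "active" after its process exits.
lingering_polls = [("activating", True), ("active", False)]
# A plain oneshot that finished before the first poll is already "inactive".
oneshot_polls = [("inactive", False)]

print(wait_for_unit(lingering_polls))
print(wait_for_unit(oneshot_polls))
```

In this model the second call fails even though the oneshot ran, because "inactive with no pending jobs" looks the same before and after a successful run.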

It is hard to capture the activating -> inactive transition by just polling the CLI. A more robust solution would be to use the D-Bus API to listen for the appropriate unit events, so that we can reliably capture changes in unit state. But that would mean a substantial rewrite of our testing infrastructure. Maybe someone knows something simpler?
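The difference between polling and listening can be shown with a small simulation. This does not talk to systemd or D-Bus; the timeline, the poll interval, and the function names are all made up for illustration.

```python
from typing import List, Tuple

# Hypothetical (time in seconds, ActiveState) transitions of a short oneshot.
TIMELINE: List[Tuple[float, str]] = [
    (0.0, "inactive"),
    (7.3, "activating"),
    (7.4, "inactive"),
]


def state_at(t: float) -> str:
    """Return the state in effect at time t."""
    current = TIMELINE[0][1]
    for when, state in TIMELINE:
        if when <= t:
            current = state
    return current


def poll(interval: float, steps: int) -> List[str]:
    """Sample the state every `interval` seconds, as a polling loop would."""
    return [state_at(i * interval) for i in range(steps + 1)]


def listen() -> List[str]:
    """An event listener is handed every transition, in order."""
    return [state for _, state in TIMELINE]


polled = poll(interval=1.0, steps=10)
events = listen()
print("activating" in polled)  # False: the 0.1 s window falls between polls
print("activating" in events)  # True: no transition is skipped
```

With a one-second poll interval the short "activating" window is never sampled, while an event stream delivers every transition regardless of how brief it is.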

stale[bot] commented 4 years ago

Thank you for your contributions.

This has been automatically marked as stale because it has had no activity for 180 days.

If this is still important to you, we ask that you leave a comment below. Your comment can be as simple as "still important to me". This lets people see that at least one person still cares about this. Someone will have to do this at most twice a year if there is no other activity.

Here are suggestions that might help resolve this more quickly:

  1. Search for maintainers and people that previously touched the related code and @ mention them in a comment.
  2. Ask on the NixOS Discourse.
  3. Ask on the #nixos channel on irc.freenode.net.

andir commented 4 years ago

Just to please the stale bot gods: I just encountered the same error. We should still consider fixing it / coming up with a solution for it.

flokli commented 4 years ago

I made an attempt at this during last year's NixCon.

The main idea was to somehow forward the D-Bus system socket out of the VM, so that it's accessible from the test runner, and to have something watch for events instead of polling.

This never really went anywhere, but if someone is up for experimenting with this, please speak up :-)

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.

flokli commented 3 years ago

still relevant.

stale[bot] commented 3 years ago

I marked this as stale due to inactivity.

danielfullmer commented 1 year ago

Still relevant

layus commented 9 months ago

Very, very relevant. Do we have any other way of fixing this?

Listening on D-Bus is one way to fix it. Maybe looking at journalctl?
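A rough sketch of the journal idea, assuming the oneshot writes a recognizable line under its own identifier (as hello does in the log above). The helper name `identifier_logged` is hypothetical; inside the VM the lines could come from something like `journalctl -u hello.service`. Note that this only shows the unit's process ran and logged, not that it exited successfully.

```python
import re
from typing import Iterable


def identifier_logged(lines: Iterable[str], identifier: str, message: str) -> bool:
    """Return True if any line ends with '<identifier>[<pid>]: <message>'."""
    pattern = re.compile(
        r"\b" + re.escape(identifier) + r"\[\d+\]: " + re.escape(message) + r"\s*$"
    )
    return any(pattern.search(line) for line in lines)


# Sample lines copied from the console output in the issue description.
console = [
    "machine# [    7.379766] hello[606]: Hello, world!",
    "machine# [    8.686200] systemd[1]: Reached target Multi-User System.",
]

print(identifier_logged(console, "hello", "Hello, world!"))  # True
print(identifier_logged(console, "hello", "Goodbye"))        # False
```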