oxidecomputer / propolis

VMM userspace for illumos bhyve

phd: lspci_lifecycle_test passes incorrectly #792

Open iximeow opened 1 month ago

iximeow commented 1 month ago

disclaimer: this is because of a framework bug, not a Propolis bug. i noticed that lspci_lifecycle_test fails on an Ubuntu 22.04 guest image i'd put together, and at first thought it was wrong, but the truth is stranger than fiction...

in lspci_lifecycle_test we run both lspci and lshw: https://github.com/oxidecomputer/propolis/blob/93ed767388c4fab11af8a98ad33fdbeac4098b0c/phd-tests/tests/src/hw.rs#L21-L29

on an Ubuntu guest, the lshw assert fails because the before and after outputs don't match. the only difference in the (rather large) strings of output is that the machine's serial does not match after being stopped and started. i double-checked on a real instance: a Debian 11 guest's observed value for serial is in fact the instance's ID, and that ID is stable across a stop and start. again, test bug not real bug.

the immediate issue in the test framework is that the test validates that lshw and lspci output is unchanged across a StopAndStart, but that action involves spawning a successor VM, which makes a new TestVm that in turn gets a new id.
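to make the shape of the problem concrete, here is a toy model of it. this is only a sketch: SketchVm, its counter-based ids, and both successor methods are hypothetical stand-ins made up for illustration, not the PHD framework's actual types or API.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// hypothetical id source; the real framework assigns ids its own way
static NEXT_ID: AtomicU64 = AtomicU64::new(1);

// hypothetical stand-in for a test VM handle
struct SketchVm {
    id: u64,
}

impl SketchVm {
    // every freshly constructed VM gets a fresh id
    fn new() -> Self {
        SketchVm { id: NEXT_ID.fetch_add(1, Ordering::Relaxed) }
    }

    // models the behavior described above: the successor is built like
    // a brand-new VM, so it does not inherit the predecessor's id
    fn spawn_successor(&self) -> Self {
        SketchVm::new()
    }

    // models one possible alternative: carry the id over to the successor
    fn spawn_successor_keeping_id(&self) -> Self {
        SketchVm { id: self.id }
    }

    // assumption for this sketch: the guest-visible serial is derived
    // from the VM id
    fn serial(&self) -> String {
        format!("serial-{}", self.id)
    }
}
```

in this model, a before/after comparison of the serial can only succeed with the id-preserving variant. whether the right fix is in how successors are created or in what the test compares is a separate question.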

why in the world did this pass with Alpine or Debian images though? i'm glad you've asked!

```
# on Alpine:
localhost:~# lshw
-ash: lshw: not found
localhost:~#

# or on a different Alpine:
localhost:~# sudo lspci -vvx
-sh: sudo: not found
localhost:~#

# on Debian:
root@debian:~# lshw
-bash: lshw: command not found
root@debian:~#
```

assert_eq!("-ash: lshw: not found", "-ash: lshw: not found"), or the equivalent error from bash, will pass every time :) this Ubuntu image is the only one where i have lshw out of the box, which seems to be why it only fails there.
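one way a test could avoid passing vacuously like this is to reject output that looks like a shell error before comparing. a minimal sketch, assuming the test only has the captured output text to work with: both function names are hypothetical, and the "not found" match is a heuristic based on the three shell messages quoted above, not something the framework provides.

```rust
// hypothetical helper: true when any line of the output ends with
// "not found", as in the ash, sh, and bash errors quoted above
fn looks_like_missing_command(output: &str) -> bool {
    output
        .lines()
        .any(|line| line.trim_end().ends_with("not found"))
}

// hypothetical check: fail if either capture looks like a missing
// command, and only then compare the before and after outputs
fn check_stable_output(before: &str, after: &str) -> Result<(), String> {
    for output in [before, after] {
        if looks_like_missing_command(output) {
            return Err(format!("command did not run: {}", output));
        }
    }
    if before != after {
        return Err("output changed across stop and start".to_string());
    }
    Ok(())
}
```

with this, two identical "-ash: lshw: not found" captures return an Err instead of comparing equal. the heuristic could misfire on real tool output that happens to end a line with "not found", so checking the command's exit status would likely be more robust if the framework can get at it.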

iximeow commented 1 month ago

... realized this morning that the immediate failure i'd observed above was because my Ubuntu guest adapter included some changes i was also going to propose: use passwordless root instead of the ubuntu user there. with an image configured more like the in-tree PHD adapter expects, the test just hangs at [sudo] password for ubuntu: and fails with a timeout instead :(