openconfig / featureprofiles

Feature Profiles are groups of OpenConfig paths and tests which verify their behavior

gNOI-4.1: Software Upgrade does not wait for Supervisor Sync post reboot #770

Closed Hobbydos closed 1 year ago

Hobbydos commented 1 year ago

Describe the bug

After rebootDUT returns, the test calls verifyInstall() without waiting for supervisor synchronization to complete. As a result, the DUT returns an empty string as the OS version for the standby supervisor. There should be some mechanism to wait for the synchronization to finish before verifyInstall() is attempted.

As a brute-force workaround, I added a time.Sleep(5 * time.Minute) after the reboot, and the test then passed.

To Reproduce

Steps to reproduce the behavior:

  1. Set up the gNOI server on the DUT
  2. Run the OSInstall script on a dual-supervisor DUT
  3. Observe the got/want mismatch in the output, where got is ""

Expected behavior

The test should wait for the standby supervisor to be up and synchronized before attempting to fetch the version.

Additional context

Once the sleep was added, the test passed:

    osinstall_test.go:329: Transfer progress: 1185873920 bytes received by DUT
    osinstall_test.go:248: OS.Install supervisor image transfer complete.
    osinstall_test.go:116: OS.Activate complete.
    osinstall_test.go:214: DUT standby supervisor has valid preexisting image; skipping transfer.
    osinstall_test.go:114: OS.Activate standby supervisor complete.
    osinstall_test.go:153: Send DUT Reboot Request
    osinstall_test.go:174: DUT has not rebooted.
    osinstall_test.go:180: Waiting for reboot to complete...
    osinstall_test.go:174: DUT has not rebooted.
    osinstall_test.go:180: Waiting for reboot to complete...
    osinstall_test.go:180: Waiting for reboot to complete...
    osinstall_test.go:171: Reboot completed.
    osinstall_test.go:96: Waiting for Supervisors to Synchronize - 5 Minutes <----- t.Logf added with the 5 minute timer after reboot
    osinstall_test.go:277: OS.Verify complete
    --- PASS: TestOSInstall (1046.85s)
    PASS

Hobbydos commented 1 year ago

cc @robshakir @xw-g

xw-g commented 1 year ago

I see that rebootDUT already includes a 10m wait. If you want the brute-force approach, you could just add the extra time there.

Another way (maybe better) would be to call verifyInstall continuously and only report failure after, say, 15 minutes.

What do you think?

Hobbydos commented 1 year ago

@xw-g I'm working on a PR for this and will raise it as soon as it's ready.

Hobbydos commented 1 year ago

PR835 was merged - closing issue