oVirt / ovirt-ansible-hosted-engine-setup

Apache License 2.0

WIP: Check OVF_STORE volume status: Try 20 minutes #336

Closed didib closed 4 years ago

didib commented 4 years ago

Try 120 times instead of 12 (with 10 seconds delay in between).
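To make the arithmetic behind the title concrete, here is a minimal Python sketch of a retry-until loop. It is not the role's actual task, only a model of the behaviour visible in the CI logs in this thread, where a run with 12 retries prints 12 "FAILED - RETRYING" lines and then a final failure (13 attempts in total). The names `wait_budget_seconds` and `retry_until` are hypothetical and exist only for illustration.

```python
import time
from typing import Callable, Tuple


def wait_budget_seconds(retries: int, delay: float) -> float:
    """Time spent sleeping before giving up (excludes the check's own run time)."""
    return retries * delay


def retry_until(
    check: Callable[[], bool],
    retries: int,
    delay: float,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[bool, int]:
    """Call check() until it returns True, at most retries + 1 times.

    Returns (succeeded, attempts_made). The sleep function is injectable
    so the loop can be exercised without actually waiting.
    """
    attempts = 0
    for remaining in range(retries, -1, -1):
        attempts += 1
        if check():
            return True, attempts
        if remaining > 0:
            # Wait only if another attempt is still allowed.
            sleep(delay)
    return False, attempts


if __name__ == "__main__":
    print(wait_budget_seconds(12, 10))   # old setting: 120 seconds
    print(wait_budget_seconds(120, 10))  # new setting: 1200 seconds (20 minutes)
```

With a 10 second delay, 12 retries give a sleep budget of about 2 minutes, while 120 retries give about 20 minutes. The real elapsed time is slightly longer, because each check also takes time to run.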

hosted-engine deploy has been failing in CI at this point for some time now. I wonder if this is simply because the engine, for some reason, needs more time to update OVF_STORE.

kobihk commented 4 years ago

I had the same problem as I mentioned in: https://github.com/oVirt/ovirt-ansible-hosted-engine-setup/pull/335#issuecomment-652322983

When I tried to run without this patch, it failed, as you can see:

02:30:43 TASK [ovirt.hosted_engine_setup : Check OVF_STORE volume status] ***
02:30:44 FAILED - RETRYING: Check OVF_STORE volume status (12 retries left).
02:30:55 FAILED - RETRYING: Check OVF_STORE volume status (11 retries left).
02:31:06 FAILED - RETRYING: Check OVF_STORE volume status (10 retries left).
02:31:16 FAILED - RETRYING: Check OVF_STORE volume status (9 retries left).
02:31:27 FAILED - RETRYING: Check OVF_STORE volume status (8 retries left).
02:31:38 FAILED - RETRYING: Check OVF_STORE volume status (7 retries left).
02:31:49 FAILED - RETRYING: Check OVF_STORE volume status (6 retries left).
02:32:00 FAILED - RETRYING: Check OVF_STORE volume status (5 retries left).
02:32:11 FAILED - RETRYING: Check OVF_STORE volume status (4 retries left).
02:32:23 FAILED - RETRYING: Check OVF_STORE volume status (3 retries left).
02:32:34 FAILED - RETRYING: Check OVF_STORE volume status (2 retries left).
02:32:45 FAILED - RETRYING: Check OVF_STORE volume status (1 retries left).
02:32:56 failed: [sample.com] ...

With this patch it succeeded, as you can see below:

01:30:47 TASK [ovirt.hosted_engine_setup : Check OVF_STORE volume status] ***
01:30:49 FAILED - RETRYING: Check OVF_STORE volume status (120 retries left).
01:30:59 FAILED - RETRYING: Check OVF_STORE volume status (119 retries left).
01:31:10 FAILED - RETRYING: Check OVF_STORE volume status (118 retries left).
01:31:21 FAILED - RETRYING: Check OVF_STORE volume status (117 retries left).
... ...
01:47:24 FAILED - RETRYING: Check OVF_STORE volume status (29 retries left).
01:47:35 FAILED - RETRYING: Check OVF_STORE volume status (28 retries left).
01:47:46 FAILED - RETRYING: Check OVF_STORE volume status (27 retries left).
01:47:57 changed: [example01.com] => (item={...})
01:47:58 FAILED - RETRYING: Check OVF_STORE volume status (120 retries left).
01:48:09 FAILED - RETRYING: Check OVF_STORE volume status (119 retries left).
01:48:19 FAILED - RETRYING: Check OVF_STORE volume status (118 retries left).
01:48:30 FAILED - RETRYING: Check OVF_STORE volume status (117 retries left).
01:48:41 FAILED - RETRYING: Check OVF_STORE volume status (116 retries left).
... ...
01:51:15 FAILED - RETRYING: Check OVF_STORE volume status (102 retries left).
01:51:26 FAILED - RETRYING: Check OVF_STORE volume status (101 retries left).
01:51:37 FAILED - RETRYING: Check OVF_STORE volume status (100 retries left).
01:51:48 FAILED - RETRYING: Check OVF_STORE volume status (99 retries left).
01:51:58 changed: [example01.com] => (item={...})

IMHO it's better to have a large timeout that fails only in edge cases than a strict timeout that fails from time to time.
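To put numbers on the successful run, here is a rough Python sketch that reads timestamped log lines like the ones quoted in this thread and reports how long each item waited before it reached `changed:`. The helper `item_durations` is hypothetical. It assumes every line starts with an `HH:MM:SS` stamp, that the run does not cross midnight, and that each item's wait starts when the previous item finished.

```python
import re
from datetime import datetime
from typing import List

# Matches "HH:MM:SS <message>" as seen in the CI console output.
LINE_RE = re.compile(r"^(\d{2}:\d{2}:\d{2}) (.*)$")


def item_durations(lines: List[str]) -> List[int]:
    """Return seconds elapsed per item, from task start (or previous item) to 'changed:'."""
    durations: List[int] = []
    start = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip elided or unstamped lines such as "... ..."
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        message = match.group(2)
        if message.startswith("TASK "):
            start = stamp
        elif message.startswith("changed:") and start is not None:
            durations.append(int((stamp - start).total_seconds()))
            start = stamp  # the next item starts where this one finished
    return durations


if __name__ == "__main__":
    sample = [
        "01:30:47 TASK [ovirt.hosted_engine_setup : Check OVF_STORE volume status] ***",
        "01:30:49 FAILED - RETRYING: Check OVF_STORE volume status (120 retries left).",
        "01:47:57 changed: [example01.com] => (item={...})",
        "01:47:58 FAILED - RETRYING: Check OVF_STORE volume status (120 retries left).",
        "01:51:58 changed: [example01.com] => (item={...})",
    ]
    print(item_durations(sample))  # [1030, 241]
```

On the timestamps quoted above, the first item took about 17 minutes and the second about 4 minutes. Both are well beyond the roughly 2 minute budget of the old setting, and the first comes fairly close to the new 20 minute limit.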