apache / cloudstack

Apache CloudStack is an open-source Infrastructure as a Service (IaaS) cloud computing platform
https://cloudstack.apache.org/
Apache License 2.0

CheckOnHostCommand: add missing timeout setting #9677

Open rp- opened 2 months ago

rp- commented 2 months ago

Description

The new CheckOnHostCommand constructor did not set a timeout value, so it fell back to the default wait timeout (1800s). On a Linstor cluster this meant it took over 15 minutes for a host to be recognized as down. With a timeout of 20s (the same as in the other constructor), a host is recognized as down after 4-5 minutes.
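To illustrate the fallback described above, here is a minimal, self-contained sketch. It does not reproduce the actual CloudStack classes: `Command`, `CheckOnHostCommand`, `effectiveWait`, `DEFAULT_WAIT_SECONDS`, and the boolean flag name are simplified stand-ins assumed for illustration. Only the 20s and 1800s values come from the description.

```java
// Simplified stand-ins for illustration only, NOT the real CloudStack code.
class CheckOnHostTimeoutSketch {

    // Assumed fallback used when a command sets no timeout (1800s per the description).
    static final int DEFAULT_WAIT_SECONDS = 1800;

    // Hypothetical base command carrying a timeout in seconds; 0 means "not set".
    static abstract class Command {
        private int waitSeconds;

        void setWait(int seconds) {
            this.waitSeconds = seconds;
        }

        int getWait() {
            return waitSeconds;
        }
    }

    // Hypothetical command with two constructors; both now end up with a 20s timeout.
    static final class CheckOnHostCommand extends Command {
        final long hostId;
        final boolean reportFailureIfStorageDown; // hypothetical flag name

        CheckOnHostCommand(long hostId) {
            this(hostId, false);
        }

        CheckOnHostCommand(long hostId, boolean reportFailureIfStorageDown) {
            this.hostId = hostId;
            this.reportFailureIfStorageDown = reportFailureIfStorageDown;
            // Without this call the command would fall back to DEFAULT_WAIT_SECONDS.
            setWait(20);
        }
    }

    // Returns the command's own timeout if set, otherwise the assumed default.
    static int effectiveWait(Command cmd) {
        return cmd.getWait() > 0 ? cmd.getWait() : DEFAULT_WAIT_SECONDS;
    }

    public static void main(String[] args) {
        // A command that sets its own timeout.
        System.out.println(effectiveWait(new CheckOnHostCommand(1L, true))); // prints 20
        // A command that never calls setWait falls back to the default.
        System.out.println(effectiveWait(new Command() { })); // prints 1800
    }
}
```

In this sketch a single constructor sets the timeout and the other delegates to it, so a constructor added later cannot silently miss the setting. Whether the actual patch is structured this way is not stated in the description.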

How Has This Been Tested?

Failover tests (force shutdown of a host) in a Linstor cluster.

How did you try to break this feature and the system with this change?

codecov[bot] commented 2 months ago

Codecov Report

Attention: Patch coverage is 0% with 1 line in your changes missing coverage. Please review.

Project coverage is 15.11%. Comparing base (a0932b0) to head (eca66f8). Report is 13 commits behind head on 4.19.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...n/java/com/cloud/agent/api/CheckOnHostCommand.java | 0.00% | 1 Missing :warning: |

Additional details and impacted files

```diff
@@             Coverage Diff              @@
##               4.19    #9677      +/-   ##
============================================
+ Coverage     15.08%   15.11%   +0.02%
+ Complexity    11192    11190       -2
============================================
  Files          5406     5406
  Lines        473215   473214       -1
  Branches      61680    58585    -3095
============================================
+ Hits          71386    71521     +135
- Misses       393880   393883       +3
+ Partials       7949     7810     -139
```

| [Flag](https://app.codecov.io/gh/apache/cloudstack/pull/9677/flags?src=pr&el=flags&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | Coverage Δ | |
|---|---|---|
| [uitests](https://app.codecov.io/gh/apache/cloudstack/pull/9677/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `4.76% <ø> (+0.46%)` | :arrow_up: |
| [unittests](https://app.codecov.io/gh/apache/cloudstack/pull/9677/flags?src=pr&el=flag&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) | `15.80% <0.00%> (-0.01%)` | :arrow_down: |

Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache#carryforward-flags-in-the-pull-request-comment) to find out more.

weizhouapache commented 2 months ago

@blueorangutan package

sureshanaparti commented 2 months ago

@blueorangutan package

blueorangutan commented 2 months ago

@sureshanaparti a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan commented 2 months ago

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 11163

rohityadavcloud commented 1 month ago

@blueorangutan package

blueorangutan commented 1 month ago

@rohityadavcloud a [SL] Jenkins job has been kicked to build packages. It will be bundled with KVM, XenServer and VMware SystemVM templates. I'll keep you posted as I make progress.

blueorangutan commented 1 month ago

Packaging result [SF]: ✔️ el8 ✔️ el9 ✔️ debian ✔️ suse15. SL-JID 11374

DaanHoogland commented 4 weeks ago

@blueorangutan test

blueorangutan commented 4 weeks ago

@DaanHoogland a [SL] Trillian-Jenkins test job (ol8 mgmt + kvm-ol8) has been kicked to run smoke tests

blueorangutan commented 4 weeks ago

[SF] Trillian test result (tid-11709)
Environment: kvm-ol8 (x2), Advanced Networking with Mgmt server ol8
Total time taken: 43298 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr9677-t11709-kvm-ol8.zip
Smoke tests completed. 133 look OK, 0 have errors, 0 did not run
Only failed and skipped tests results shown below:

| Test | Result | Time (s) | Test File |
|---|---|---|---|