oVirt / ovirt-ansible-collection

Ansible collection with official oVirt modules and roles

self-hosted engine deployment failed because "'elasticsearch_host' is undefined" #708

Open Kariton opened 1 year ago

Kariton commented 1 year ago
SUMMARY

Hello, I am currently facing some challenges with my oVirt deployment and would appreciate your assistance. Last week, on Thursday, I successfully deployed oVirt 4.5.4 using the self-hosted engine deployment, based on the latest oVirt Node NG 4.5.4 image. During the deployment process, I encountered and addressed issue https://github.com/oVirt/ovirt-ansible-collection/issues/695 by applying the suggested workaround, which resolved the problem.

However, upon attempting to redeploy oVirt on the same hardware using a fresh Node NG installation today, I encountered a new issue...

COMPONENT NAME
[root@node1 ~]# cat /etc/os-release
NAME="CentOS Stream"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8.7.2206.0"
VARIANT="oVirt Node 4.5.4"
VARIANT_ID="ovirt-node"
PRETTY_NAME="oVirt Node 4.5.4"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://www.ovirt.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
PLATFORM_ID="platform:el8"
[root@node1 ~]#
STEPS TO REPRODUCE

Nothing special:

[root@node1 ~]# hosted-engine --deploy --4 --ansible-extra-vars=he_pause_before_engine_setup=true

Answer the prompts as necessary, accepting the default where possible.

EXPECTED RESULTS

Successful deployment of self-hosted engine.

ACTUAL RESULTS
[ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Wait for the host to be up]
[ INFO  ] TASK [ovirt.ovirt.hosted_engine_setup : Notify the user about a failure]
[ ERROR ] fatal: [localhost]: FAILED! => {"changed": false, "msg": "Host is not up, please check logs, perhaps also on the engine machine"}

Upon reviewing the logs, I came across the following error message:

ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-1) [b2fe7aa7-bc64-4f4f-8ea6-72e50fc0b905] EVENT_ID: VDS_INSTALL_FAILED(505), Host node1.fqdn.tld installation failed. Task If output plugin is elasticsearch, validate host address is set failed to execute. Please check logs for more details: /var/log/ovirt-engine/host-deploy/ovirt-host-deploy-ansible-20230605190016-node1.fqdn.tld-b2fe7aa7-bc64-4f4f-8ea6-72e50fc0b905.log.

In the ovirt-host-deploy-ansible log I see this:

2023-06-05 19:02:11 CEST - TASK [oVirt.metrics/roles/ovirt_initial_validations : If output plugin is elasticsearch, validate host address is set] ***
2023-06-05 19:02:11 CEST - {
  "uuid" : "f5d43b80-d4bc-4d1e-a236-a13feb5e5044",
  "counter" : 693,
  "stdout" : "fatal: [node1.fqdn.tld]: FAILED! => {\"msg\": \"The conditional check 'elasticsearch_host == None or elasticsearch_host is undefined' failed. The error was: error while evaluating conditional (elasticsearch_host == None or elasticsearch_host is undefined): 'elasticsearch_host' is undefined. 'elasticsearch_host' is undefined\\n\\nThe error appears to be in '/usr/share/ansible/roles/oVirt.metrics/roles/ovirt_initial_validations/tasks/check_logging_collectors.yml': line 4, column 3, but may\\nbe elsewhere in the file depending on the exact syntax problem.\\n\\nThe offending line appears to be:\\n\\n\\n- name: If output plugin is elasticsearch, validate host address is set\\n  ^ here\\n\"}",
  "start_line" : 675,
  "end_line" : 676,
  "runner_ident" : "2f90562f-8efb-4029-b0fe-57380cb5b016",
  "event" : "runner_on_failed",
  "pid" : 105603,
  "created" : "2023-06-05T17:02:11.694307",
  "parent_uuid" : "00163e0a-249f-69de-4a33-00000000069d",
  "event_data" : {
    "playbook" : "ovirt-host-deploy.yml",
    "playbook_uuid" : "f9126d92-9f95-4c1c-b173-d88b0cb91f2b",
    "play" : "all",
    "play_uuid" : "00163e0a-249f-69de-4a33-000000000002",
    "play_pattern" : "all",
    "task" : "If output plugin is elasticsearch, validate host address is set",
    "task_uuid" : "00163e0a-249f-69de-4a33-00000000069d",
    "task_action" : "debug",
    "task_args" : "",
    "task_path" : "/usr/share/ansible/roles/oVirt.metrics/roles/ovirt_initial_validations/tasks/check_logging_collectors.yml:4",
    "role" : "oVirt.metrics/roles/ovirt_initial_validations",
    "host" : "node1.fqdn.tld",
    "remote_addr" : "node1.fqdn.tld",
    "res" : {
      "msg" : "The conditional check 'elasticsearch_host == None or elasticsearch_host is undefined' failed. The error was: error while evaluating conditional (elasticsearch_host == None or elasticsearch_host is undefined): 'elasticsearch_host' is undefined. 'elasticsearch_host' is undefined\n\nThe error appears to be in '/usr/share/ansible/roles/oVirt.metrics/roles/ovirt_initial_validations/tasks/check_logging_collectors.yml': line 4, column 3, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n\n- name: If output plugin is elasticsearch, validate host address is set\n  ^ here\n",
      "_ansible_no_log" : false
    },
    "start" : "2023-06-05T17:02:11.677693",
    "end" : "2023-06-05T17:02:11.694202",
    "duration" : 0.016509,
    "ignore_errors" : null,
    "event_loop" : null,
    "uuid" : "f5d43b80-d4bc-4d1e-a236-a13feb5e5044"
  }
}
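For anyone digging through a long host-deploy log, a small helper can pull out the names of the tasks that failed. This is only a sketch: the function name failed_tasks_in_log is made up for illustration, and it assumes the log keeps the pretty-printed layout shown above, with "event" and "task" each on their own line.

```shell
# Hypothetical helper, not part of oVirt: print the name of every task
# whose runner event is "runner_on_failed" in a host-deploy log.
# Assumes one JSON key per line, as in the log excerpt above.
failed_tasks_in_log() {
    awk '
        # Remember that we are inside a failed event.
        /"event" : "runner_on_failed"/ { failed = 1 }

        # The next "task" key after a failed event carries the task name.
        failed && /^[ \t]*"task" : / {
            line = $0
            sub(/^[ \t]*"task" : "/, "", line)   # strip the key and opening quote
            sub(/",?[ \t]*$/, "", line)          # strip the closing quote and comma
            print line
            failed = 0
        }
    ' "$1"
}
```

On the engine VM this could be called with the path of the host-deploy log named in the engine error message. For the excerpt above it should print the task name "If output plugin is elasticsearch, validate host address is set".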

Possibly related: the packages installed on the engine VM after [ INFO ] TASK [ovirt.ovirt.engine_setup : Update all packages]:

[root@engine ~]# rpm -qa | grep ovirt
python39-ovirt-imageio-common-2.4.7-1.el8.x86_64
ovirt-engine-setup-plugin-ovirt-engine-common-4.5.4-1.el8.noarch
ovirt-engine-restapi-4.5.4-1.el8.noarch
ovirt-engine-setup-plugin-ovirt-engine-4.5.4-1.el8.noarch
ovirt-web-ui-1.9.3-1.el8.noarch
ovirt-vmconsole-proxy-1.0.9-1.el8.noarch
ovirt-cockpit-sso-0.1.4-2.el8.noarch
ovirt-provider-ovn-1.2.36-1.el8.noarch
ovirt-engine-websocket-proxy-4.5.4-1.el8.noarch
ovirt-engine-setup-plugin-imageio-4.5.4-1.el8.noarch
ovirt-engine-wildfly-overlay-24.0.1-1.el8.noarch
ovirt-openvswitch-ovn-common-2.15-4.el8.noarch
ovirt-ansible-collection-3.1.2-1.el8.noarch
python3-ovirt-setup-lib-1.3.3-1.el8.noarch
ovirt-engine-extension-aaa-misc-1.1.1-1.el8.noarch
ovirt-python-openvswitch-2.15-4.el8.noarch
ovirt-engine-dwh-grafana-integration-setup-4.5.7-1.el8.noarch
ovirt-engine-tools-backup-4.5.4-1.el8.noarch
ovirt-engine-backend-4.5.4-1.el8.noarch
ovirt-engine-keycloak-setup-15.0.2-6.el8.noarch
ovirt-engine-vmconsole-proxy-helper-4.5.4-1.el8.noarch
ovirt-engine-extension-aaa-ldap-1.4.6-1.el8.noarch
python3.11-ovirt-engine-sdk4-4.6.2-1.el8.x86_64
ovirt-openvswitch-ovn-2.15-4.el8.noarch
ovirt-vmconsole-1.0.9-1.el8.noarch
ovirt-openvswitch-ovn-central-2.15-4.el8.noarch
ovirt-engine-ui-extensions-1.3.7-1.el8.noarch
python3.11-ovirt-imageio-common-2.5.0-1.el8.x86_64
ovirt-engine-extensions-api-1.0.1-1.el8.noarch
ovirt-engine-extension-aaa-jdbc-1.3.0-1.el8.noarch
ovirt-dependencies-4.5.2-1.el8.noarch
ovirt-engine-setup-base-4.5.4-1.el8.noarch
ovirt-engine-dwh-setup-4.5.7-1.el8.noarch
ovirt-engine-dbscripts-4.5.4-1.el8.noarch
ovirt-engine-setup-4.5.4-1.el8.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-4.5.4-1.el8.noarch
ovirt-engine-webadmin-portal-4.5.4-1.el8.noarch
ovirt-imageio-daemon-2.5.0-1.el8.x86_64
ovirt-engine-metrics-1.6.1-1.el8.noarch
python39-ovirt-engine-sdk4-4.6.0-1.el8.x86_64
python3-ovirt-engine-sdk4-4.6.2-1.el8.x86_64
ovirt-openvswitch-2.15-4.el8.noarch
ovirt-engine-setup-plugin-websocket-proxy-4.5.4-1.el8.noarch
ovirt-engine-setup-plugin-cinderlib-4.5.4-1.el8.noarch
ovirt-engine-4.5.4-1.el8.noarch
centos-release-ovirt45-8.9-1.el8s.noarch
ovirt-imageio-client-2.5.0-1.el8.x86_64
ovirt-engine-wildfly-24.0.1-1.el8.x86_64
ovirt-imageio-common-2.5.0-1.el8.x86_64
python39-ovirt-imageio-client-2.4.7-1.el8.x86_64
python3-ovirt-engine-lib-4.5.4-1.el8.noarch
ovirt-engine-dwh-4.5.7-1.el8.noarch
ovirt-engine-keycloak-15.0.2-6.el8.noarch
ovirt-engine-tools-4.5.4-1.el8.noarch
ovirt-engine-extension-aaa-ldap-setup-1.4.6-1.el8.noarch
python3.11-ovirt-imageio-client-2.5.0-1.el8.x86_64
[root@engine ~]# rpm -qa | grep ansible
ansible-collection-ansible-posix-1.3.0-1.2.el8.noarch
ovirt-ansible-collection-3.1.2-1.el8.noarch
python38-ansible-runner-2.1.3-1.el8.noarch
ansible-collection-ansible-netcommon-2.2.0-3.2.el8.noarch
ansible-runner-2.1.3-1.el8.noarch
ansible-core-2.15.0-1.el8.x86_64
ansible-collection-ansible-utils-2.3.0-2.2.el8.noarch
dacianstremtan commented 1 year ago

I have the same problem on a Rocky 8.8 node.

Node OS:

NAME="Rocky Linux"
VERSION="8.8 (Green Obsidian)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="8.8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="Rocky Linux 8.8 (Green Obsidian)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:8:GA"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2029-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-8"
ROCKY_SUPPORT_PRODUCT_VERSION="8.8"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="8.8"

Engine OS:

NAME="CentOS Stream"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Stream 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugzilla.redhat.com/"
REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux 8"
REDHAT_SUPPORT_PRODUCT_VERSION="CentOS Stream"
[root@vdi-manager etc]# rpm -qa | grep ansible
ansible-collection-ansible-posix-1.3.0-1.2.el8.noarch
ansible-core-2.15.0-1.el8.x86_64
ovirt-ansible-collection-3.1.2-1.el8.noarch
python38-ansible-runner-2.1.3-1.el8.noarch
ansible-collection-ansible-netcommon-2.2.0-3.2.el8.noarch
ansible-runner-2.1.3-1.el8.noarch
ansible-collection-ansible-utils-2.3.0-2.2.el8.noarch
[root@vdi-manager etc]#  rpm -qa | grep ovirt
python39-ovirt-imageio-common-2.4.7-1.el8.x86_64
ovirt-engine-setup-plugin-ovirt-engine-common-4.5.4-1.el8.noarch
ovirt-engine-restapi-4.5.4-1.el8.noarch
ovirt-engine-setup-plugin-ovirt-engine-4.5.4-1.el8.noarch
ovirt-vmconsole-proxy-1.0.9-1.el8.noarch
ovirt-cockpit-sso-0.1.4-2.el8.noarch
ovirt-provider-ovn-1.2.36-1.el8.noarch
ovirt-engine-websocket-proxy-4.5.4-1.el8.noarch
ovirt-engine-setup-plugin-imageio-4.5.4-1.el8.noarch
ovirt-ansible-collection-3.1.2-1.el8.noarch
ovirt-engine-wildfly-overlay-24.0.1-1.el8.noarch
ovirt-openvswitch-ovn-common-2.15-4.el8.noarch
centos-release-ovirt45-8.9-1.el8s.noarch
python3-ovirt-setup-lib-1.3.3-1.el8.noarch
ovirt-imageio-common-2.5.0-1.el8.x86_64
python3-ovirt-engine-sdk4-4.6.2-1.el8.x86_64
ovirt-engine-extension-aaa-misc-1.1.1-1.el8.noarch
ovirt-python-openvswitch-2.15-4.el8.noarch
ovirt-engine-dwh-grafana-integration-setup-4.5.7-1.el8.noarch
ovirt-engine-tools-backup-4.5.4-1.el8.noarch
ovirt-engine-backend-4.5.4-1.el8.noarch
ovirt-engine-keycloak-setup-15.0.2-6.el8.noarch
ovirt-engine-vmconsole-proxy-helper-4.5.4-1.el8.noarch
ovirt-engine-extension-aaa-ldap-1.4.6-1.el8.noarch
python3.11-ovirt-imageio-common-2.5.0-1.el8.x86_64
python3.11-ovirt-imageio-client-2.5.0-1.el8.x86_64
ovirt-openvswitch-ovn-2.15-4.el8.noarch
ovirt-vmconsole-1.0.9-1.el8.noarch
ovirt-openvswitch-ovn-central-2.15-4.el8.noarch
ovirt-imageio-client-2.5.0-1.el8.x86_64
ovirt-web-ui-1.9.3-1.el8.noarch
ovirt-engine-extensions-api-1.0.1-1.el8.noarch
ovirt-engine-extension-aaa-jdbc-1.3.0-1.el8.noarch
ovirt-dependencies-4.5.2-1.el8.noarch
ovirt-engine-setup-base-4.5.4-1.el8.noarch
ovirt-engine-dwh-setup-4.5.7-1.el8.noarch
ovirt-engine-dbscripts-4.5.4-1.el8.noarch
ovirt-engine-setup-4.5.4-1.el8.noarch
ovirt-engine-setup-plugin-vmconsole-proxy-helper-4.5.4-1.el8.noarch
ovirt-engine-webadmin-portal-4.5.4-1.el8.noarch
python3.11-ovirt-engine-sdk4-4.6.2-1.el8.x86_64
ovirt-imageio-daemon-2.5.0-1.el8.x86_64
ovirt-engine-metrics-1.6.1-1.el8.noarch
python39-ovirt-engine-sdk4-4.6.0-1.el8.x86_64
ovirt-openvswitch-2.15-4.el8.noarch
ovirt-engine-setup-plugin-websocket-proxy-4.5.4-1.el8.noarch
ovirt-engine-setup-plugin-cinderlib-4.5.4-1.el8.noarch
ovirt-engine-4.5.4-1.el8.noarch
ovirt-engine-ui-extensions-1.3.7-1.el8.noarch
ovirt-engine-wildfly-24.0.1-1.el8.x86_64
python39-ovirt-imageio-client-2.4.7-1.el8.x86_64
python3-ovirt-engine-lib-4.5.4-1.el8.noarch
ovirt-engine-dwh-4.5.7-1.el8.noarch
ovirt-engine-keycloak-15.0.2-6.el8.noarch
ovirt-engine-tools-4.5.4-1.el8.noarch
ovirt-engine-extension-aaa-ldap-setup-1.4.6-1.el8.noarch
2023-06-06 20:15:39 EDT - TASK [oVirt.metrics/roles/ovirt_initial_validations : If output plugin is elasticsearch, validate host address is set] ***
2023-06-06 20:15:39 EDT - {
  "uuid" : "7c488f3f-17c1-47b6-8e2c-446529a0bfa5",
  "counter" : 677,
  "stdout" : "fatal: [vdi-node1.dev.arc.gwu.edu]: FAILED! => {\"msg\": \"The conditional check 'elasticsearch_host == None or elasticsearch_host is undefined' failed. The error was: error while evaluating conditional (elasticsearch_host == None or elasticsearch_host is undefined): 'elasticsearch_host' is undefined. 'elasticsearch_host' is undefined\\n\\nThe error appears to be in '/usr/share/ansible/roles/oVirt.metrics/roles/ovirt_initial_validations/tasks/check_logging_collectors.yml': line 4, column 3, but may\\nbe elsewhere in the file depending on the exact syntax problem.\\n\\nThe offending line appears to be:\\n\\n\\n- name: If output plugin is elasticsearch, validate host address is set\\n  ^ here\\n\"}",
dacianstremtan commented 1 year ago

I was able to bypass the error like this:

When deploying the hosted engine, set he_pause_host: hosted-engine --deploy --ansible-extra-vars=he_pause_host=true

When the deployment pauses, log in to the HostedEngineLocal VM as root via ssh or virsh console, edit the following file with vi /etc/ovirt-engine-metrics/config.yml.d/config.yml, and add the lines:

collect_ovirt_vdsm_log: false
collect_ovirt_engine_log: false
collect_ovirt_collectd_metrics: false

Then deploy the host from the oVirt UI console. The host deployment no longer gets stuck at the metrics step.
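The manual edit above can also be scripted. A minimal sketch, assuming the three options are plain top-level "key: value" lines in that file; the function names set_metrics_option and disable_metrics_collection are invented for this example and are not part of oVirt.

```shell
# Hypothetical helper: make sure "key: value" appears exactly once in a file,
# dropping any existing top-level line for the same key first.
set_metrics_option() {
    local file="$1" key="$2" value="$3" tmp
    tmp="$(mktemp)"
    if [ -f "$file" ]; then
        # grep -v exits non-zero when nothing is left; that is fine here.
        grep -v "^${key}:" "$file" > "$tmp" || true
    fi
    printf '%s: %s\n' "$key" "$value" >> "$tmp"
    # Write back through redirection so the original file's permissions are kept.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Hypothetical wrapper: turn off the three collectors named in the workaround.
# The default path is the one given in the comment above.
disable_metrics_collection() {
    local file="${1:-/etc/ovirt-engine-metrics/config.yml.d/config.yml}"
    local key
    for key in collect_ovirt_vdsm_log collect_ovirt_engine_log collect_ovirt_collectd_metrics; do
        set_metrics_option "$file" "$key" false
    done
}
```

This would be run as root on the engine VM while the deployment is paused, by calling disable_metrics_collection. Passing a different path as the first argument makes it possible to try it on a copy of the file first.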

Ecsi1337 commented 1 year ago

Currently, the only working solution is to start with both the hosts and the hosted engine on version 4.4.10, then update the hosted engine to 4.5.4 first, and then the hosts. Note that you cannot add a 4.4.10 host to a 4.5.4 hosted engine; everything must start from 4.4.10.

michalskrivanek commented 1 year ago

This was fixed by https://github.com/oVirt/ovirt-engine-metrics/pull/35. Make sure you're using the nightly build.

Kariton commented 1 year ago

I was able to get past this error by setting a versionlock for the preinstalled ansible-core version, e.g.: dnf versionlock ansible-core-0:2.13.5-1.el8.*

This also prevents https://github.com/oVirt/ovirt-ansible-collection/issues/695, so that workaround does not need to be applied.

But good to know that the nightly already has that fixed.

Feel free to close this issue, or leave it open for others. :)

damonzhengfd commented 1 year ago

I was able to get past this error by setting a versionlock for the preinstalled ansible-core version, e.g.: dnf versionlock ansible-core-0:2.13.5-1.el8.*

This also prevents #695, so that workaround does not need to be applied.

But good to know that the nightly already has that fixed.

Feel free to close this issue, or leave it open for others. :)

Tried this and it finally works! Thanks.

serg-ku commented 1 year ago

Downgrading to ansible-core 2.14 also works: dnf downgrade ansible-core-2.14.2-3.el8.x86_64. It looks like something changed in ansible-core 2.15. It also looks like this is already fixed in https://github.com/oVirt/ovirt-engine-metrics/commit/4bad83bb53289a1e39e98a13dba9cf1eb07d8024, but not released yet.

cgoudie commented 1 year ago

Also note that dnf versionlock ansible-core-0:2.13.5-1.el8.* needs to be run on the oVirt engine VM, not on the host.
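To check whether an engine VM is in the affected range before deciding on a lock or downgrade, something like the following could be used. It is a rough sketch: the function names are hypothetical, and the rule that 2.15.x is affected is only an assumption drawn from the reports in this thread (2.15.0 fails, 2.13.5 and 2.14.2 work). It says nothing about engine VMs that already carry the fixed ovirt-engine-metrics package.

```shell
# Hypothetical check: succeed (exit status 0) when the given ansible-core
# version string falls in the range reported as failing in this thread.
# Assumption: every 2.15.x release behaves like the reported 2.15.0.
ansible_core_is_affected() {
    case "$1" in
        2.15|2.15.*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Print the version of the installed ansible-core package, e.g. 2.15.0.
# Only meaningful on an RPM-based system such as the engine VM.
installed_ansible_core_version() {
    rpm -q --queryformat '%{VERSION}\n' ansible-core
}

# Example use on the engine VM (left commented out so that sourcing this
# file only defines the functions):
#   if ansible_core_is_affected "$(installed_ansible_core_version)"; then
#       echo "ansible-core 2.15.x found; see the versionlock/downgrade workarounds"
#   fi
```

The check is deliberately separate from any dnf call, so it changes nothing on the system and can be run before choosing between the versionlock and the downgrade.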

nerdalertdk commented 1 year ago

Why is this not released? I'm having this problem in October 2023, and the bug was found in June.