oVirt / ovirt-ansible-collection

Ansible collection with official oVirt modules and roles

Fails to add/deploy new host on non default cluster. Failing when configuring OVN for oVirt #743

Open SkullKill opened 6 months ago

SkullKill commented 6 months ago
SUMMARY

Unable to add/deploy a new virtual host when not using the default cluster. The deployment fails at the "Configure OVN for oVirt" task.

It looks like other people have hit the same issue: https://www.mail-archive.com/users@ovirt.org/msg73109.html

This might be related too: https://www.mail-archive.com/users@ovirt.org/msg72728.html

The bug was introduced when /usr/share/ovirt-engine/ansible-runner-service-project/project/roles/ovirt-provider-ovn-driver/tasks/configure.yml was modified; see https://github.com/oVirt/ovirt-ansible-collection/issues/695

COMPONENT NAME

Software Version: 4.5.6-1.el8

STEPS TO REPRODUCE
Compute > Hosts > Add, then select a host cluster that is not Default.
Alternatively, select a host whose installation failed and click Reinstall.
EXPECTED RESULTS

host is added into oVirt

ACTUAL RESULTS

2024-03-16 10:56:19 AWST - {
  "uuid" : "b7deae43-0a0c-4677-99e8-67605531ec3e",
  "counter" : 412,
  "stdout" : "fatal: [vh-1.home.skaccess.com]: FAILED! => {\"changed\": true, \"cmd\": [\"vdsm-tool\", \"ovn-config\", \"192.168.33.31\", \"vh-1.home.skaccess.com\"], \"delta\": \"0:00:02.261308\", \"end\": \"2024-03-16 10:56:16.457356\", \"msg\": \"non-zero return code\", \"rc\": 1, \"start\": \"2024-03-16 10:56:14.196048\", \"stderr\": \"Traceback (most recent call last):\\n  File \\\"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\\\", line 117, in get_network\\n    return networks[net_name]\\nKeyError: 'vh-1.home.skaccess.com'\\n\\nDuring handling of the above exception, another exception occurred:\\n\\nTraceback (most recent call last):\\n  File \\\"/usr/bin/vdsm-tool\\\", line 195, in main\\n    return tool_command[cmd][\\\"command\\\"](*args)\\n  File \\\"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\\\", line 63, in ovn_config\\n    ip_address = get_ip_addr(get_network(network_caps(), net_name))\\n  File \\\"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\\\", line 119, in get_network\\n    raise NetworkNotFoundError(net_name)\\nvdsm.tool.ovn_config.NetworkNotFoundError: vh-1.home.skaccess.com\", \"stderr_lines\": [\"Traceback (most recent call last):\", \"  File \\\"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\\\", line 117, in get_network\", \"    return networks[net_name]\", \"KeyError: 'vh-1.home.skaccess.com'\", \"\", \"During handling of the above exception, another exception occurred:\", \"\", \"Traceback (most recent call last):\", \"  File \\\"/usr/bin/vdsm-tool\\\", line 195, in main\", \"    return tool_command[cmd][\\\"command\\\"](*args)\", \"  File \\\"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\\\", line 63, in ovn_config\", \"    ip_address = get_ip_addr(get_network(network_caps(), net_name))\", \"  File \\\"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\\\", line 119, in get_network\", \"    raise NetworkNotFoundError(net_name)\", \"vdsm.tool.ovn_config.NetworkNotFoundError: 
vh-1.home.skaccess.com\"], \"stdout\": \"\", \"stdout_lines\": []}",
  "start_line" : 415,
  "end_line" : 416,
  "runner_ident" : "89771868-e051-4efe-8aae-2644aa78422d",
  "event" : "runner_on_failed",
  "pid" : 748005,
  "created" : "2024-03-16T02:56:16.486631",
  "parent_uuid" : "00163e6f-61a9-892c-5e96-000000000042",
  "event_data" : {
    "playbook" : "ovirt-host-deploy.yml",
    "playbook_uuid" : "4641a037-5cc0-4a22-9149-002a6254a050",
    "play" : "all",
    "play_uuid" : "00163e6f-61a9-892c-5e96-000000000002",
    "play_pattern" : "all",
    "task" : "Configure OVN for oVirt",
    "task_uuid" : "00163e6f-61a9-892c-5e96-000000000042",
    "task_action" : "ansible.builtin.command",
    "task_args" : "",
    "task_path" : "/usr/share/ovirt-engine/ansible-runner-service-project/project/roles/ovirt-provider-ovn-driver/tasks/configure.yml:43",
    "role" : "ovirt-provider-ovn-driver",
    "host" : "vh-1.home.skaccess.com",
    "remote_addr" : "vh-1.home.skaccess.com",
    "res" : {
      "changed" : true,
      "stdout" : "",
      "stderr" : "Traceback (most recent call last):\n  File \"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\", line 117, in get_network\n    return networks[net_name]\nKeyError: 'vh-1.home.skaccess.com'\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n  File \"/usr/bin/vdsm-tool\", line 195, in main\n    return tool_command[cmd][\"command\"](*args)\n  File \"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\", line 63, in ovn_config\n    ip_address = get_ip_addr(get_network(network_caps(), net_name))\n  File \"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\", line 119, in get_network\n    raise NetworkNotFoundError(net_name)\nvdsm.tool.ovn_config.NetworkNotFoundError: vh-1.home.skaccess.com",
      "rc" : 1,
      "cmd" : [ "vdsm-tool", "ovn-config", "192.168.33.31", "vh-1.home.skaccess.com" ],
      "start" : "2024-03-16 10:56:14.196048",
      "end" : "2024-03-16 10:56:16.457356",
      "delta" : "0:00:02.261308",
      "msg" : "non-zero return code",
      "invocation" : {
        "module_args" : {
          "_raw_params" : "vdsm-tool ovn-config  192.168.33.31 vh-1.home.skaccess.com\n",
          "_uses_shell" : false,
          "expand_argument_vars" : true,
          "stdin_add_newline" : true,
          "strip_empty_ends" : true,
          "argv" : null,
          "chdir" : null,
          "executable" : null,
          "creates" : null,
          "removes" : null,
          "stdin" : null
        }
      },
      "stdout_lines" : [ ],
      "stderr_lines" : [ "Traceback (most recent call last):", "  File \"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\", line 117, in get_network", "    return networks[net_name]", "KeyError: 'vh-1.home.skaccess.com'", "", "During handling of the above exception, another exception occurred:", "", "Traceback (most recent call last):", "  File \"/usr/bin/vdsm-tool\", line 195, in main", "    return tool_command[cmd][\"command\"](*args)", "  File \"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\", line 63, in ovn_config", "    ip_address = get_ip_addr(get_network(network_caps(), net_name))", "  File \"/usr/lib/python3.9/site-packages/vdsm/tool/ovn_config.py\", line 119, in get_network", "    raise NetworkNotFoundError(net_name)", "vdsm.tool.ovn_config.NetworkNotFoundError: vh-1.home.skaccess.com" ],
      "_ansible_no_log" : false
    },
    "start" : "2024-03-16T02:56:13.918752",
    "end" : "2024-03-16T02:56:16.485248",
    "duration" : 2.566496,
    "ignore_errors" : null,
    "event_loop" : null,
    "uuid" : "b7deae43-0a0c-4677-99e8-67605531ec3e"
  }
}
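Reading the traceback, it looks like the host FQDN (vh-1.home.skaccess.com) ended up as net_name, i.e. in the position where vdsm-tool ovn-config looks up a network by name. The sketch below is only a simplified stand-in for that lookup, not the real vdsm code; the networks map and its contents (ovirtmgmt, 192.0.2.10) are made-up example values.

```python
# Simplified stand-in for the lookup seen in the traceback; NOT the real vdsm code.
class NetworkNotFoundError(Exception):
    """Raised when the requested network name is not in the network map."""


def get_network(networks, net_name):
    # Same shape as the traceback: a dict lookup whose KeyError is
    # replaced by NetworkNotFoundError.
    try:
        return networks[net_name]
    except KeyError:
        raise NetworkNotFoundError(net_name)


# Hypothetical map keyed by network name (values are illustrative only).
networks = {"ovirtmgmt": {"addr": "192.0.2.10"}}


def try_lookup(name):
    # Return the network entry, or a short error string if the name is unknown.
    try:
        return get_network(networks, name)
    except NetworkNotFoundError as exc:
        return f"NetworkNotFoundError: {exc}"
```

Calling try_lookup("vh-1.home.skaccess.com") against this map fails the same way as the log above, because a host name is not a key in a map of network names.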
WORKAROUND

On the manager, change /usr/share/ovirt-engine/ansible-runner-service-project/project/roles/ovirt-provider-ovn-driver/tasks/configure.yml to match the version shipped with ovirt-engine 4.5.4, so that the when conditions read:

.
.
  when:
    - cluster_switch == "ovs" or (ovn_central is defined and ovn_central | ipaddr)
.
.
  when:
    - ovn_central is defined
    - ovn_central | ipaddr 
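
For anyone unsure what the two when conditions amount to, here is a rough Python sketch. It assumes the ipaddr filter returns the value when it is a valid IP and False otherwise; the stdlib ipaddress module is used as a stand-in (the real filter accepts more forms, such as CIDR networks), and the function names are hypothetical.

```python
import ipaddress


def looks_like_ip(value):
    # Stand-in for the ipaddr filter: return the value if it parses as an
    # IP address, else False. This is an approximation, not the real filter.
    try:
        ipaddress.ip_address(value)
        return value
    except ValueError:
        return False


def ovs_block_runs(cluster_switch, ovn_central=None):
    # First condition; None plays the role of "ovn_central is not defined".
    return cluster_switch == "ovs" or bool(
        ovn_central is not None and looks_like_ip(ovn_central)
    )


def ovn_block_runs(ovn_central=None):
    # Second condition: both list items must hold.
    return ovn_central is not None and bool(looks_like_ip(ovn_central))
```

With ovn_central set to 192.168.33.31 both blocks run regardless of the switch type; with ovn_central undefined, only a cluster whose switch type is ovs runs the first block.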

You will then get the same error as reported in https://github.com/oVirt/ovirt-ansible-collection/issues/695:

"The conditional check 'cluster_switch == \"ovs\" or (ovn_central is defined | ipaddr)' failed. The error was: The ipaddr filter requires python's netaddr be installed on the ansible controller\n\nThe error appears to be in '/usr/share/ovirt-engine/ansible-runner-service-project/project/roles/ovirt-provider-ovn-driver/tasks/configure.yml': line 3, column 7, but may\nbe elsewhere in the file depending on the exact syntax problem.\n\nThe offending line appears to be:\n\n- block:\n    - name: Install ovs\n      ^ here\n",

To fix this, find out which version of Python Ansible is using:

[root@vmmg-1 ~]# ansible --version
ansible [core 2.16.3]
  config file = /etc/ansible/ansible.cfg
  configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.12/site-packages/ansible
  ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.12.1 (main, Feb 21 2024, 14:18:26) [GCC 8.5.0 20210514 (Red Hat 8.5.0-21)] (/usr/bin/python3.12)
  jinja version = 3.1.2
  libyaml = True

In my case it is python3.12, so install netaddr for that interpreter:

dnf install python3.12-pip.noarch
python3.12 -m pip  install netaddr

After that, adding a new host in a non-default cluster works fine.
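
The manual check above (read the interpreter from ansible --version, then see whether netaddr is there) could be scripted roughly as follows. This is a minimal sketch: the function names are hypothetical, and it assumes the "python version = ... (/path)" line looks like the output pasted above.

```python
import re
import subprocess


def ansible_python_path(version_output):
    # Pull the interpreter path out of the "python version = ... (/path)" line.
    match = re.search(
        r"^\s*python version = .*\((/[^()]+)\)\s*$", version_output, re.MULTILINE
    )
    return match.group(1) if match else None


def can_import(python, module):
    # Ask the given interpreter to import the module; exit code 0 means it worked.
    result = subprocess.run([python, "-c", f"import {module}"], capture_output=True)
    return result.returncode == 0


def netaddr_available_for_ansible():
    # Runs the real "ansible --version" command, so call this only on the manager.
    out = subprocess.run(
        ["ansible", "--version"], capture_output=True, text=True
    ).stdout
    python = ansible_python_path(out)
    return python is not None and can_import(python, "netaddr")
```

If netaddr_available_for_ansible() returns False on the manager, the pip install step above is probably still needed for that interpreter.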

Suggested fix: either revert the OVN configure.yml to what it was before and install netaddr automatically, or figure out why adding a host fails on a non-default cluster and fix that.

cvinh commented 3 months ago

Hello. In my case, I have no OVN on my engine, but the OVN components still got installed on the hosts, and the deployment failed with the same issue. After some effort I found a workaround: setting ovn_state: "unconfigured" in /usr/share/ovirt-engine/ansible-runner-service-project/project/roles/ovirt-provider-ovn-driver/defaults/main.yml on the engine.