redhat-openstack / nfv-tempest-plugin

This project is a plugin to OpenStack's Tempest used to test NFV usecases.
Apache License 2.0

[Tooling] create_and_set_aggregate self.aggregates_client.add_host is Failing on TripleO Train #2

Open Yarboa opened 3 years ago

Yarboa commented 3 years ago

Director 16.2 CI is failing on the following tests: nfv_tempest_plugin.tests.scenario.test_nfv_advanced_usecases.TestAdvancedScenarios

They fail on the hypervisor hostname suffix when adding a host to an aggregate:

packages/nfv_tempest_plugin/tests/scenario/baremetal_manager.py", line 269, in create_and_set_aggregate
    self.aggregates_client.add_host(aggr['aggregate']['id'], host=host)
  File "/home/stack/tempest/openstack-tempest/tempest/lib/services/compute/aggregates_client.py", line 107, in add_host
    post_body)
  File "/home/stack/tempest/openstack-tempest/tempest/lib/common/rest_client.py", line 300, in post
    return self.request('POST', url, extra_headers, headers, body, chunked)
  File "/home/stack/tempest/openstack-tempest/tempest/lib/services/compute/base_compute_client.py", line 48, in request
    method, url, extra_headers, headers, body, chunked)
  File "/home/stack/tempest/openstack-tempest/tempest/lib/common/rest_client.py", line 704, in request
    self._error_checker(resp, resp_body)
  File "/home/stack/tempest/openstack-tempest/tempest/lib/common/rest_client.py", line 810, in _error_checker
    raise exceptions.NotFound(resp_body, resp=resp)
tempest.lib.exceptions.NotFound: Object not found
Details: {'code': 404, 'message': 'Compute host computeovsdpdksriov-1.novalocal could not be found.'}

eshulman2 commented 3 years ago

We can just override the "dhcp_domain" parameter in our deployment with an empty string. Alternatively, we can create a parsing function that looks like this:

def parsing(full_name):
    # Keep only the short hostname, dropping the domain suffix.
    return full_name.split('.')[0]
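
As a rough sketch of how such a helper might be used, the snippet below matches a hypervisor name against the hosts registered by the compute service, comparing short names only. `short_name` and `match_service_host` are hypothetical names made up for illustration, and the `.localdomain` hostnames are assumed sample data, not output from a real deployment.

```python
def short_name(full_name):
    # Same idea as the parsing function above: keep the part before the first dot.
    return full_name.split('.')[0]


def match_service_host(hypervisor_name, service_hosts):
    """Return the service host whose short name matches, or None.

    Hypothetical helper: service_hosts is assumed to be a list of host
    strings as registered by the nova-compute services.
    """
    wanted = short_name(hypervisor_name)
    for host in service_hosts:
        if short_name(host) == wanted:
            return host
    return None


# Illustrative hostnames only; the localdomain suffix is an assumption.
hosts = [
    'computeovsdpdksriov-0.localdomain',
    'computeovsdpdksriov-1.localdomain',
]
print(match_service_host('computeovsdpdksriov-1.novalocal', hosts))
```

Matching on the short name hides the suffix mismatch rather than fixing it, so this only makes sense as a test-side workaround.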

MaxBab commented 3 years ago

It's an OpenStack bug related to the naming of the hypervisor hosts: https://bugzilla.redhat.com/show_bug.cgi?id=1949385

SeanMooney commented 3 years ago

That is incorrect.

It's not a compute/nova bug; you have not configured ooo (TripleO) correctly. novalocal is the default DNS domain used by nova, while ooo uses localdomain as the default cloud domain.

You need to align the two.

https://opendev.org/openstack/tripleo-heat-templates/src/commit/d58efb58e0c39b2ca1585d87fe6d542484b33ad0/overcloud.j2.yaml#L139-L144

I don't think this is an OpenStack bug, and it's definitely not a nova bug, but it might be an ooo one.

MaxBab commented 3 years ago

Hi @SeanMooney

We never configured the "CloudDomain" parameter in our deployments, so it is always taken from the defaults. If it is taken from the default, the same default value should be applied across the whole deployed environment.

As I mentioned in the BZ (https://bugzilla.redhat.com/show_bug.cgi?id=1949385), the output of "openstack hypervisor list" and the controller nova-scheduler logs differ: the suffix is "novalocal" in one and "localdomain" in the other.

We are using exactly the same tht for the 16.1 and 16.2 deployments. The issue popped up in 16.2; in 16.1 everything works.

So I still think it's a bug.

SeanMooney commented 3 years ago

Yep, I just responded to the BZ.

"openstack hypervisor list" is not the correct command to use; you should be using "openstack compute service list --service nova-compute" to get the host to add.

Normally ooo configures the hostname on the overcloud hosts to be the same as the value it puts in the nova.conf host field; however, that behavior has been broken a few times. I would guess that the way the /etc/hosts and /etc/hostname files are generated on the overcloud hosts has changed in some way.

In any case, this is a problem in the deployment tooling and in the API that is being used, not in nova.
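
A minimal sketch of that approach in Python: take the host for the aggregate from the compute service records rather than from the hypervisor list. The input is assumed to have the shape of the compute API's os-services response (a list of dicts with `binary`, `host` and `state` keys). `compute_hosts` is a hypothetical helper, not part of the plugin, and the sample records are made up.

```python
def compute_hosts(services, only_up=True):
    """Return the host names of nova-compute services.

    Hypothetical helper: `services` is assumed to be the list found under
    the 'services' key of an os-services API response.
    """
    hosts = []
    for svc in services:
        # Skip schedulers, conductors and other non-compute services.
        if svc.get('binary') != 'nova-compute':
            continue
        # Optionally skip services that are not reporting as up.
        if only_up and svc.get('state') != 'up':
            continue
        hosts.append(svc['host'])
    return hosts


# Made-up sample data for illustration only.
sample = {
    'services': [
        {'binary': 'nova-scheduler', 'host': 'controller-0.localdomain', 'state': 'up'},
        {'binary': 'nova-compute', 'host': 'computeovsdpdksriov-0.localdomain', 'state': 'up'},
        {'binary': 'nova-compute', 'host': 'computeovsdpdksriov-1.localdomain', 'state': 'down'},
    ]
}
print(compute_hosts(sample['services']))
```

Filtering on `state` is a design choice of this sketch; pass `only_up=False` to list every registered compute host.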

MaxBab commented 3 years ago

@SeanMooney

OK, so you are saying that the way we take the compute host details is incorrect, and that they should be taken from the "compute service list" output instead.

Let's continue the discussion in the BZ to avoid duplicating it.

Thanks.

SeanMooney commented 3 years ago

Actually, I think there are also duplicate BZs: https://bugzilla.redhat.com/show_bug.cgi?id=1949385

I need to look at that one too, but sure, let's move this to Bugzilla.

It really does look like a regression of the compute hostname again. This has been broken before...

Yarboa commented 3 years ago

Thanks @SeanMooney, we certainly need to do that.