tsuna-server / build-server-ansible


Instructions hang when discovering hosts. #122

Closed. TsutomuNakamura closed this issue 3 months ago.

TsutomuNakamura commented 3 months ago

Related #121. Commit: e39eacb.

The instruction hangs after the message below is displayed.

TASK [nova_discover_hosts : Run discover_hosts.sh to add compute nodes to a cell] ********
discover_hosts.sh compute01 compute02 compute03
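
A quick way to see whether the hang is in the discovery command itself rather than in the Ansible task is to run the same command by hand with a timeout. This is only a sketch; the 300-second limit is an arbitrary value, not something taken from the playbook.

# Run the discovery step manually; a hang shows up as exit status 124
# (killed by timeout) instead of blocking the play forever.
timeout 300 su -s /bin/sh -c 'nova-manage cell_v2 discover_hosts --verbose' nova
echo "exit status: $?"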
TsutomuNakamura commented 3 months ago
MariaDB [nova_api]> select count(host) from host_mappings where host in ("dev-compute03");
+-------------+
| count(host) |
+-------------+
|           0 |
+-------------+
1 row in set (0.000 sec)
TsutomuNakamura commented 3 months ago

When running the command

su -s /bin/sh -c 'nova-manage cell_v2 discover_hosts --verbose' nova

the following appears in journalctl:

May 27 16:10:10 dev-controller01 su[12241]: (to nova) root on pts/1
May 27 16:10:10 dev-controller01 systemd[1]: Starting sysstat-collect.service - system activity accounting tool...
░░ Subject: A start job for unit sysstat-collect.service has begun execution
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ A start job for unit sysstat-collect.service has begun execution.
░░
░░ The job identifier is 18857.
May 27 16:10:10 dev-controller01 su[12241]: pam_unix(su:session): session opened for user nova(uid=64060) by sushi7(uid=0)
May 27 16:10:10 dev-controller01 systemd[1]: sysstat-collect.service: Deactivated successfully.
░░ Subject: Unit succeeded
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ The unit sysstat-collect.service has successfully entered the 'dead' state.
May 27 16:10:10 dev-controller01 systemd[1]: Finished sysstat-collect.service - system activity accounting tool.
░░ Subject: A start job for unit sysstat-collect.service has finished successfully
░░ Defined-By: systemd
░░ Support: http://www.ubuntu.com/support
░░
░░ A start job for unit sysstat-collect.service has finished successfully.
░░
░░ The job identifier is 18857.
May 27 16:10:11 dev-controller01 mariadbd[1292]: 2024-05-27 16:10:11 364 [Warning] Aborted connection 364 to db: 'nova' user: 'nova' host: 'dev-controller01' (Got an error reading communication packets)
May 27 16:10:11 dev-controller01 mariadbd[1292]: 2024-05-27 16:10:11 363 [Warning] Aborted connection 363 to db: 'nova_api' user: 'nova' host: 'dev-controller01' (Got an error reading communication packets)
May 27 16:10:11 dev-controller01 su[12241]: pam_unix(su:session): session closed for user nova
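
The "Aborted connection ... Got an error reading communication packets" warnings often just mean a client dropped its connection without a clean close, but if a server-side limit is suspected, the relevant MariaDB settings can be inspected with a query like the one below. This is a diagnostic sketch; the variable names are standard MariaDB ones, nothing specific to this repository.

# Check whether short timeouts or a small packet limit could explain the
# aborted connections reported by mariadbd.
mysql -u root -p -e "SHOW GLOBAL VARIABLES WHERE Variable_name IN ('wait_timeout','interactive_timeout','max_allowed_packet');"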
TsutomuNakamura commented 3 months ago

When running on Ubuntu 22.04:

MariaDB [nova_api]> select * from host_mappings
    -> ;
+---------------------+------------+----+---------+-----------+
| created_at          | updated_at | id | cell_id | host      |
+---------------------+------------+----+---------+-----------+
| YYYY-MM-DD HH:MM:SS | NULL       |  1 |       2 | compute02 |
| YYYY-MM-DD HH:MM:SS | NULL       |  2 |       2 | compute03 |
| YYYY-MM-DD HH:MM:SS | NULL       |  3 |       2 | compute01 |
+---------------------+------------+----+---------+-----------+
3 rows in set (0.000 sec)
TsutomuNakamura commented 3 months ago
- import_playbook: commons.yml
- import_playbook: controllers.yml
- import_playbook: computes.yml
- import_playbook: swifts.yml
- import_playbook: cinders.yml
#- import_playbook: cephs.yml
#- import_playbook: controllers_discover_hosts.yml
#- import_playbook: controllers_create_example_instances.yml
#- import_playbook: verify.yml

MariaDB [nova_api]> select * from host_mappings;
Empty set (0.000 sec)
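
With controllers_discover_hosts.yml commented out, host_mappings staying empty is expected; the discovery step can then be run on its own afterwards, for example as below. The inventory path is a placeholder, not the repository's actual layout.

# Run only the discovery playbook once the compute nodes are up.
ansible-playbook -i <inventory> controllers_discover_hosts.yml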
TsutomuNakamura commented 3 months ago
root@dev-controller01:~# su -s /bin/sh -c 'nova-manage cell_v2 discover_hosts --verbose' nova
Modules with known eventlet monkey patching issues were imported prior to eventlet monkey patching: urllib3. This warning can usually be ignored if the caller is only importing and not executing nova code.
Found 2 cell mappings.
Skipping cell0 since it does not contain hosts.
Getting computes from cell 'cell1': 310cf837-b559-4f6f-9c06-cd155b79c15f
Checking host mapping for compute host 'dev-compute02': 8b118a23-129e-438e-b782-dc3011f3e43e
Creating host mapping for compute host 'dev-compute02': 8b118a23-129e-438e-b782-dc3011f3e43e
Checking host mapping for compute host 'dev-compute03': 300d17c5-e4e1-45d6-8934-eaf5c2e62292
Creating host mapping for compute host 'dev-compute03': 300d17c5-e4e1-45d6-8934-eaf5c2e62292
Checking host mapping for compute host 'dev-compute01': 8e6f9947-97f9-4ecd-8de0-fac15c042481
Creating host mapping for compute host 'dev-compute01': 8e6f9947-97f9-4ecd-8de0-fac15c042481
Found 3 unmapped computes in cell: 310cf837-b559-4f6f-9c06-cd155b79c15f
 ...
MariaDB [nova_api]> select * from host_mappings;
+---------------------+------------+----+---------+---------------+
| created_at          | updated_at | id | cell_id | host          |
+---------------------+------------+----+---------+---------------+
| 2024-05-29 15:49:41 | NULL       |  1 |       2 | dev-compute02 |
| 2024-05-29 15:49:41 | NULL       |  2 |       2 | dev-compute03 |
| 2024-05-29 15:49:41 | NULL       |  3 |       2 | dev-compute01 |
+---------------------+------------+----+---------+---------------+
3 rows in set (0.001 sec)
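
The same mappings can also be confirmed through nova-manage instead of querying the database directly, for example:

# List the hosts mapped to each cell; the three dev-compute nodes should
# now appear under cell1.
su -s /bin/sh -c "nova-manage cell_v2 list_hosts" nova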
TsutomuNakamura commented 3 months ago

Is this the list of hosts that will be registered in host_mappings?

root@dev-controller01:~# openstack compute service list --service nova-compute
+--------------------------------------+--------------+---------------+------+---------+-------+----------------------------+
| ID                                   | Binary       | Host          | Zone | Status  | State | Updated At                 |
+--------------------------------------+--------------+---------------+------+---------+-------+----------------------------+
| f8c0499a-0481-4edd-985e-cee511273b00 | nova-compute | dev-compute02 | nova | enabled | up    | 2024-05-29T16:07:43.000000 |
| e3eea433-ed35-4c54-95ed-11b4500878cc | nova-compute | dev-compute03 | nova | enabled | up    | 2024-05-29T16:07:43.000000 |
| d3198ed1-9a97-45fb-8a1d-93ab1cebd88e | nova-compute | dev-compute01 | nova | enabled | up    | 2024-05-29T16:07:43.000000 |
+--------------------------------------+--------------+---------------+------+---------+-------+----------------------------+
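
discover_hosts maps compute nodes that have already registered themselves in the cell database, so the services above should correspond to rows in the compute_nodes table of the nova cell database. A quick cross-check (a sketch; column names are from the standard nova schema):

# The compute_nodes records are what discover_hosts scans for unmapped hosts.
mysql -u nova -p nova -e "SELECT host, hypervisor_hostname FROM compute_nodes WHERE deleted = 0;"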
TsutomuNakamura commented 3 months ago
controller01:~# su -s /bin/sh -c "nova-manage cell_v2 list_cells" nova
+-------+--------------------------------------+--------------------------------------------+-------------------------------------------------------+----------+
|  Name |                 UUID                 |              Transport URL                 |                  Database Connection                  | Disabled |
+-------+--------------------------------------+--------------------------------------------+-------------------------------------------------------+----------+
| cell0 | 00000000-0000-0000-0000-000000000000 |                     none:/                 | mysql+pymysql://nova:****@controller01/nova_cell0     |  False   |
| cell1 | 310cf837-b559-4f6f-9c06-cd155b79c15f | rabbit://openstack:****@controller01:5672/ |    mysql+pymysql://nova:****@controller01/nova        |  False   |
+-------+--------------------------------------+--------------------------------------------+-------------------------------------------------------+----------+
TsutomuNakamura commented 3 months ago

On Ubuntu 24.04, nova-scheduler and nova-novncproxy failed to start.

root@dev-controller01:/var/log# systemctl status nova-scheduler
× nova-scheduler.service - OpenStack Compute Scheduler
     Loaded: loaded (/usr/lib/systemd/system/nova-scheduler.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Thu 2024-05-30 15:50:35 UTC; 22min ago
   Duration: 1.549s
       Docs: man:nova-scheduler(1)
    Process: 5630 ExecStart=/etc/init.d/nova-scheduler systemd-start (code=exited, status=1/FAILURE)
   Main PID: 5630 (code=exited, status=1/FAILURE)
        CPU: 943ms

May 30 15:50:34 dev-controller01 systemd[1]: nova-scheduler.service: Main process exited, code=exited, status=1/FAILURE
May 30 15:50:34 dev-controller01 systemd[1]: nova-scheduler.service: Failed with result 'exit-code'.
May 30 15:50:35 dev-controller01 systemd[1]: nova-scheduler.service: Scheduled restart job, restart counter is at 7.
May 30 15:50:35 dev-controller01 systemd[1]: nova-scheduler.service: Start request repeated too quickly.
May 30 15:50:35 dev-controller01 systemd[1]: nova-scheduler.service: Failed with result 'exit-code'.
May 30 15:50:35 dev-controller01 systemd[1]: Failed to start nova-scheduler.service - OpenStack Compute Scheduler.
root@dev-controller01:/var/log# systemctl status nova-conductor
× nova-conductor.service - OpenStack Compute Conductor
     Loaded: loaded (/usr/lib/systemd/system/nova-conductor.service; enabled; preset: enabled)
     Active: failed (Result: exit-code) since Thu 2024-05-30 15:50:34 UTC; 23min ago
   Duration: 1.437s
       Docs: man:nova-conductor(1)
    Process: 5610 ExecStart=/etc/init.d/nova-conductor systemd-start (code=exited, status=1/FAILURE)
   Main PID: 5610 (code=exited, status=1/FAILURE)
        CPU: 915ms

May 30 15:50:34 dev-controller01 systemd[1]: nova-conductor.service: Scheduled restart job, restart counter is at 7.
May 30 15:50:34 dev-controller01 systemd[1]: nova-conductor.service: Start request repeated too quickly.
May 30 15:50:34 dev-controller01 systemd[1]: nova-conductor.service: Failed with result 'exit-code'.
May 30 15:50:34 dev-controller01 systemd[1]: Failed to start nova-conductor.service - OpenStack Compute Conductor.
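
The systemd status only shows that the processes exit with status 1; the actual reason ends up in the nova log files, for example (standard log location for the Ubuntu packages):

# The traceback behind the exit-code failure is written to the service log.
tail -n 100 /var/log/nova/nova-scheduler.log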
TsutomuNakamura commented 3 months ago

Failed to launch the placement API. The nova-scheduler log shows the following:

2024-05-30 16:20:12.036 11696 INFO oslo_service.periodic_task [-] Skipping periodic task _discover_hosts_in_cells because its interval is negative
2024-05-30 16:20:12.283 11696 WARNING keystoneauth.discover [None req-6f73cc5c-8984-4bca-b9e8-128a5b020fde - - - - - -] Failed to contact the endpoint at http://dev-controller01:8778 for discovery. Fallback to using that endpoint as the base url.
2024-05-30 16:20:12.284 11696 WARNING keystoneauth.discover [None req-6f73cc5c-8984-4bca-b9e8-128a5b020fde - - - - - -] Failed to contact the endpoint at http://dev-controller01:8778 for discovery. Fallback to using that endpoint as the base url.
2024-05-30 16:20:12.285 11696 ERROR nova.scheduler.client.report [None req-6f73cc5c-8984-4bca-b9e8-128a5b020fde - - - - - -] Failed to initialize placement client (is keystone available?): openstack.exceptions.NotSupported: The placement service for dev-controller01:RegionOne exists but does not have any supported versions.
2024-05-30 16:20:12.285 11696 ERROR nova.scheduler.manager [None req-6f73cc5c-8984-4bca-b9e8-128a5b020fde - - - - - -] Fatal error initializing placement client: The placement service for dev-controller01:RegionOne exists but does not have any supported versions.: openstack.exceptions.NotSupported: The placement service for dev-controller01:RegionOne exists but does not have any supported versions.
2024-05-30 16:20:12.285 11696 CRITICAL nova [None req-6f73cc5c-8984-4bca-b9e8-128a5b020fde - - - - - -] Unhandled error: openstack.exceptions.NotSupported: The placement service for dev-controller01:RegionOne exists but does not have any supported versions.
2024-05-30 16:20:12.285 11696 ERROR nova Traceback (most recent call last):
2024-05-30 16:20:12.285 11696 ERROR nova   File "/usr/bin/nova-scheduler", line 10, in <module>
2024-05-30 16:20:12.285 11696 ERROR nova     sys.exit(main())
2024-05-30 16:20:12.285 11696 ERROR nova              ^^^^^^
2024-05-30 16:20:12.285 11696 ERROR nova   File "/usr/lib/python3/dist-packages/nova/cmd/scheduler.py", line 47, in main
2024-05-30 16:20:12.285 11696 ERROR nova     server = service.Service.create(
2024-05-30 16:20:12.285 11696 ERROR nova              ^^^^^^^^^^^^^^^^^^^^^^^
2024-05-30 16:20:12.285 11696 ERROR nova   File "/usr/lib/python3/dist-packages/nova/service.py", line 252, in create
2024-05-30 16:20:12.285 11696 ERROR nova     service_obj = cls(host, binary, topic, manager,
2024-05-30 16:20:12.285 11696 ERROR nova                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-05-30 16:20:12.285 11696 ERROR nova   File "/usr/lib/python3/dist-packages/nova/service.py", line 116, in __init__
2024-05-30 16:20:12.285 11696 ERROR nova     self.manager = manager_class(host=self.host, *args, **kwargs)
2024-05-30 16:20:12.285 11696 ERROR nova                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-05-30 16:20:12.285 11696 ERROR nova   File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 75, in __init__
2024-05-30 16:20:12.285 11696 ERROR nova     self.placement_client
2024-05-30 16:20:12.285 11696 ERROR nova   File "/usr/lib/python3/dist-packages/nova/scheduler/manager.py", line 105, in placement_client
2024-05-30 16:20:12.285 11696 ERROR nova     return report.report_client_singleton()
2024-05-30 16:20:12.285 11696 ERROR nova            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-05-30 16:20:12.285 11696 ERROR nova   File "/usr/lib/python3/dist-packages/nova/scheduler/client/report.py", line 91, in report_client_singleton
2024-05-30 16:20:12.285 11696 ERROR nova     PLACEMENTCLIENT = SchedulerReportClient()
2024-05-30 16:20:12.285 11696 ERROR nova                       ^^^^^^^^^^^^^^^^^^^^^^^
2024-05-30 16:20:12.285 11696 ERROR nova   File "/usr/lib/python3/dist-packages/nova/scheduler/client/report.py", line 234, in __init__
2024-05-30 16:20:12.285 11696 ERROR nova     self._client = self._create_client()
2024-05-30 16:20:12.285 11696 ERROR nova                    ^^^^^^^^^^^^^^^^^^^^^
2024-05-30 16:20:12.285 11696 ERROR nova   File "/usr/lib/python3/dist-packages/nova/scheduler/client/report.py", line 277, in _create_client
2024-05-30 16:20:12.285 11696 ERROR nova     client = self._adapter or utils.get_sdk_adapter('placement')
2024-05-30 16:20:12.285 11696 ERROR nova                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-05-30 16:20:12.285 11696 ERROR nova   File "/usr/lib/python3/dist-packages/nova/utils.py", line 995, in get_sdk_adapter
2024-05-30 16:20:12.285 11696 ERROR nova     return getattr(conn, service_type)
2024-05-30 16:20:12.285 11696 ERROR nova            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-05-30 16:20:12.285 11696 ERROR nova   File "/usr/lib/python3/dist-packages/openstack/service_description.py", line 89, in __get__
2024-05-30 16:20:12.285 11696 ERROR nova     proxy = self._make_proxy(instance)
2024-05-30 16:20:12.285 11696 ERROR nova             ^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-05-30 16:20:12.285 11696 ERROR nova   File "/usr/lib/python3/dist-packages/openstack/service_description.py", line 293, in _make_proxy
2024-05-30 16:20:12.285 11696 ERROR nova     raise exceptions.NotSupported(
2024-05-30 16:20:12.285 11696 ERROR nova openstack.exceptions.NotSupported: The placement service for dev-controller01:RegionOne exists but does not have any supported versions.
2024-05-30 16:20:12.285 11696 ERROR nova
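
"exists but does not have any supported versions" means that version discovery against the placement endpoint returned nothing usable, so a reasonable next step is to look at what the endpoint actually answers. For example (endpoint URL taken from the error message above; a healthy placement deployment answers its root URL with a JSON document listing the supported versions):

# What does the service catalog say about placement?
openstack endpoint list --service placement
# What does the placement root actually return?
curl -s http://dev-controller01:8778 | python3 -m json.tool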
TsutomuNakamura commented 3 months ago

https://storyboard.openstack.org/#!/story/2009315

TsutomuNakamura commented 3 months ago

Debug prints were added in /usr/lib/python3/dist-packages/openstack/service_description.py:

        found_version = temp_adapter.get_api_major_version()
        if found_version is None:
            print("instance.config.get_region_name(self.service_type)")
            region_name = instance.config.get_region_name(self.service_type)
            print(version_kwargs)
            if version_kwargs:
                raise exceptions.NotSupported(
                    "The {service_type} service for {cloud}:{region_name}"
                    " exists but does not have any supported versions.".format(
                        service_type=self.service_type,
                        cloud=instance.name,
                        region_name=region_name,
                    )
                )
            else:

get_api_major_version is defined in the following places (keystoneauth1 sources):

./adapter.py:    def get_api_major_version(self, auth=None, **kwargs):
./plugin.py:    def get_api_major_version(self, session, endpoint_override=None, **kwargs):
./session.py:    def get_api_major_version(self, auth=None, **kwargs):
./identity/base.py:    def get_api_major_version(self, session, service_type=None, interface=None,

The result is shown below.

root@dev-controller01:~# openstack identity provider list
# base.py->get_endpoint_data(): return endpoint_data.get_version_data() "EndpointData{api_version=(3, 0), catalog_url=http://dev-controller01:5000/v3/, endpoint_id=d00e3192c94f4eb78f9806659ab6feca, interface=public, major_version=None, max_microversion=None, min_microversion=None, next_min_version=None, not_before=None, raw_endpoint={'id': 'd00e3192c94f4eb78f9806659ab6feca', 'interface': 'public', 'region_id': 'RegionOne', 'url': 'http://dev-controller01:5000/v3/', 'region': 'RegionOne'}, region_name=RegionOne, service_id=2799b3eda81a4866a35e2aff3e6818dc, service_name=keystone, service_type=identity, service_url=None, url=http://dev-controller01:5000/v3/}" #############################################

The major_version=None result is what causes nova-scheduler to fail to start. Has compatibility been lost between nova and keystone?
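
Version discovery here is performed by keystoneauth1/openstacksdk inside the nova process rather than by keystone itself, so mismatched or mixed copies of those libraries (apt versus pip) are one way discovery can break. Checking what is actually installed is a reasonable first step; this is only a sketch, using the standard Ubuntu/PyPI package names.

# Compare the apt-managed and pip-managed copies of the SDK libraries.
dpkg -l | grep -E 'python3-(openstacksdk|keystoneauth1)'
pip3 list 2>/dev/null | grep -iE 'openstacksdk|keystoneauth1'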

TsutomuNakamura commented 3 months ago

Cases

dpkg --list | grep placement

... ii python3-osc-placement 4.3.0-0ubuntu1~cloud0 all OpenStackClient plugin for the Placement service - Python 3.x


* Ubuntu 22.04 and placement installed by Apt -> NG

pip list | grep placement

openstack-placement 11.0.0
osc-placement       4.3.0

dpkg --list | grep placement

...
ii  placement-api          1:11.0.0-0ubuntu1~cloud0  all  OpenStack Placement - API
ii  placement-common       1:11.0.0-0ubuntu1~cloud0  all  OpenStack Placement - common files
ii  python3-osc-placement  4.3.0-0ubuntu1~cloud0     all  OpenStackClient plugin for the Placement service - Python 3.x
ii  python3-placement      1:11.0.0-0ubuntu1~cloud0  all  OpenStack Placement - Python 3 libraries
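
When comparing the Apt and pip cases, it also helps to confirm which copy of the code is actually imported at runtime. Assuming the module name placement (the module installed by the openstack-placement package), a quick check is:

# dist-packages indicates the apt copy, site-packages or /usr/local the pip copy.
python3 -c "import placement; print(placement.__file__)"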