hitachienergy / epiphany

Cloud and on-premises automation for Kubernetes centered industrial grade solutions.
Apache License 2.0
138 stars 107 forks source link

[BUG] RHEL 7 OS upgrade to RHEL 8 fails asserting failed repmgr10 service #3180

Closed przemyslavic closed 2 years ago

przemyslavic commented 2 years ago

Describe the bug Upgrading RHEL 7 to RHEL 8 by running https://github.com/epiphany-platform/epiphany/blob/develop/ci/ansible/playbooks/os/rhel/upgrade-release.yml playbook fails with the following error:

2022-06-01T15:53:08.2063147Z TASK [Assert there are no failed services] *************************************
2022-06-01T15:53:08.2849315Z ok: [ci-10todevazrhflannel-kubernetes-master-vm-0]
2022-06-01T15:53:08.3789068Z ok: [ci-10todevazrhflannel-kubernetes-node-vm-0]
2022-06-01T15:53:08.4308066Z ok: [ci-10todevazrhflannel-kubernetes-node-vm-1]
2022-06-01T15:53:08.4857524Z ok: [ci-10todevazrhflannel-logging-vm-0]
2022-06-01T15:53:08.5299034Z ok: [ci-10todevazrhflannel-logging-vm-1]
2022-06-01T15:53:08.5799759Z ok: [ci-10todevazrhflannel-logging-vm-2]
2022-06-01T15:53:08.9551635Z ok: [ci-10todevazrhflannel-monitoring-vm-0]
2022-06-01T15:53:09.0075325Z ok: [ci-10todevazrhflannel-kafka-vm-0]
2022-06-01T15:53:09.0636137Z ok: [ci-10todevazrhflannel-kafka-vm-1]
2022-06-01T15:53:09.1089378Z ok: [ci-10todevazrhflannel-kafka-vm-2]
2022-06-01T15:53:09.1601524Z ok: [ci-10todevazrhflannel-postgresql-vm-0]
2022-06-01T15:53:09.2121892Z fatal: [ci-10todevazrhflannel-postgresql-vm-1]: FAILED! => {"assertion": "failed_services | reject('match', 'user@') | count == 0", "changed": false, "evaluated_to": false, "msg": "Failed service(s) found: repmgr10.service"}
2022-06-01T15:53:09.2622194Z ok: [ci-10todevazrhflannel-load-balancer-vm-0]
2022-06-01T15:53:09.3108538Z ok: [ci-10todevazrhflannel-rabbitmq-vm-0]
2022-06-01T15:53:09.3679203Z ok: [ci-10todevazrhflannel-rabbitmq-vm-1]
2022-06-01T15:53:09.4174457Z ok: [ci-10todevazrhflannel-opendistro-for-elasticsearch-vm-0]
2022-06-01T15:53:09.4178900Z ok: [ci-10todevazrhflannel-opendistro-for-elasticsearch-vm-1]
2022-06-01T15:53:09.4534825Z ok: [ci-10todevazrhflannel-repository-vm-0]

How to reproduce Steps to reproduce the behavior:

  1. Deploy v1.0.4 cluster with 2 PostgreSQL nodes
  2. Run OS upgrade by executing the upgrade playbook

Expected behavior OS upgrade should succeed.

Config files

Environment

epicli version: [2.0.1 dev]

Additional context Add any other context about the problem here.


DoD checklist