kubernetes / cloud-provider-openstack


[octavia-ingress-controller] Could not retrieve certificate when delete TLS Octavia Ingress #1721

Closed utianayuba closed 2 years ago

utianayuba commented 2 years ago

/kind bug

What happened:

What you expected to happen: OpenStack LB deleted automatically right after TLS Octavia ingress on Kube is deleted

How to reproduce it:

[karno@nakula ~]$ kubectl get ing -w
NAME                       CLASS    HOSTS            ADDRESS        PORTS     AGE
test-octavia-ingress-tls   <none>   web.stratus.ok   10.14.14.129   80, 443   65s
[karno@nakula ~]$ openstack secret list
+----------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------+---------------------------+--------+-----------------------------------------+-----------+------------+-------------+------+------------+
| Secret href                                                                      | Name                                                                                          | Created                   | Status | Content types                           | Algorithm | Bit length | Secret type | Mode | Expiration |
+----------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------+---------------------------+--------+-----------------------------------------+-----------+------------+-------------+------+------------+
| https://external.stratus.ok:9311/v1/secrets/fd21d31d-bd98-4562-a11d-e9c9879f483c | kube_ingress_1ce014a5-3d5d-4ef2-909b-8589650d84a9_default_test-octavia-ingress-tls_tls-secret | 2021-12-25T06:28:08+00:00 | ACTIVE | {'default': 'application/octet-stream'} | aes       |        256 | opaque      | cbc  | None       |
+----------------------------------------------------------------------------------+-----------------------------------------------------------------------------------------------+---------------------------+--------+-----------------------------------------+-----------+------------+-------------+------+------------+
[karno@nakula ~]$ openstack loadbalancer list
+--------------------------------------+------------------------------------------------------------------------------------+----------------------------------+-------------+---------------------+------------------+----------+
| id                                   | name                                                                               | project_id                       | vip_address | provisioning_status | operating_status | provider |
+--------------------------------------+------------------------------------------------------------------------------------+----------------------------------+-------------+---------------------+------------------+----------+
| 03cee4dd-8f42-4647-bd76-b432609eb02f | kube_ingress_1ce014a5-3d5d-4ef2-909b-8589650d84a9_default_test-octavia-ingress-tls | c91d83e03bb74b7a99597de41b6ec417 | 10.0.0.157  | ACTIVE              | ONLINE           | amphora  |
+--------------------------------------+------------------------------------------------------------------------------------+----------------------------------+-------------+---------------------+------------------+----------+
[karno@nakula ~]$ echo "$IP web.stratus.ok" | sudo tee -a /etc/hosts
10.14.14.129 web.stratus.ok
[karno@nakula ~]$ curl https://web.stratus.ok
default backend - 404
[karno@nakula ~]$ curl https://web.stratus.ok/ping
webserver-69b47b55f5-c5w67
[karno@nakula ~]$ kubectl delete ing test-octavia-ingress-tls
ingress.networking.k8s.io "test-octavia-ingress-tls" deleted
[karno@nakula ~]$ kubectl get ing
No resources found in default namespace.
[karno@nakula ~]$ openstack loadbalancer list
+--------------------------------------+------------------------------------------------------------------------------------+----------------------------------+-------------+---------------------+------------------+----------+
| id                                   | name                                                                               | project_id                       | vip_address | provisioning_status | operating_status | provider |
+--------------------------------------+------------------------------------------------------------------------------------+----------------------------------+-------------+---------------------+------------------+----------+
| 03cee4dd-8f42-4647-bd76-b432609eb02f | kube_ingress_1ce014a5-3d5d-4ef2-909b-8589650d84a9_default_test-octavia-ingress-tls | c91d83e03bb74b7a99597de41b6ec417 | 10.0.0.157  | PENDING_DELETE      | ONLINE           | amphora  |
+--------------------------------------+------------------------------------------------------------------------------------+----------------------------------+-------------+---------------------+------------------+----------+
[karno@nakula ~]$ openstack secret list

[karno@nakula ~]$ 

Anything else we need to know?: Error messages on /var/log/kolla/octavia/octavia-worker.log

2021-12-25 13:30:16.871 37 ERROR barbicanclient.client [req-e2bc0527-6afc-4017-b315-3e22860907c3 - c91d83e03bb74b7a99597de41b6ec417 - - -] 4xx Client error: Not Found: Secrets container not found.: barbicanclient.exceptions.HTTPClientError: Not Found: Secret not found.
2021-12-25 13:30:16.872 37 ERROR octavia.certificates.manager.barbican_legacy [req-e2bc0527-6afc-4017-b315-3e22860907c3 - c91d83e03bb74b7a99597de41b6ec417 - - -] Error getting cert https://external.stratus.ok:9311/v1/secrets/fd21d31d-bd98-4562-a11d-e9c9879f483c: Not Found: Secrets container not found.: barbicanclient.exceptions.HTTPClientError: Not Found: Secrets container not found.
2021-12-25 13:30:16.877 37 ERROR oslo_messaging.rpc.server [req-e2bc0527-6afc-4017-b315-3e22860907c3 - c91d83e03bb74b7a99597de41b6ec417 - - -] Exception during message handling: octavia.common.exceptions.CertificateRetrievalException: Could not retrieve certificate: https://external.stratus.ok:9311/v1/secrets/fd21d31d-bd98-4562-a11d-e9c9879f483c

It looks like the OpenStack (Barbican) secret was deleted before the load balancer was deleted.

Environment:

utianayuba commented 2 years ago

This issue is not happening on OpenStack Wallaby:

Any suggestion?

jichenjc commented 2 years ago

I am not sure whether this is caused by the difference between W and X. If so, it might be an OpenStack Octavia and Barbican issue?

It looks like we have this logic:

if c.osClient.Barbican != nil && ing.Spec.TLS != nil {
    nameFilter := fmt.Sprintf("kube_ingress_%s_%s_%s", c.config.ClusterName, ing.Namespace, ing.Name)
    if err := openstackutil.DeleteSecrets(c.osClient.Barbican, nameFilter); err != nil {
        return fmt.Errorf("failed to remove Barbican secrets: %v", err)
    }

    logger.Info("Barbican secrets deleted")
}

So, from your Octavia worker log, it looks like Octavia also tries to delete the secret? I guess we need to check the logic for the LB and the secret in CPO and OpenStack to figure out which one should be responsible for the deletion, and whether the second one needs to ignore the 404 Not Found error.
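The "ignore 404 Not Found" idea can be sketched as follows. This is a minimal illustration, not the controller's actual code: `SecretStore`, `ErrNotFound`, `DeleteSecretsTolerant`, and `fakeStore` are hypothetical names introduced here, and the in-memory store only stands in for Barbican so the example runs without an OpenStack cloud.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a hypothetical sentinel standing in for a 404 from the secret service.
var ErrNotFound = errors.New("secret not found")

// SecretStore is a hypothetical, minimal view of a secret service such as Barbican.
type SecretStore interface {
	ListByPrefix(prefix string) ([]string, error)
	Delete(ref string) error
}

// DeleteSecretsTolerant deletes every secret returned for the prefix.
// A not-found error on an individual delete is treated as success,
// because the goal (the secret no longer exists) is already met.
func DeleteSecretsTolerant(s SecretStore, prefix string) error {
	refs, err := s.ListByPrefix(prefix)
	if err != nil {
		return fmt.Errorf("failed to list secrets: %w", err)
	}
	for _, ref := range refs {
		if err := s.Delete(ref); err != nil && !errors.Is(err, ErrNotFound) {
			return fmt.Errorf("failed to delete secret %s: %w", ref, err)
		}
	}
	return nil
}

// fakeStore is an in-memory store used only to exercise the sketch.
type fakeStore struct {
	listed  []string        // refs returned by ListByPrefix (the prefix is ignored here)
	present map[string]bool // refs that still exist when Delete is called
}

func (f *fakeStore) ListByPrefix(prefix string) ([]string, error) {
	return f.listed, nil
}

func (f *fakeStore) Delete(ref string) error {
	if !f.present[ref] {
		return ErrNotFound
	}
	delete(f.present, ref)
	return nil
}

func main() {
	store := &fakeStore{
		listed:  []string{"a", "b"},
		present: map[string]bool{"a": true}, // "b" disappeared after it was listed
	}
	err := DeleteSecretsTolerant(store, "kube_ingress_")
	fmt.Println("error:", err, "remaining:", len(store.present))
}
```

Whether tolerance belongs in the controller, in Octavia, or in both is exactly the open question in this comment; the sketch only shows the shape of the check.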

utianayuba commented 2 years ago

I guess Octavia worker fails to delete LB because Barbican no longer has the needed secret.
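If that guess is right, one possible remedy is to delete the secrets only after the load balancer is confirmed gone. The sketch below illustrates that ordering under stated assumptions: `LoadBalancerAPI`, `SecretAPI`, `DeleteInOrder`, and the fakes are hypothetical names, not part of the controller, Octavia, or Barbican.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// LoadBalancerAPI and SecretAPI are hypothetical interfaces for this sketch.
type LoadBalancerAPI interface {
	Delete(id string) error
	Exists(id string) (bool, error)
}

type SecretAPI interface {
	DeleteByPrefix(prefix string) error
}

// DeleteInOrder requests load balancer deletion, polls until the load
// balancer is gone, and only then removes the secrets it referenced.
// If the load balancer is still present after all attempts, the secrets are kept.
func DeleteInOrder(lb LoadBalancerAPI, secrets SecretAPI, lbID, prefix string, attempts int, interval time.Duration) error {
	if err := lb.Delete(lbID); err != nil {
		return fmt.Errorf("failed to delete load balancer: %w", err)
	}
	for i := 0; i < attempts; i++ {
		exists, err := lb.Exists(lbID)
		if err != nil {
			return fmt.Errorf("failed to check load balancer: %w", err)
		}
		if !exists {
			return secrets.DeleteByPrefix(prefix)
		}
		time.Sleep(interval)
	}
	return errors.New("timed out waiting for load balancer deletion; secrets kept")
}

// recorder collects the order of events for inspection.
type recorder struct{ events []string }

// fakeLB reports the load balancer as present for pollsLeft Exists calls.
type fakeLB struct {
	rec       *recorder
	pollsLeft int
}

func (f *fakeLB) Delete(id string) error {
	f.rec.events = append(f.rec.events, "lb-delete-requested")
	return nil
}

func (f *fakeLB) Exists(id string) (bool, error) {
	if f.pollsLeft > 0 {
		f.pollsLeft--
		return true, nil
	}
	f.rec.events = append(f.rec.events, "lb-gone")
	return false, nil
}

type fakeSecrets struct{ rec *recorder }

func (f *fakeSecrets) DeleteByPrefix(prefix string) error {
	f.rec.events = append(f.rec.events, "secrets-deleted")
	return nil
}

func main() {
	rec := &recorder{}
	err := DeleteInOrder(&fakeLB{rec: rec, pollsLeft: 2}, &fakeSecrets{rec: rec},
		"lb-1", "kube_ingress_", 5, time.Millisecond)
	fmt.Println(err, rec.events)
}
```

Keeping the secrets on timeout is a deliberate choice in this sketch: a leftover secret can be cleaned up later, whereas a load balancer that can no longer read its certificate may not be deletable.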

jichenjc commented 2 years ago

> I guess Octavia worker fails to delete LB because Barbican no longer has the needed secret.

If so, that means that if `failed to remove Barbican secrets: %v` appears in the log, we should add tolerance: check whether it is a Not Found error and, if it is, continue with the delete action. The original issue report doesn't seem to contain that message, though. Anyway, this might be an enhancement point.

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

k8s-triage-robot commented 2 years ago

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

You can:

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

k8s-ci-robot commented 2 years ago

@k8s-triage-robot: Closing this issue.

In response to [this](https://github.com/kubernetes/cloud-provider-openstack/issues/1721#issuecomment-1150624991):

> The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
>
> This bot triages issues and PRs according to the following rules:
> - After 90d of inactivity, `lifecycle/stale` is applied
> - After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
> - After 30d of inactivity since `lifecycle/rotten` was applied, the issue is closed
>
> You can:
> - Reopen this issue or PR with `/reopen`
> - Mark this issue or PR as fresh with `/remove-lifecycle rotten`
> - Offer to help out with [Issue Triage][1]
>
> Please send feedback to sig-contributor-experience at [kubernetes/community](https://github.com/kubernetes/community).
>
> /close
>
> [1]: https://www.kubernetes.dev/docs/guide/issue-triage/

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes/test-infra](https://github.com/kubernetes/test-infra/issues/new?title=Prow%20issue:) repository.

utianayuba commented 1 year ago

This is still happening with octavia_ingress_controller_tag=v1.24.6.