F5Networks / k8s-bigip-ctlr

Repository for F5 Container Ingress Services for Kubernetes & OpenShift.
Apache License 2.0

multicluster preview: on scale-down to zero irule still selects cluster with 0 pool members #3056

Closed by alonsocamaro 1 year ago

alonsocamaro commented 1 year ago

Setup Details

Build: quay.io/f5networks/k8s-bigip-ctlr-devel:6acfa932091c518f87d3d71070501dd68fcebf33
BIGIP Version: Big IP 17
AS3 Version: 3.45
Agent Mode: AS3/CCCL
Orchestration: K8S/OSCP
Orchestration Version:
Pool Mode: Cluster/Nodeport
Additional Setup details: ratio load balancing, ovn with clusterip

Description

The setup is multi-cluster, with 2 OCP clusters in ratio mode and an application deployed in both clusters. Scaling this application down to 0 in one of the clusters removes all of that cluster's pool members, but the iRule still selects the pool of the cluster that has no pool members.

See attached screenshot:

Screenshot 2023-08-28 at 12 44 59

and the following LTM log:

Aug 28 03:44:02 bigip2.ocp.f5-udf.com info tmm1[10337]: Rule /OpenShift-MultiCluster/Shared/test_443_tls_irule <CLIENT_DATA>: Selected pool nginx_app1_8080_eng_caas_nginx_app1
Aug 28 03:44:02 bigip2.ocp.f5-udf.com info tmm1[10337]: Rule /OpenShift-MultiCluster/Shared/test_443_tls_irule <CLIENT_DATA>: Selected pool nginx_app1_8080_eng_caas_nginx_app1
Aug 28 03:44:02 bigip2.ocp.f5-udf.com info tmm[10337]: Rule /OpenShift-MultiCluster/Shared/test_443_tls_irule <CLIENT_DATA>: Selected pool nginx_app1_8080_eng_caas_nginx_app1
Aug 28 03:44:02 bigip2.ocp.f5-udf.com info tmm1[10337]: Rule /OpenShift-MultiCluster/Shared/test_443_tls_irule <CLIENT_DATA>: Selected pool nginx_app1_8080_eng_caas_nginx_app1_ocp2
Aug 28 03:44:02 bigip2.ocp.f5-udf.com info tmm1[10337]: Rule /OpenShift-MultiCluster/Shared/test_443_tls_irule <CLIENT_DATA>: Selected pool nginx_app1_8080_eng_caas_nginx_app1_ocp2
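
To quantify how often each pool is picked, the "Selected pool" lines can be tallied with a short script. This is only a sketch: the function name `count_selected_pools` and the regular expression are illustrative assumptions, and the sample input is the log excerpt above.

```python
import re
from collections import Counter

# Assumption: every selection is logged as "... Selected pool <name>" at end of line.
SELECTED_POOL = re.compile(r"Selected pool (\S+)\s*$")

def count_selected_pools(lines):
    """Return a Counter mapping pool name -> number of times it was selected."""
    counts = Counter()
    for line in lines:
        match = SELECTED_POOL.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# The five log lines from the excerpt above.
SAMPLE = """\
Aug 28 03:44:02 bigip2.ocp.f5-udf.com info tmm1[10337]: Rule /OpenShift-MultiCluster/Shared/test_443_tls_irule <CLIENT_DATA>: Selected pool nginx_app1_8080_eng_caas_nginx_app1
Aug 28 03:44:02 bigip2.ocp.f5-udf.com info tmm1[10337]: Rule /OpenShift-MultiCluster/Shared/test_443_tls_irule <CLIENT_DATA>: Selected pool nginx_app1_8080_eng_caas_nginx_app1
Aug 28 03:44:02 bigip2.ocp.f5-udf.com info tmm[10337]: Rule /OpenShift-MultiCluster/Shared/test_443_tls_irule <CLIENT_DATA>: Selected pool nginx_app1_8080_eng_caas_nginx_app1
Aug 28 03:44:02 bigip2.ocp.f5-udf.com info tmm1[10337]: Rule /OpenShift-MultiCluster/Shared/test_443_tls_irule <CLIENT_DATA>: Selected pool nginx_app1_8080_eng_caas_nginx_app1_ocp2
Aug 28 03:44:02 bigip2.ocp.f5-udf.com info tmm1[10337]: Rule /OpenShift-MultiCluster/Shared/test_443_tls_irule <CLIENT_DATA>: Selected pool nginx_app1_8080_eng_caas_nginx_app1_ocp2
"""

sample_counts = count_selected_pools(SAMPLE.splitlines())
for pool, hits in sample_counts.items():
    print(pool, hits)
```

Both pools appear in the tally, which matches the behaviour reported here: the pool of the scaled-down cluster keeps being selected.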

Steps To Reproduce

1) Create a deployment with ratio load balancing
2) Scale the deployment in one of the clusters to 0
3) Send traffic

Expected Result

Only transient errors are expected when scaling to 0, because of the time it takes to update CIS.

Actual Result

Traffic is continuously sent to both clusters: the one with pool members and the one without.

trinaths commented 1 year ago

Created [CONTCNTR-4138] for internal tracking.