F5Networks / k8s-bigip-ctlr

Repository for F5 Container Ingress Services for Kubernetes & OpenShift.
Apache License 2.0

Unexpected 503 and 409 HTTP error codes when pushing AS3 data to BIG-IP #3500

Open joebride opened 3 months ago

joebride commented 3 months ago

Setup Details

CIS Version : 2.17.1
Build: f5networks/k8s-bigip-ctlr:latest
BIGIP Version: BIG-IP 17.1.1.3 Build 0.211.5
AS3 Version: 3.46.2
Agent Mode: AS3
Orchestration: K8S
Orchestration Version: OpenShift
Pool Mode: Cluster
Additional Setup details:

Description

CIS runs in an OpenShift environment and watches OpenShift Routes selected by route label and namespace label (see the CIS arguments under Diagnostic Information). It regularly reports these 503 error messages:

2024/07/23 12:17:49 [ERROR] [AS3] Big-IP Responded with error code: 503
2024/07/23 12:17:55 [ERROR] [AS3] Big-IP Responded with error code: 503
2024/07/23 12:18:01 [ERROR] [AS3] Big-IP Responded with error code: 503
2024/07/23 12:18:07 [ERROR] [AS3] Big-IP Responded with error code: 503
2024/07/23 12:18:10 [ERROR] [AS3] Big-IP Responded with error code: 503
2024/07/23 12:18:14 [ERROR] [AS3] Big-IP Responded with error code: 503
[...]
2024/07/23 12:19:39 [ERROR] [AS3] Big-IP Responded with error code: 503
2024/07/23 12:19:42 [ERROR] [AS3] Big-IP Responded with error code: 503
2024/07/23 12:19:48 [ERROR] [AS3] Big-IP Responded with error code: 503
2024/07/23 12:19:53 [ERROR] [AS3] Big-IP Responded with error code: 503
2024/07/23 12:20:03 [ERROR] [AS3] Big-IP Responded with error code: 503

I tried other versions of CIS (v2.13.1, v2.16.1), but the error occurs with those as well.

Expected Result

BIG-IP accepts AS3 data without 503/409 errors.

Actual Result

BIG-IP returns 503/409 errors when receiving the AS3 declaration.

Diagnostic Information

CIS-arguments

          args:
            - '--namespace-label=F5_env=<env-label>'
            - '--disable-teems=true'
            - '--bigip-username=$(BIGIP_USERNAME)'
            - '--bigip-password=$(BIGIP_PASSWORD)'
            - '--bigip-url=<mgmt-ip>'
            - '--bigip-partition=<partition-name>'
            - '--pool-member-type=cluster'
            - '--openshift-sdn-name=/Common/RD_000_TUN_VXL_0001'
            - '--log-level=INFO'
            - '--log-as3-response=true'
            - '--manage-routes=true'
            - '--manage-ingress=false'
            - '--route-label=<route-label>'
            - '--route-http-vserver=<http-vs-name>'
            - '--route-https-vserver=<https-vs-name>'
            - '--route-vserver-addr=<vs-addr>'
            - '--agent=as3'
            - '--insecure=true'
            - '--tls-version=1.3'
            - '--cipher-group=/Common/CIPHER_GROUP_TLSV12_TLSV13_ECDHE_ECDSA_AESGCM'

On BIG-IP I see these log entries, whose timestamps match the CIS log entries:

/var/log/ltm

err mcpd[6071]: 01020066:3: The requested value list (/Common/appsvcs/____appsvcs_lock) already exists in partition Common.
err mcpd[6071]: 01020066:3: The requested value list (/Common/appsvcs/____appsvcs_lock) already exists in partition Common.
[...]
err mcpd[6071]: 01020066:3: The requested value list (/Common/appsvcs/____appsvcs_lock) already exists in partition Common.

/var/log/restjavad-audit.0.log

[I][1200][23 Jul 2024 12:19:48 UTC][ForwarderPassThroughWorker] {"user":"local/tc01-cis","method":"POST","uri":"http://localhost:8100/mgmt/shared/appsvcs/declare/","status":503,"from":"10.91.228.217"}
[I][1201][23 Jul 2024 12:19:48 UTC][ForwarderPassThroughWorker] {"user":"local/admin","method":"PATCH","uri":"http://localhost:8100/mgmt/tm/ltm/data-group/internal/~Common~appsvcs~dataStore","status":200,"from":"Unknown"}
[I][1202][23 Jul 2024 12:19:53 UTC][ForwarderPassThroughWorker] {"user":"local/admin","method":"PATCH","uri":"http://localhost:8100/mgmt/tm/ltm/data-group/internal/~Common~appsvcs~dataStore","status":200,"from":"Unknown"}
[I][1203][23 Jul 2024 12:19:53 UTC][ForwarderPassThroughWorker] {"user":"local/admin","method":"PATCH","uri":"http://localhost:8100/mgmt/tm/ltm/data-group/internal/~Common~appsvcs~dataStore","status":200,"from":"Unknown"}
[I][1204][23 Jul 2024 12:19:53 UTC][ForwarderPassThroughWorker] {"user":"local/tc01-cis","method":"POST","uri":"http://localhost:8100/mgmt/tm/ltm/data-group/internal","status":409,"from":"Unknown"}
[I][1205][23 Jul 2024 12:19:53 UTC][ForwarderPassThroughWorker] {"user":"local/tc01-cis","method":"POST","uri":"http://localhost:8100/mgmt/shared/appsvcs/declare/","status":503,"from":"10.91.228.217"}
[I][1206][23 Jul 2024 12:19:54 UTC][ForwarderPassThroughWorker] {"user":"local/admin","method":"PATCH","uri":"http://localhost:8100/mgmt/tm/ltm/data-group/internal/~Common~appsvcs~dataStore","status":200,"from":"Unknown"}
[I][1207][23 Jul 2024 12:20:00 UTC][ForwarderPassThroughWorker] {"user":"local/admin","method":"PATCH","uri":"http://localhost:8100/mgmt/tm/ltm/data-group/internal/~Common~appsvcs~dataStore","status":200,"from":"Unknown"}
[I][1208][23 Jul 2024 12:20:00 UTC][ForwarderPassThroughWorker] {"user":"local/admin","method":"PATCH","uri":"http://localhost:8100/mgmt/tm/ltm/data-group/internal/~Common~appsvcs~dataStore","status":200,"from":"Unknown"}
[I][1209][23 Jul 2024 12:20:03 UTC][ForwarderPassThroughWorker] {"user":"local/tc01-cis","method":"POST","uri":"http://localhost:8100/mgmt/tm/ltm/data-group/internal","status":409,"from":"Unknown"}
[I][1210][23 Jul 2024 12:20:03 UTC][ForwarderPassThroughWorker] {"user":"local/tc01-cis","method":"PATCH","uri":"http://localhost:8100/mgmt/tm/ltm/data-group/internal/~Common~appsvcs~____appsvcs_lock","status":200,"from":"Unknown"}
[I][1211][23 Jul 2024 12:20:03 UTC][ForwarderPassThroughWorker] {"user":"local/tc01-cis","method":"POST","uri":"http://localhost:8100/mgmt/shared/appsvcs/declare/","status":503,"from":"10.91.228.217"}
[I][1212][23 Jul 2024 12:20:03 UTC][ForwarderPassThroughWorker] {"user":"local/admin","method":"PATCH","uri":"http://localhost:8100/mgmt/tm/ltm/data-group/internal/~Common~appsvcs~dataStore","status":200,"from":"Unknown"}
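
To see how the 409 and 503 responses line up, one option is to tally the failing audit entries per user, method, and URI. The following is a minimal Python sketch, assuming the restjavad-audit line layout shown in the excerpt above; `tally_failures` and `SAMPLE` are hypothetical names used for illustration and are not part of CIS or BIG-IP tooling.

```python
import json
import re
from collections import Counter

# Assumed layout, taken from the excerpt above:
# [I][1204][23 Jul 2024 12:19:53 UTC][ForwarderPassThroughWorker] {...json...}
LINE_RE = re.compile(
    r"^\[\w\]\[\d+\]\[(?P<ts>[^\]]+)\]\[[^\]]+\]\s+(?P<body>\{.*\})\s*$"
)


def tally_failures(lines):
    """Count audit entries with status >= 400 by (user, method, path, status)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # not an audit entry in the assumed layout
        try:
            entry = json.loads(match.group("body"))
        except json.JSONDecodeError:
            continue  # skip truncated or malformed entries
        status = entry.get("status")
        if not isinstance(status, int) or status < 400:
            continue
        # Drop the scheme and host so only the REST path remains.
        path = re.sub(r"^https?://[^/]+", "", entry.get("uri", ""))
        counts[(entry.get("user"), entry.get("method"), path, status)] += 1
    return counts


# A few lines copied from the excerpt above.
SAMPLE = [
    '[I][1204][23 Jul 2024 12:19:53 UTC][ForwarderPassThroughWorker] {"user":"local/tc01-cis","method":"POST","uri":"http://localhost:8100/mgmt/tm/ltm/data-group/internal","status":409,"from":"Unknown"}',
    '[I][1205][23 Jul 2024 12:19:53 UTC][ForwarderPassThroughWorker] {"user":"local/tc01-cis","method":"POST","uri":"http://localhost:8100/mgmt/shared/appsvcs/declare/","status":503,"from":"10.91.228.217"}',
    '[I][1206][23 Jul 2024 12:19:54 UTC][ForwarderPassThroughWorker] {"user":"local/admin","method":"PATCH","uri":"http://localhost:8100/mgmt/tm/ltm/data-group/internal/~Common~appsvcs~dataStore","status":200,"from":"Unknown"}',
    '[I][1209][23 Jul 2024 12:20:03 UTC][ForwarderPassThroughWorker] {"user":"local/tc01-cis","method":"POST","uri":"http://localhost:8100/mgmt/tm/ltm/data-group/internal","status":409,"from":"Unknown"}',
    '[I][1211][23 Jul 2024 12:20:03 UTC][ForwarderPassThroughWorker] {"user":"local/tc01-cis","method":"POST","uri":"http://localhost:8100/mgmt/shared/appsvcs/declare/","status":503,"from":"10.91.228.217"}',
]

if __name__ == "__main__":
    for key, count in tally_failures(SAMPLE).most_common():
        print(count, key)
```

In the excerpt, both 409 responses on the data-group POST are followed by a 503 on the declare endpoint within the same second. Running the tally over the full log would show whether that pairing holds in general.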

trinaths commented 3 months ago

@joebride From the ltm logs, this looks like an AS3 issue.

Set the CIS log level to AS3DEBUG (--log-level=AS3DEBUG in the CIS arguments) and share the CIS logs so we can investigate any issues with CIS.

mdditt2000 commented 2 months ago

@joebride please let me know if you need some assistance. Could we set up some time to chat? BTW nice issue number 3500!!

joebride commented 2 months ago

@mdditt2000 : yes, assistance is needed. I will contact you directly...

walkingtub commented 1 week ago

Any update on this? We hit this issue frequently, and the only fix we know of is to scale the CIS controller down and back up.

trinaths commented 1 week ago

@walkingtub I recommend trying CIS 2.18.1 and sharing your findings. IMO this issue might come from BIG-IP/AS3.