F5Networks / f5-appsvcs-extension

F5 BIG-IP Application Services 3 Extension
Apache License 2.0

CIS and AS3 working fine but causing errors and stopping sync operations #776

Closed marvagabi closed 4 months ago

marvagabi commented 8 months ago

Environment

Summary

We recently updated our F5 LTM system and started noticing that something was making a change to the F5 every couple of seconds, which stopped syncs from working. After some troubleshooting, we discovered that the F5 CIS controller was responsible: deleting the CIS deployment in our K8s cluster stops the issue, and reapplying it reproduces it.

Looking at the pod logs, things appear to be working fine except for one error that recurs at regular intervals (about once a minute in the sample below). Despite this error, we can still make changes using CIS, and the VIPs and pools created with it continue to work as expected. I'm not sure what we are missing here or why this is happening.

Here is a sample of the logs from the k8s pod:

kubectl logs -n kube-system k8s-bigip-ctlr-7478599579-t9vcf -f
2023/11/01 21:16:45 [INFO] [INIT] Starting: Container Ingress Services - Version: 2.14.0, BuildInfo: azure-4877-06ade0db2156bc67a2df6b37682b51b7ea995367
2023/11/01 21:16:45 [INFO] ConfigWriter started: 0xc0002d54d0
2023/11/01 21:16:45 [WARNING] Creating GTM with default bigip credentials as GTM BIGIP Url or GTM BIGIP Username or GTM BIGIP Password is missing on CIS args.
2023/11/01 21:16:45 [INFO] Started config driver sub-process at pid: 17
2023/11/01 21:16:45 [INFO] [CORE] Registered BigIP Metrics
2023/11/01 21:16:50 [INFO] Starting Controller
2023/11/01 21:16:50 [INFO] Starting  Node Informer
2023/11/01 21:16:50 [INFO] Starting ExternalDNS Informer
I1101 21:16:50.241967       1 shared_informer.go:240] Waiting for caches to sync for F5 CIS Ingress Controller
I1101 21:16:50.342207       1 shared_informer.go:247] Caches are synced for F5 CIS Ingress Controller 
I1101 21:16:50.342272       1 shared_informer.go:240] Waiting for caches to sync for F5 CIS CRD Controller
2023/11/01 21:16:50 [INFO] Starting VirtualServer Informer
2023/11/01 21:16:50 [INFO] Starting TLSProfile Informer
2023/11/01 21:16:50 [INFO] Starting TransportServer Informer
2023/11/01 21:16:50 [INFO] Starting IngressLink Informer
I1101 21:16:50.443369       1 shared_informer.go:247] Caches are synced for F5 CIS CRD Controller 
2023/11/01 21:16:53 [INFO] [2023-11-01 21:16:53,503 __main__ INFO] entering inotify loop to watch /tmp/k8s-bigip-ctlr.config2865476427/config.json
2023/11/01 21:17:51 [INFO] Enqueueing TLSProfile: &{{ } {grafana-tls  monitoring  f0ba9871-8a45-4e69-898f-44b30b8d7daf 9352146 1 2023-11-01 21:17:51 +0000 UTC <nil> <nil> map[f5cr:true] map[kubectl.kubernetes.io/last-applied-configuration:{"apiVersion":"cis.f5.com/v1","kind":"TLSProfile","metadata":{"annotations":{},"labels":{"f5cr":"true"},"name":"grafana-tls","namespace":"monitoring"},"spec":{"hosts":["monitoring.uatk8s.com"],"tls":{"clientSSL":"/Common/monitoring.uatk8s-SSL-Profile","reference":"bigip","termination":"edge"}}}
] [] []  [{kubectl-client-side-apply Update cis.f5.com/v1 2023-11-01 21:17:51 +0000 UTC FieldsV1 {"f:metadata":{"f:annotations":{".":{},"f:kubectl.kubernetes.io/last-applied-configuration":{}},"f:labels":{".":{},"f:f5cr":{}}},"f:spec":{".":{},"f:hosts":{},"f:tls":{".":{},"f:clientSSL":{},"f:reference":{},"f:termination":{}}}}}]} {[monitoring.uatk8s.com] {edge /Common/monitoring.uatk8s-SSL-Profile []  [] bigip}}}
2023/11/01 21:17:51 [INFO] No VirtualServers found in namespace monitoring
2023/11/01 21:18:21 [ERROR] [AS3] Response body unmarshal failed: invalid character '<' looking for beginning of value
2023/11/01 21:19:21 [ERROR] [AS3] Response body unmarshal failed: invalid character '<' looking for beginning of value
2023/11/01 21:20:21 [ERROR] [AS3] Response body unmarshal failed: invalid character '<' looking for beginning of value
2023/11/01 21:21:21 [ERROR] [AS3] Response body unmarshal failed: invalid character '<' looking for beginning of value
2023/11/01 21:22:22 [ERROR] [AS3] Response body unmarshal failed: invalid character '<' looking for beginning of value

This error repeats several times and roughly correlates with the timestamps of the changes on the F5. Again, this doesn't appear to stop CIS from creating new VIPs or modifying existing ones, so we are not sure what we are missing here.
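
For context on the message itself: `invalid character '<' looking for beginning of value` is the wording Go's `encoding/json` package produces when the first byte of the input is `<`, which usually means the body is HTML or XML rather than JSON. The sketch below reproduces that with a made-up HTML body. `unmarshalError` and both sample payloads are hypothetical and are not taken from the CIS source, and the log above does not show what the BIG-IP actually returned.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unmarshalError is a hypothetical helper (not CIS code): it tries to decode
// body as a JSON object and returns the decoder's error text, or "" on success.
func unmarshalError(body []byte) string {
	var decoded map[string]interface{}
	if err := json.Unmarshal(body, &decoded); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	// Made-up stand-in for an HTML page returned in place of JSON.
	htmlBody := []byte("<html><body>error page</body></html>")
	// Made-up stand-in for a JSON body.
	jsonBody := []byte(`{"results": []}`)

	fmt.Printf("HTML body: %q\n", unmarshalError(htmlBody))
	fmt.Printf("JSON body: %q\n", unmarshalError(jsonBody))
}
```

If this is what is happening here, capturing the raw response body and HTTP status for the failing request would show what the BIG-IP is sending back instead of JSON.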

Steps To Reproduce

Steps to reproduce the behavior:

  1. Submit the following CIS YAML:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: k8s-bigip-ctlr
      namespace: kube-system
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: k8s-bigip-ctlr
      template:
        metadata:
          name: k8s-bigip-ctlr
          labels:
            app: k8s-bigip-ctlr
        spec:
          serviceAccountName: k8s-bigip-ctlr
          containers:
          - name: k8s-bigip-ctlr
            image: "f5networks/k8s-bigip-ctlr:latest"
            imagePullPolicy: Always
            env:
            - name: BIGIP_USERNAME
              valueFrom:
                secretKeyRef:
                  name: bigip-login
                  key: username
            - name: BIGIP_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: bigip-login
                  key: password
            command: ["/app/bin/k8s-bigip-ctlr"]
            args: [
              "--bigip-username=$(BIGIP_USERNAME)",
              "--bigip-password=$(BIGIP_PASSWORD)",
              "--bigip-url=https://x.x.x.x",
              "--insecure=true",
              "--bigip-partition=k8s_uat",
              "--pool-member-type=nodeport",
              "--custom-resource-mode=true"
            ]
  2. Submit the following TLSProfile and VirtualServer YAML:

    apiVersion: cis.f5.com/v1
    kind: TLSProfile
    metadata:
      name: grafana-tls
      namespace: monitoring
      labels:
        f5cr: "true"
    spec:
      tls:
        termination: edge
        clientSSL: /Common/monitoring.uatk8s-SSL-Profile
        reference: bigip
      hosts:
      - monitoring.uatk8s.com
    ---
    apiVersion: cis.f5.com/v1
    kind: VirtualServer
    metadata:
      labels:
        f5cr: "true"
      name: grafana
      namespace: monitoring
    spec:
      virtualServerAddress: x.x.x.x
      virtualServerName: VIP-Grafana-UAT
      virtualServerHTTPSPort: 3000
      tlsProfileName: grafana-tls
      partition: k8s_uat
      httpTraffic: none
      pools:
      - path: /
        name: Grafana_UAT
        service: grafana
        servicePort: 3000
        loadBalancingMethod: round-robin
        monitor:
          interval: 5
          send: "GET /api/health HTTP/1.1"
          timeout: 16
          type: http
  3. Observe the following errors in the CIS pod:

    2023/11/01 21:20:21 [ERROR] [AS3] Response body unmarshal failed: invalid character '<' looking for beginning of value
    2023/11/01 21:21:21 [ERROR] [AS3] Response body unmarshal failed: invalid character '<' looking for beginning of value
    2023/11/01 21:22:22 [ERROR] [AS3] Response body unmarshal failed: invalid character '<' looking for beginning of value

Expected Behavior

F5 CIS creates the VIP without throwing errors, and the F5 is able to complete its syncs.

Actual Behavior

F5 CIS creates the VIP as expected and it works; however, we cannot perform any sync changes on the BIG-IP device. The pod logs show the error above.

dstokesf5 commented 5 months ago

Thank you for your feedback. This looks like a problem with CIS. I recommend filing an issue with that project: https://github.com/F5Networks/k8s-bigip-ctlr.

github-actions[bot] commented 4 months ago

This issue has been automatically closed because there has been no response to our request for more information from the original author. With only the information that is currently in the issue, we don't have enough information to take action. Please reach out if you have or find the answers we need so that we can investigate further.