F5Networks / k8s-bigip-ctlr

Repository for F5 Container Ingress Services for Kubernetes & OpenShift.
Apache License 2.0

CIS in CRD Mode creates GTM wide IP and pool but doesn't assign pool members #1601

Closed musa-osmani closed 3 years ago

musa-osmani commented 3 years ago

Setup Details

CIS Version : /app/bin/k8s-bigip-ctlr
Build: f5networks/k8s-bigip-ctlr:latest
BIGIP Version: Big IP x.x.x
AS3 Version: 3.x
Agent Mode: AS3/CCCL
Orchestration: OSCP
Orchestration Version:
Pool Mode: Nodeport
Additional Setup details: <Platform/CNI Plugins/ cluster nodes/ etc>

Description

Steps To Reproduce

1) Configure CIS in CRD mode: '--custom-resource-mode=true'
2) Install the latest CRD definitions for ExternalDNS
3) Create an ExternalDNS service

apiVersion: "cis.f5.com/v1"
kind: ExternalDNS
metadata:
  name: exdns
  labels:
    f5cr: "true"
spec:
  domainName: example.com
  dnsRecordType: A
  loadBalanceMethod: round-robin
  pools:
  - name: example.site1.com
    dnsRecordType: A
    loadBalanceMethod: round-robin
    dataServerName: /Partition/visrual_server
    monitor:
      type: https
      send: "GET /"
      recv: ""
      interval: 10
      timeout: 10

Expected Result

1) Create the monitor (if a monitor is defined in the service).
2) Create the GTM pool and assign the monitor to it.
3) Assign members to the GTM pool, as stated in the service (dataServerName:).
4) Create the wide IP.
5) Attach the GTM pool to the wide IP.

Actual Result

Based on the captured traffic, CIS behaves as follows:

1.) CIS sends an HTTP GET for the monitor name (poolname_monitor):

GET /mgmt/tm/gtm/pool/a/~Common~example.com HTTP/1.1

If the monitor doesn't exist, CIS receives an HTTP 404 Not Found from F5:

HTTP/1.1 404 Not Found
Date: Mon, 07 Dec 2020 00:05:55 GMT
Server: Jetty(9.2.22.v20170606)

CIS then sends an HTTP PUT request with the parameters for creating the monitor:

PUT /mgmt/tm/gtm/monitor/http/~Common~example.com_monitor/ HTTP/1.1

2.) CIS then checks whether the pool exists by sending a GET request to F5:

GET /mgmt/tm/gtm/pool/a/~Common~example.com HTTP/1.1

If F5 responds with HTTP 404 "Not Found", CIS sends a POST request to create the new pool:

POST /mgmt/tm/gtm/pool/a/ HTTP/1.1

{"name": "example.com", "partition": "Common", "monitor": "/Common/example.com_monitor"}

As the POST request shows, the pool is created, but no members are added to it.

There should be another HTTP call in step 2 to add members to the pool created above:

method POST: POST /mgmt/tm/gtm/pool/a/example.com/members

BODY:

{"name": "/Partition/virtualserver" }

Since the ExternalDNS CRD supports only one dataServerName value, the step of adding pool members is executed only once.
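To make the missing call concrete, here is a minimal Go sketch that only builds (and does not send) the member request described above. The function name `memberRequest`, the host `bigip.example`, and the `Content-Type` header are illustrative assumptions. The URL layout and body are copied from the call proposed in this comment, not from the CIS source.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// memberRequest builds the POST that would add one member to a GTM A-record
// pool. Hypothetical helper: baseURL, pool and member are caller-supplied
// placeholders, and the path follows the call proposed in the issue text.
func memberRequest(baseURL, pool, member string) (*http.Request, error) {
	body, err := json.Marshal(map[string]string{"name": member})
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("%s/mgmt/tm/gtm/pool/a/%s/members", baseURL, pool)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	// Assumed header for a JSON body; not taken from the captured traffic.
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := memberRequest("https://bigip.example", "example.com", "/Partition/virtualserver")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.Path)
}
```

Building the request separately from sending it keeps the sketch checkable without a BIG-IP device.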

3.) CIS checks whether the wide IP exists:

GET /mgmt/tm/gtm/wideip/a/~Common~example.com HTTP/1.1

If F5 responds with HTTP 404 "Not Found", CIS sends a POST request to create the wide IP:

POST /mgmt/tm/gtm/wideip/a/ HTTP/1.1

{"name": "example.com", "partition": "Common"}

HTTP/1.1 200 OK

4.) CIS checks whether the wide IP was created by sending a GET request to F5:

GET /mgmt/tm/gtm/wideip/a/~Common~example.com HTTP/1.1

If F5 responds with HTTP 200 OK (the wide IP was created), CIS then sends an HTTP PUT request to F5 to attach the pool to the wide IP:

PUT /mgmt/tm/gtm/wideip/a/~Common~example.com/ HTTP/1.1

{"enabled": true, "failureRcode": "noerror", "failureRcodeResponse": "disabled", "failureRcodeTtl": 0, "fullPath": "/Common/example.com", "generation": 4406, "kind": "tm:gtm:wideip:a:astate", "lastResortPool": "", "minimalResponse": "enabled", "name": "example.com", "partition": "Common", "persistCidrIpv4": 32, "persistCidrIpv6": 128, "persistence": "disabled", "poolLbMode": "round-robin", "pools": [{"name": "example.com", "partition": "Common", "ratio": 1}], "selfLink": "https://localhost/mgmt/tm/gtm/wideip/a/~Common~example.com?ver=14.1.2.3", "topologyPreferEdns0ClientSubnet": "disabled", "ttlPersistence": 3600}
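The steps above share one pattern: GET the resource, and on 404 POST to the collection to create it. The following Go sketch reproduces that pattern against a local stand-in server, under stated assumptions. `ensure` and `fakeBigIP` are hypothetical names, and the fake server only imitates the 404-then-200 behaviour seen in the capture. It is not how CIS itself is implemented.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// ensure GETs resourceURL and, only if the answer is 404, POSTs body to
// collectionURL. It reports whether a create request was sent.
func ensure(c *http.Client, resourceURL, collectionURL, body string) (bool, error) {
	resp, err := c.Get(resourceURL)
	if err != nil {
		return false, err
	}
	resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusOK:
		return false, nil // already exists, nothing to create
	case http.StatusNotFound:
		// fall out of the switch and create the resource
	default:
		return false, fmt.Errorf("unexpected status %d on GET", resp.StatusCode)
	}
	post, err := c.Post(collectionURL, "application/json", strings.NewReader(body))
	if err != nil {
		return false, err
	}
	post.Body.Close()
	if post.StatusCode != http.StatusOK {
		return false, fmt.Errorf("unexpected status %d on POST", post.StatusCode)
	}
	return true, nil
}

// fakeBigIP is a stand-in server: every GET returns 404 until any POST
// arrives, after which every GET returns 200.
func fakeBigIP() *httptest.Server {
	var mu sync.Mutex
	exists := false
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		mu.Lock()
		defer mu.Unlock()
		switch {
		case r.Method == http.MethodPost:
			exists = true
			w.WriteHeader(http.StatusOK)
		case exists:
			w.WriteHeader(http.StatusOK)
		default:
			w.WriteHeader(http.StatusNotFound)
		}
	}))
}

func main() {
	srv := fakeBigIP()
	defer srv.Close()
	res := srv.URL + "/mgmt/tm/gtm/wideip/a/~Common~example.com"
	col := srv.URL + "/mgmt/tm/gtm/wideip/a/"
	body := `{"name": "example.com", "partition": "Common"}`
	for i := 0; i < 2; i++ {
		created, err := ensure(srv.Client(), res, col, body)
		fmt.Println(created, err)
	}
}
```

The second call finds the resource and sends no POST, which matches the idempotent behaviour implied by the capture.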

Diagnostic Information

<Configuration files, error messages, logs>
Note: Sanitize the data. For example, be mindful of IPs, ports, application names and URLs
Note: The following F5 article outlines the information required when opening an issue.
https://support.f5.com/csp/article/K60974137

Observations (if any)

trinaths commented 3 years ago

Can you share the VS CRD and verify that dataServerName: /Partition/visrual_server really exists? You need to make sure that the host in the VS CRD is the same as the domainName referenced in the EDNS CRD.

musa-osmani commented 3 years ago

Hi, the virtual server exists. I have attached both CRD files, EDNS and VS.

VS.txt Wideip.txt

musa-osmani commented 3 years ago

Hi, I checked the code and I think the problem is in the function syncExternalDNS in the file /pkg/crmanager/worker.go.

It seems that in the loop below there is no match between the EDNS domainName and the VS host values, so the variable found remains false. For this reason the pool members append doesn't execute:

for vsName, vs := range crMgr.resources.rsMap {
	var found bool
	for _, host := range vs.MetaData.hosts {
		if host == edns.Spec.DomainName {
			found = true
			break
		}
	}
	if found {
		log.Debugf("Adding WideIP Pool Member: %v", fmt.Sprintf("%v:/%v/Shared/%v",
			pl.DataServerName, DEFAULT_PARTITION, vsName))
		pool.Members = append(
			pool.Members,
			fmt.Sprintf("%v:/%v/Shared/%v",
				pl.DataServerName, DEFAULT_PARTITION, vsName),
		)
	}
}

There may be empty spaces between the VS host value and the EDNS domainName value.

The TransportServer case should also be considered: it doesn't have the host field and can be used, for example, as an F5 VIP for the container IngressController, so the check in the code above will not be enough.
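As a rough illustration of the matching logic discussed here, the following Go sketch rebuilds the loop with simplified, hypothetical types: a plain map from virtual server name to hosts instead of `crMgr.resources.rsMap`. The `strings.TrimSpace` call is an assumption added to illustrate the whitespace concern raised above, and the final sort is only there to make the output deterministic. Neither is taken from the CIS source.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// poolMembers returns one member string per virtual server whose host list
// contains domainName. vsHosts is a hypothetical stand-in for the resource
// map in the quoted loop. Hosts are compared after trimming whitespace.
func poolMembers(domainName, dataServerName, partition string, vsHosts map[string][]string) []string {
	var members []string
	want := strings.TrimSpace(domainName)
	for vsName, hosts := range vsHosts {
		for _, host := range hosts {
			if strings.TrimSpace(host) == want {
				members = append(members,
					fmt.Sprintf("%v:/%v/Shared/%v", dataServerName, partition, vsName))
				break
			}
		}
	}
	sort.Strings(members) // map iteration order is random; sort for stable output
	return members
}

func main() {
	vs := map[string][]string{
		"vs_a": {"example.com "}, // trailing space on purpose
		"vs_b": {"other.com"},
	}
	fmt.Println(poolMembers("example.com", "/Common/server", "test", vs))
	fmt.Println(poolMembers("nomatch.com", "/Common/server", "test", vs))
}
```

When no host matches, the result is empty, which corresponds to the symptom in this issue: the pool is created but gets no members.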

trinaths commented 3 years ago

@musa-osmani - One issue I see in Wideip.txt is dataServerName: /openshift/Shared/example_coffe_80. Referring to the docs:

Note:

To set up external DNS using BIG-IP GTM user needs to first manually 
configure GSLB → Datacenter and GSLB → Server on BIG-IP Common partition.

Can you try dataServerName with the /Common partition? Please share the BIG-IP version, the dataServerName, and your findings so we can resolve the issue asap.
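A quick pre-flight check along these lines can be sketched in Go. The helper `inCommonPartition` is hypothetical and not part of CIS. It only tests the string prefix and cannot confirm that the GSLB server actually exists on the BIG-IP.

```go
package main

import (
	"fmt"
	"strings"
)

// inCommonPartition reports whether dataServerName has the form
// "/Common/<name>" with a non-empty name. Hypothetical helper: a string
// check only, it does not query the BIG-IP.
func inCommonPartition(dataServerName string) bool {
	const prefix = "/Common/"
	return strings.HasPrefix(dataServerName, prefix) && len(dataServerName) > len(prefix)
}

func main() {
	fmt.Println(inCommonPartition("/Common/SiteH2_Server"))
	fmt.Println(inCommonPartition("/openshift/Shared/example_coffe_80"))
}
```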

trinaths commented 3 years ago

JFYR,

Example EDNS CRD:

apiVersion: cis.f5.com/v1
kind: ExternalDNS
metadata:
  name: bar-h2
  labels:
    f5cr: "true"
spec:
  domainName: bar.com
  dnsRecordType: A
  loadBalanceMethod: round-robin
  pools:
  - name: h2.bar.com
    dnsRecordType: A
    loadBalanceMethod: round-robin
    dataServerName: "/Common/SiteH2_Server"         # <======= reference common partition
    monitor:
      type: https
      send: "GET /health\r\n"
      recv: ""
      interval: 10
      timeout: 10

VS CRD:

apiVersion: cis.f5.com/v1
kind: VirtualServer
metadata:
  labels:
    f5cr: "true"
  name: bar-com-vs-full-path
  namespace: default
spec:
  host: bar.com
  httpTraffic: none
  pools:
  - monitor:
      interval: 20
      recv: ""
      send: /
      timeout: 10
      type: http
    path: /
    service: svc-3
    servicePort: 80
  snat: auto
  virtualServerAddress: 172.16.3.50

musa-osmani commented 3 years ago

Hi, I did some additional tests and code analysis and now it is clear: the configuration works. The GTM resource is intended to create a GTM wide IP and pool only if a VS was previously created and the EDNS CRD domainName field matches the VS CRD host field.

The case can be closed.