F5Networks / k8s-bigip-ctlr

Repository for F5 Container Ingress Services for Kubernetes & OpenShift.
Apache License 2.0

Ingress controller can not update http_to_https redirect #1291

Closed lukibahr closed 4 years ago

lukibahr commented 4 years ago

Description

After adding two or more ingress resources to the cluster, the ingress controller is unable to apply its config.

Kubernetes Version

The Kubernetes cluster is running version 1.17.2.

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.0", GitCommit:"70132b0f130acc0bed193d9ba59dd186f0e634cf", GitTreeState:"clean", BuildDate:"2019-12-07T21:20:10Z", GoVersion:"go1.13.4", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"17", GitVersion:"v1.17.2", GitCommit:"59603c6e503c87169aea6106f57b9f242f64df89", GitTreeState:"clean", BuildDate:"2020-01-18T23:22:30Z", GoVersion:"go1.13.5", Compiler:"gc", Platform:"linux/amd64"}

Controller Version

The controller is running version v1.13.0.

$ kubectl logs -f k8s-bigip-ctlr-7b5cf67f76-k8m27
2020/05/08 10:53:43 [INFO] Starting: Version: v1.13.0, BuildInfo: n2182-642304053
2020/05/08 10:53:43 [INFO] ConfigWriter started: 0xc00030c1e0
2020/05/08 10:53:43 [INFO] Started config driver sub-process at pid: 13
2020/05/08 10:53:43 [ERROR] EOF
2020/05/08 10:53:43 [ERROR] [AS3] Error in validating declaration
2020/05/08 10:53:43 [INFO] NodePoller (0xc0001e21b0) registering new listener: 0x11c1b50
2020/05/08 10:53:43 [INFO] NodePoller started: (0xc0001e21b0)

BIG-IP Version

The BIG-IP device is running version 15.1.0 Build 0.0.31 Final.

Diagnostic Information

Controller logs


2020/05/08 10:54:44 [INFO] Wrote 2 Virtual Server and 0 IApp configs
2020/05/08 10:54:44 [WARNING] Overwriting existing entry for backend {ServiceName:ingress1-kb-http ServicePort:<PORT> Namespace:elasticsearch}
2020/05/08 10:54:44 [INFO] Wrote 2 Virtual Server and 0 IApp configs
2020/05/08 10:54:44 [INFO] [2020-05-08 10:54:44,797 f5_cccl.resource.resource INFO] Creating ApiInternalDataGroup: /<PARTITION>/https_redirect_dg
2020/05/08 10:54:44 [WARNING] Overwriting existing entry for backend {ServiceName:service1-kb-http ServicePort:<PORT> Namespace:<NAMESPACE>}
2020/05/08 10:54:44 [WARNING] Overwriting existing entry for backend {ServiceName:service1-kb-http ServicePort:<PORT> Namespace:<NAMESPACE>}
2020/05/08 10:54:44 [INFO] Wrote 2 Virtual Server and 0 IApp configs
2020/05/08 10:54:44 [INFO] [2020-05-08 10:54:44,905 f5_cccl.resource.resource INFO] Creating ApiIRule: /<PARTITION>/http_redirect_irule_443
2020/05/08 10:54:44 [INFO] Wrote 2 Virtual Server and 0 IApp configs
2020/05/08 10:54:46 [WARNING] Overwriting existing entry for backend {ServiceName:service2-service ServicePort:9000 Namespace:<NAMESPACE>}
2020/05/08 10:54:46 [WARNING] Overwriting existing entry for backend {ServiceName:service2-service ServicePort:9000 Namespace:<NAMESPACE>}
2020/05/08 10:54:46 [INFO] Wrote 2 Virtual Server and 0 IApp configs
2020/05/08 10:54:48 [INFO] Wrote 2 Virtual Server and 0 IApp configs
2020/05/08 10:54:52 [INFO] [2020-05-08 10:54:52,408 f5_cccl.resource.resource INFO] Updating ApiInternalDataGroup: /<PARTITION>/https_redirect_dg
2020/05/08 10:54:52 [ERROR] [2020-05-08 10:54:52,570 f5_cccl.resource.resource ERROR] HTTP error(400): CCCL resource(ApiInternalDataGroup) /<PARTITION>/https_redirect_dg.
2020/05/08 10:54:52 [ERROR] [2020-05-08 10:54:52,570 f5_cccl.service.manager ERROR] F5CcclResourceRequestError - 400 Unexpected Error: Bad Request for uri: https://F5_IP_ADDRESS:443/mgmt/tm/ltm/data-group/internal/~<PARTITION>~https_redirect_dg/
2020/05/08 10:54:52 [INFO] Text: u'{"code":400,"message":"0107074b:3: Unable to change data group (/<PARTITION>/https_redirect_dg) type.  Must remove existing entries first.","errorStack":[],"apiError":3}'
2020/05/08 10:54:52 [ERROR] [2020-05-08 10:54:52,571 f5_cccl.service.manager ERROR] Resource /<PARTITION>/https_redirect_dg update error, requeuing task...
2020/05/08 10:54:52 [ERROR] [2020-05-08 10:54:52,833 __main__ ERROR] Error applying config, will try again in 1 seconds
2020/05/08 10:54:53 [INFO] [2020-05-08 10:54:53,132 f5_cccl.resource.resource INFO] Updating ApiInternalDataGroup: /<PARTITION>/https_redirect_dg
2020/05/08 10:54:53 [ERROR] [2020-05-08 10:54:53,390 f5_cccl.resource.resource ERROR] HTTP error(400): CCCL resource(ApiInternalDataGroup) /<PARTITION>/https_redirect_dg.
2020/05/08 10:54:53 [ERROR] [2020-05-08 10:54:53,390 f5_cccl.service.manager ERROR] F5CcclResourceRequestError - 400 Unexpected Error: Bad Request for uri: https://F5_IP_ADDRESS:443/mgmt/tm/ltm/data-group/internal/~<PARTITION>~https_redirect_dg/
2020/05/08 10:54:53 [INFO] Text: u'{"code":400,"message":"0107074b:3: Unable to change data group (/<PARTITION>/https_redirect_dg) type.  Must remove existing entries first.","errorStack":[],"apiError":3}'
2020/05/08 10:54:53 [ERROR] [2020-05-08 10:54:53,390 f5_cccl.service.manager ERROR] Resource /<PARTITION>/https_redirect_dg update error, requeuing task...
2020/05/08 10:54:54 [INFO] [2020-05-08 10:54:54,162 f5_cccl.resource.resource INFO] Updating ApiInternalDataGroup: /<PARTITION>/https_redirect_dg
2020/05/08 10:54:54 [ERROR] [2020-05-08 10:54:54,437 f5_cccl.resource.resource ERROR] HTTP error(400): CCCL resource(ApiInternalDataGroup) /<PARTITION>/https_redirect_dg.
2020/05/08 10:54:54 [ERROR] [2020-05-08 10:54:54,437 f5_cccl.service.manager ERROR] F5CcclResourceRequestError - 400 Unexpected Error: Bad Request for uri: https://F5_IP_ADDRESS:443/mgmt/tm/ltm/data-group/internal/~<PARTITION>~https_redirect_dg/
2020/05/08 10:54:54 [INFO] Text: u'{"code":400,"message":"0107074b:3: Unable to change data group (/<PARTITION>/https_redirect_dg) type.  Must remove existing entries first.","errorStack":[],"apiError":3}'
2020/05/08 10:54:54 [ERROR] [2020-05-08 10:54:54,438 f5_cccl.service.manager ERROR] Resource /<PARTITION>/https_redirect_dg update error, requeuing task...
2020/05/08 10:54:54 [ERROR] [2020-05-08 10:54:54,746 __main__ ERROR] Error applying config, will try again in 2 seconds
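The BIG-IP response is buried inside the `Text:` log lines above. As a reading aid, here is a minimal Python sketch that pulls the HTTP code, the BIG-IP error id, and the message out of such a line. The function name `parse_bigip_error` is hypothetical and not part of the controller; the sample line is copied from the log above with the partition still redacted.

```python
import json
import re

# Sample line copied from the controller log above (partition left redacted).
LOG_LINE = (
    "2020/05/08 10:54:52 [INFO] Text: u'"
    '{"code":400,"message":"0107074b:3: Unable to change data group '
    '(/<PARTITION>/https_redirect_dg) type.  Must remove existing entries first.",'
    '"errorStack":[],"apiError":3}'
    "'"
)


def parse_bigip_error(line):
    """Return (http_code, bigip_error_id, message) from a 'Text: ...' log line.

    Returns None when the line carries no JSON payload.
    """
    match = re.search(r"Text: u?'(\{.*\})'", line)
    if match is None:
        return None
    payload = json.loads(match.group(1))
    message = payload.get("message", "")
    # Some BIG-IP messages start with an id such as "0107074b:3:"; split it off.
    id_match = re.match(r"([0-9a-f]{8}):\d+:\s*(.*)", message)
    if id_match:
        return payload.get("code"), id_match.group(1), id_match.group(2)
    return payload.get("code"), None, message


print(parse_bigip_error(LOG_LINE))
```

Read this way, the 400 is BIG-IP refusing to change the type of the existing `https_redirect_dg` data group while it still has entries. It is not a generic connectivity failure.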

The ingress resources are separate manifests. Each manifest looks like the following:

---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: <NAME>
  namespace: <NAMESPACE>
  annotations:
    virtual-server.f5.com/ip: "controller-default"
    virtual-server.f5.com/partition: "<PARTITION>"
    virtual-server.f5.com/balance: "round-robin"
    ingress.kubernetes.io/allow-http: "false"
    ingress.kubernetes.io/ssl-redirect: "true"
    virtual-server.f5.com/health: |
      [
        {
          "path":     "<PATH>/",
          "send":     "HTTP GET /",
          "interval": 5,
          "timeout":  10
        }
      ]
    kubernetes.io/ingress.class: "f5"
spec:
  tls:
    - secretName: <SECRET_NAME>
  rules:
    - host: <PATH>
      http:
        paths:
          - backend:
              serviceName: <SERVICE_NAME>
              servicePort: 9100
            path: /

Using the controller with our BIG-IP appliance is currently not possible. According to #975 this was fixed, but it isn't.

mdditt2000 commented 4 years ago

Internal PM Jira for tracking CONTCNTR-1820.

I am seeing the same issue with the host. I made the following changes and it works. Please review my findings below.

https://github.com/mdditt2000/prometheus/commit/33db2a573a9747303ba54e7b1cfc464ba45610a4

mdditt2000 commented 4 years ago

Afternoon @lukibahr, I did some additional testing. The following ingress works:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata: 
  annotations: 
    ingress.kubernetes.io/allow-http: "false"
    ingress.kubernetes.io/ssl-redirect: "true"
    virtual-server.f5.com/ip: "10.192.75.107"
  name: prometheus-ui
  namespace: monitoring
spec: 
  backend: 
    serviceName: prometheus-service
    servicePort: 8080
  tls: 
  - secretName: /Common/clientssl

Please make sure you are using CIS 1.14 or above.

mdditt2000 commented 4 years ago

Working example - https://github.com/mdditt2000/prometheus/blob/master/prometheus-ingress.yaml

lukibahr commented 4 years ago

Your example worked. However, if we create a second ingress that also has the HTTP-to-HTTPS redirect enabled, it fails. It says that an object named https_redirect_dg already exists in the partition, obviously because it has the same name. This https_redirect_dg was created by the first ingress object, which has HTTP-to-HTTPS redirection enabled.


---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: <NAME_A>
  namespace: <NAMESPACE>
  annotations:
    virtual-server.f5.com/ip: "controller-default"
    virtual-server.f5.com/partition: "<PARTITION>"
    virtual-server.f5.com/balance: "round-robin"
    ingress.kubernetes.io/allow-http: "false"
    ingress.kubernetes.io/ssl-redirect: "true"
    virtual-server.f5.com/health: |
      [
        {
          "path":     "<PATH>/",
          "send":     "HTTP GET /",
          "interval": 5,
          "timeout":  10
        }
      ]
    kubernetes.io/ingress.class: "f5"
spec:
  tls:
    - secretName: <SECRET_NAME>
  rules:
    - host: <DOMAIN_A>
      http:
        paths:
          - backend:
              serviceName: <SERVICE_NAME_A>
              servicePort: 9100
            path: /

---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: <NAME_B>
  namespace: <NAMESPACE>
  annotations:
    virtual-server.f5.com/ip: "controller-default"
    virtual-server.f5.com/partition: "<PARTITION>"
    virtual-server.f5.com/balance: "round-robin"
    ingress.kubernetes.io/allow-http: "false"
    ingress.kubernetes.io/ssl-redirect: "true"
    virtual-server.f5.com/health: |
      [
        {
          "path":     "<PATH>/",
          "send":     "HTTP GET /",
          "interval": 5,
          "timeout":  10
        }
      ]
    kubernetes.io/ingress.class: "f5"
spec:
  tls:
    - secretName: <SECRET_NAME>
  rules:
    - host: <DOMAIN_B>
      http:
        paths:
          - backend:
              serviceName: <SERVICE_NAME_B>
              servicePort: 9100
            path: /

iam-veeramalla commented 4 years ago

@lukibahr Can you answer the below questions?

Which CIS version are you using? Are the two ingresses in different namespaces? Are you running the controller in CCCL or AS3 mode?

iam-veeramalla commented 4 years ago

@lukibahr On further analysis, I found that this issue persists with the CCCL f5-sdk. It is observed on BIG-IP versions 14.x and above.

Please consider updating the controller to AS3 mode.

lukibahr commented 4 years ago

@veeramalla-f5 we're running in CCCL mode, that's true. Could you provide a specific example of how to run the controller in AS3 mode, or do you have a quickstart guide for this?

iam-veeramalla commented 4 years ago

@lukibahr You can do this by adding the parameter "--agent=as3" to your controller deployment.yml.

Please go through the below doc for complete reference. https://clouddocs.f5.com/containers/v2/kubernetes/kctlr-use-as3-backend.html
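For a deployment that is already running, the flag can also be appended without editing the manifest by hand. The sketch below builds a JSON Patch document for that in Python. It assumes that the controller is the first container in the pod template (index 0) and that it already has an `args` list. The helper name `build_args_patch` is hypothetical; only the `--agent=as3` flag comes from this thread.

```python
import json


def build_args_patch(flags, container_index=0):
    """Build a JSON Patch (RFC 6902) that appends each flag to a container's args.

    The trailing "/-" in the path means "append to the end of the array".
    The args list must already exist on the target container.
    """
    path = f"/spec/template/spec/containers/{container_index}/args/-"
    return [{"op": "add", "path": path, "value": flag} for flag in flags]


# Print the patch that adds the flag suggested above.
print(json.dumps(build_args_patch(["--agent=as3"])))
```

The printed document can be passed to `kubectl patch deployment <DEPLOYMENT_NAME> --type=json -p '<PATCH>'`, where `<DEPLOYMENT_NAME>` is your own controller deployment. Adding `- "--agent=as3"` to the `args` list in deployment.yml and re-applying it has the same effect.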

lukibahr commented 4 years ago

I've successfully enabled AS3 but get the following error:

$ kubectl logs -f k8s-bigip-ctlr-d7d5b766b-tkjc8
2020/05/18 09:03:54 [INFO] Starting: Version: 1.14.0, BuildInfo: n2285-661543171
2020/05/18 09:03:54 [INFO] ConfigWriter started: 0xc00030c1e0
2020/05/18 09:03:54 [INFO] Started config driver sub-process at pid: 14
2020/05/18 09:03:54 [INFO] NodePoller (0xc00012e360) registering new listener: 0x11c3400
2020/05/18 09:03:54 [INFO] NodePoller started: (0xc00012e360)
2020/05/18 09:03:54 [INFO] Watching Ingress resources.
2020/05/18 09:03:54 [INFO] Watching ConfigMap resources.
2020/05/18 09:03:54 [INFO] Handling ConfigMap resource events.
2020/05/18 09:03:54 [INFO] Handling Ingress resource events.
2020/05/18 09:03:54 [INFO] Registered BigIP Metrics
2020/05/18 09:03:55 [INFO] Successfully Sent the FDB Records
2020/05/18 09:03:59 [INFO] [2020-05-18 09:03:59,791 __main__ INFO] entering inotify loop to watch /tmp/k8s-bigip-ctlr.config729148201/config.json
2020/05/18 09:04:04 [ERROR] [2020-05-18 09:04:04,777 __main__ ERROR] Unexpected error
2020/05/18 09:04:04 [INFO] Traceback (most recent call last):
2020/05/18 09:04:04 [INFO]   File "/app/src/f5-ctlr-agent/f5_ctlr_agent/bigipconfigdriver.py", line 323, in _do_reset
2020/05/18 09:04:04 [INFO]     incomplete = self._update_cccl(config)
2020/05/18 09:04:04 [INFO]   File "/app/src/f5-ctlr-agent/f5_ctlr_agent/bigipconfigdriver.py", line 397, in _update_cccl
2020/05/18 09:04:04 [INFO]     incomplete += mgr._apply_ltm_config(cfg_ltm)
2020/05/18 09:04:04 [INFO]   File "/app/src/f5-ctlr-agent/f5_ctlr_agent/bigipconfigdriver.py", line 117, in _apply_ltm_config
2020/05/18 09:04:04 [INFO]     return self._cccl.apply_ltm_config(config)
2020/05/18 09:04:04 [INFO]   File "/app/src/f5-cccl/f5_cccl/api.py", line 92, in apply_ltm_config
2020/05/18 09:04:04 [INFO]     self._user_agent)
2020/05/18 09:04:04 [INFO]   File "/app/src/f5-cccl/f5_cccl/service/manager.py", line 670, in apply_ltm_config
2020/05/18 09:04:04 [INFO]     desired_config, default_route_domain)
2020/05/18 09:04:04 [INFO]   File "/app/src/f5-cccl/f5_cccl/service/manager.py", line 375, in deploy_ltm
2020/05/18 09:04:04 [INFO]     self._pre_deploy_legacy_ltm_cleanup()
2020/05/18 09:04:04 [INFO]   File "/app/src/f5-cccl/f5_cccl/service/manager.py", line 272, in _pre_deploy_legacy_ltm_cleanup
2020/05/18 09:04:04 [INFO]     self._bigip.refresh_ltm()
2020/05/18 09:04:04 [INFO]   File "/app/src/f5-cccl/f5_cccl/bigip.py", line 136, in refresh_ltm
2020/05/18 09:04:04 [INFO]     self._refresh_ltm()
2020/05/18 09:04:04 [INFO]   File "/app/src/f5-cccl/f5_cccl/bigip.py", line 240, in _refresh_ltm
2020/05/18 09:04:04 [INFO]     requests_params={"params": query})
2020/05/18 09:04:04 [INFO]   File "/usr/local/lib/python3.6/site-packages/f5/bigip/resource.py", line 781, in get_collection
2020/05/18 09:04:04 [INFO]     self.refresh(**kwargs)
2020/05/18 09:04:04 [INFO]   File "/usr/local/lib/python3.6/site-packages/f5/bigip/resource.py", line 651, in refresh
2020/05/18 09:04:04 [INFO]     self._refresh(**kwargs)
2020/05/18 09:04:04 [INFO]   File "/usr/local/lib/python3.6/site-packages/f5/bigip/resource.py", line 634, in _refresh
2020/05/18 09:04:04 [INFO]     response = refresh_session.get(uri, **requests_params)
2020/05/18 09:04:04 [INFO]   File "/usr/local/lib/python3.6/site-packages/icontrol/session.py", line 271, in wrapper
2020/05/18 09:04:04 [INFO]     raise iControlUnexpectedHTTPError(error_message, response=response)
2020/05/18 09:04:04 [INFO] icontrol.exceptions.iControlUnexpectedHTTPError: 503 Unexpected Error: Service Unavailable for uri: https://API_ENDPOINT>:443/mgmt/tm/ltm/monitor/udp/?$filter=partition+eq+k8s_dev_part_ieku6ui9du
2020/05/18 09:04:04 [INFO] Text: '{"code":503,"message":"There is an active asynchronous task executing.","errorStack":[],"apiError":32964609}'
2020/05/18 09:04:04 [ERROR] [2020-05-18 09:04:04,781 __main__ ERROR] Error applying config, will try again in 1 seconds
2020/05/18 09:04:14 [ERROR] [2020-05-18 09:04:14,363 __main__ ERROR] Unexpected error
2020/05/18 09:04:14 [INFO] Traceback (most recent call last):
2020/05/18 09:04:14 [INFO]   File "/app/src/f5-ctlr-agent/f5_ctlr_agent/bigipconfigdriver.py", line 323, in _do_reset
2020/05/18 09:04:14 [INFO]     incomplete = self._update_cccl(config)
2020/05/18 09:04:14 [INFO]   File "/app/src/f5-ctlr-agent/f5_ctlr_agent/bigipconfigdriver.py", line 397, in _update_cccl
2020/05/18 09:04:14 [INFO]     incomplete += mgr._apply_ltm_config(cfg_ltm)
2020/05/18 09:04:14 [INFO]   File "/app/src/f5-ctlr-agent/f5_ctlr_agent/bigipconfigdriver.py", line 117, in _apply_ltm_config
2020/05/18 09:04:14 [INFO]     return self._cccl.apply_ltm_config(config)
2020/05/18 09:04:14 [INFO]   File "/app/src/f5-cccl/f5_cccl/api.py", line 92, in apply_ltm_config
2020/05/18 09:04:14 [INFO]     self._user_agent)
2020/05/18 09:04:14 [ERROR] [2020-05-18 09:04:14,364 __main__ ERROR] Error applying config, will try again in 2 seconds

I can't figure out what this means; however, I found a currently open issue: https://github.com/F5Networks/k8s-bigip-ctlr/issues/1137

iam-veeramalla commented 4 years ago

@lukibahr This should not block the resource creation in BIG-IP. However, we are looking into fixing this issue asap.

Do you see any other issues?

lukibahr commented 4 years ago

Do we have to take care of any other configuration? I've created the ingress configuration as mentioned above, but no objects are created. DEBUG logs show no errors.

iam-veeramalla commented 4 years ago

@lukibahr You should install the AS3 RPM on the BIG-IP. Did you get that installed? If not, please refer to:

https://clouddocs.f5.com/products/extensions/f5-appsvcs-extension/latest/userguide/installation.html

Once you install this RPM, please restart the CIS deployment in your cluster. If you hit any further issues, please send us the error and the AS3-specific logs from your deployment.

kubectl logs deploy/cis | grep AS3
kubectl logs deploy/cis | grep -i error

lukibahr commented 4 years ago

I'm currently facing an issue with the length of the name generated for the ingress:

k8s-bigip-ctlr-94b779bc5-qpxlb k8s-bigip-ctlr 2020/05/22 13:54:01 [DEBUG] Updated 0 of 0 virtual server configs, deleted 0
k8s-bigip-ctlr-94b779bc5-qpxlb k8s-bigip-ctlr 2020/05/22 13:54:02 [ERROR] AS3 Template is not valid. see errors :
k8s-bigip-ctlr-94b779bc5-qpxlb k8s-bigip-ctlr
k8s-bigip-ctlr-94b779bc5-qpxlb k8s-bigip-ctlr 2020/05/22 13:54:02 [ERROR] - declaration.Shared: Must validate "then" as "if" was valid
k8s-bigip-ctlr-94b779bc5-qpxlb k8s-bigip-ctlr
k8s-bigip-ctlr-94b779bc5-qpxlb k8s-bigip-ctlr 2020/05/22 13:54:02 [ERROR] - declaration.Shared: Property name of "ingress_elasticsearch_kibana_logging_search_cluster_kb_http_0_http" does not match
k8s-bigip-ctlr-94b779bc5-qpxlb k8s-bigip-ctlr
k8s-bigip-ctlr-94b779bc5-qpxlb k8s-bigip-ctlr 2020/05/22 13:54:02 [ERROR] - declaration.Shared: String length must be less than or equal to 64
k8s-bigip-ctlr-94b779bc5-qpxlb k8s-bigip-ctlr
k8s-bigip-ctlr-94b779bc5-qpxlb k8s-bigip-ctlr 2020/05/22 13:54:02 [ERROR] [AS3] Error in validating declaration

I'll fix that and let you know if it works.
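The validation error above comes down to a length check on the generated name. A minimal Python sketch of that check follows, using the name from the log. The helper `check_as3_name` is hypothetical, and it does not model how CIS derives the name from the namespace, ingress, and service. It only tests a finished name against the 64-character limit quoted in the error.

```python
MAX_AS3_NAME_LENGTH = 64  # limit quoted in the validation error above


def check_as3_name(name, limit=MAX_AS3_NAME_LENGTH):
    """Return (ok, length): ok is True when the name fits within the limit."""
    length = len(name)
    return length <= limit, length


# Name rejected in the log above.
rejected = "ingress_elasticsearch_kibana_logging_search_cluster_kb_http_0_http"
print(check_as3_name(rejected))
```

Shortening whichever Kubernetes object name feeds into this generated name brings it back under the limit.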

iam-veeramalla commented 4 years ago

@lukibahr Sure, awaiting your response.

iam-veeramalla commented 4 years ago

@lukibahr Try using the --log-as3-response=true deployment parameter. It dumps the AS3 response body, including the actual error, into the controller logs. You can post it here after redacting any sensitive information.

we have released 2.0 and new documentation is available at

Let me know if you still face any issues.

lukibahr commented 4 years ago

Our AS3 configuration of the controller works now. Thank you very much for your help.