TykTechnologies / tyk-operator

Tyk Operator for Kubernetes
https://tyk.io

method_transforms to_method works randomly #542

Closed mkyc closed 1 year ago

mkyc commented 1 year ago

With a tyk.tyk.io/v1alpha1 ApiDefinition, when I set up extended_paths.method_transforms with method GET -> to_method POST as described here, the result is random.

Expected Behavior

When I run an echo server and call it with curl -X GET, I should see POST being delivered to the echo server pod.

Current Behavior

Only an occasional call gets transformed to the POST method (6 of the 51 calls in the run below); the rest arrive as GET.

Steps to Reproduce

Test stub:

apiVersion: v1
kind: Namespace
metadata:
  name: echo
---
apiVersion: v1
kind: Service
metadata:
  name: echo
  namespace: echo
spec:
  type: ClusterIP
  ports:
  - name: http
    port: 8080
    protocol: TCP
    targetPort: http
  selector:
    app: echo
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo
  namespace: echo
spec:
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      containers:
      - name: echo
        image: ealen/echo-server:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 8080
          name: http
          protocol: TCP
        env:
        - name: PORT
          value: "8080"
---
apiVersion: tyk.tyk.io/v1alpha1
kind: ApiDefinition
metadata:
  name: echo
  namespace: echo
spec:
  name: echo
  use_keyless: true
  protocol: http
  active: true
  domain: echo.example.com
  proxy:
    target_url: http://echo.echo.svc:8080
    listen_path: /foo
    strip_listen_path: true
  version_data:
    default_version: v1
    not_versioned: true
    versions:
      v1:
        name: v1
        use_extended_paths: true
        paths:
          black_list: []
          ignored: []
          white_list: []
        extended_paths:
          method_transforms:
            - path: /bar
              method: GET
              to_method: POST

Test script:

#!/bin/bash
for i in {0..50}
do
  sleep 1
  date +"%T"
  curl --silent https://echo.example.com/foo/bar -X GET | jq -r '.http.method'
done

Result:

15:24:38
GET
15:24:40
GET
15:24:41
GET
15:24:42
GET
15:24:43
GET
15:24:45
POST << --- here
15:24:46
GET
15:24:47
POST << --- here
15:24:49
GET
15:24:50
GET
15:24:51
GET
15:24:52
GET
15:24:54
GET
15:24:55
GET
15:24:56
GET
15:24:57
GET
15:24:59
POST << --- here
15:25:00
GET
15:25:01
GET
15:25:03
GET
15:25:04
GET
15:25:05
GET
15:25:06
GET
15:25:08
GET
15:25:09
GET
15:25:10
GET
15:25:11
GET
15:25:13
GET
15:25:14
GET
15:25:15
GET
15:25:17
GET
15:25:18
GET
15:25:19
GET
15:25:20
GET
15:25:22
GET
15:25:23
POST << --- here
15:25:24
GET
15:25:25
GET
15:25:27
POST << --- here
15:25:28
GET
15:25:29
GET
15:25:31
GET
15:25:32
POST << --- here
15:25:33
GET
15:25:34
GET
15:25:36
GET
15:25:37
GET
15:25:38
GET
15:25:40
GET
15:25:41
GET
15:25:42
GET
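
A tally makes a run like the one above easier to compare between attempts. The following is only a sketch built on the test script: the function names probe and tally_methods are made up for illustration, and it assumes the same curl and jq setup and the same echo server response shape as the script above.

```shell
#!/bin/bash
# probe URL COUNT (hypothetical helper): send COUNT GET requests, one per
# second, and print the method the echo server reports for each request.
probe() {
  local url="$1" count="$2" i
  for ((i = 0; i < count; i++)); do
    curl --silent "$url" -X GET | jq -r '.http.method'
    sleep 1
  done
}

# tally_methods (hypothetical helper): read one method per line on stdin
# and print "<count> <method>" for each distinct method.
tally_methods() {
  sort | uniq -c | awk '{print $1, $2}'
}
```

Usage would be `probe https://echo.example.com/foo/bar 51 | tally_methods`. For the run above, the tally comes to 45 GET and 6 POST.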

Your Environment

buraksekili commented 1 year ago

Hi @mkyc, thank you for raising this issue. I see that you use tyk-headless, which deploys Tyk Gateway as a DaemonSet by default; for reference, see the tyk-helm-chart repository. If you have multiple replicas of the gateway, that might be causing the issue, because as far as I understand, Tyk Operator has completed its responsibility, which is creating the corresponding resources (in this case the ApiDefinition) on Tyk Gateway.

Can you please check the kind of your Tyk Gateway workload? If it is a DaemonSet, can you please try changing the kind to Deployment and try again?

mkyc commented 1 year ago

Hi, I can confirm that Tyk Gateway is installed as a Deployment. Here is the values.yaml file for the Helm chart.

redis:
  addrs:
  - tyk-redis-master.tyk.svc.cluster.local:6379
  pass: REDACTED

gateway:
  hostName: ""
  tls: false

  kind: Deployment

secrets:
  APISecret: LALALA
  OrgID: NANANA

buraksekili commented 1 year ago

I am not able to reproduce it on my setup. Here is the result that I obtained:

[operator-debugging] - 20:42:06
POST
[operator-debugging] - 20:42:07
POST
[operator-debugging] - 20:42:08
POST
[operator-debugging] - 20:42:09
POST
[operator-debugging] - 20:42:10
POST
[operator-debugging] - 20:42:11
POST
[operator-debugging] - 20:42:12
POST
[operator-debugging] - 20:42:13
POST
[operator-debugging] - 20:42:14
POST
[operator-debugging] - 20:42:15
POST
[operator-debugging] - 20:42:16
POST
[operator-debugging] - 20:42:17
POST
[operator-debugging] - 20:42:18
POST
[operator-debugging] - 20:42:19
POST
[operator-debugging] - 20:42:20
POST
[operator-debugging] - 20:42:21
POST
[operator-debugging] - 20:42:22
POST
[operator-debugging] - 20:42:23
POST
[operator-debugging] - 20:42:24
POST
[operator-debugging] - 20:42:25
POST
[operator-debugging] - 20:42:26
POST
[operator-debugging] - 20:42:27
POST
[operator-debugging] - 20:42:28
POST
[operator-debugging] - 20:42:29
POST
[operator-debugging] - 20:42:30
POST
[operator-debugging] - 20:42:31
POST
[operator-debugging] - 20:42:33
POST
[operator-debugging] - 20:42:34
POST
[operator-debugging] - 20:42:35
POST
[operator-debugging] - 20:42:36
POST
[operator-debugging] - 20:42:37
POST
[operator-debugging] - 20:42:38
POST
[operator-debugging] - 20:42:39
POST
[operator-debugging] - 20:42:40
POST
[operator-debugging] - 20:42:41
POST
[operator-debugging] - 20:42:42
POST
[operator-debugging] - 20:42:43
POST
[operator-debugging] - 20:42:44
POST
[operator-debugging] - 20:42:45
POST
[operator-debugging] - 20:42:46
POST
[operator-debugging] - 20:42:47
POST
[operator-debugging] - 20:42:48
POST
[operator-debugging] - 20:42:49
POST
[operator-debugging] - 20:42:50
POST
[operator-debugging] - 20:42:51
POST
[operator-debugging] - 20:42:52
POST
[operator-debugging] - 20:42:53
POST
[operator-debugging] - 20:42:54
POST
[operator-debugging] - 20:42:55
POST
[operator-debugging] - 20:42:56
POST
[operator-debugging] - 20:42:57
POST

I tried your setup on my 3-node cluster, where Tyk Gateway is deployed as a Deployment.

$ kubectl get nodes
NAME                 STATUS   ROLES           AGE   VERSION
kind-control-plane   Ready    control-plane   22d   v1.24.0
kind-worker          Ready    <none>          22d   v1.24.0
kind-worker2         Ready    <none>          22d   v1.24.0
$ kubectl get all -n tyk
NAME                                        READY   STATUS    RESTARTS     AGE
pod/gateway-tyk-headless-64756b77c7-xbrfw   1/1     Running   0            23m
pod/redis-56f99d9868-xc6vs                  1/1     Running   1 (7d ago)   10d

NAME                                 TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)         AGE
service/gateway-svc-tyk-headless     NodePort    10.96.245.40    <none>        443:31267/TCP   23m
service/redis                        ClusterIP   10.96.128.151   <none>        6379/TCP        10d

NAME                                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/gateway-tyk-headless   1/1     1            1           23m
deployment.apps/redis                  1/1     1            1           10d

NAME                                              DESIRED   CURRENT   READY   AGE
replicaset.apps/gateway-tyk-headless-64756b77c7   1         1         1       23m
replicaset.apps/redis-56f99d9868                  1         1         1       10d

mkyc commented 1 year ago

For some magic reason it now works here as well, even though I haven't changed anything in my configuration since Friday. My question, @buraksekili: do you know whether I should perform some additional operation after the configuration is updated? Clear some cache, or do some other operation? It looks to me as if the operator didn't fully update the configuration, or something like that. The random behaviour might also be totally unrelated to the operator itself, but I'm trying to understand what might have caused my issue.

buraksekili commented 1 year ago

Usually, what I've seen is that a DaemonSet causes inconsistency across multiple Gateway replicas. Since the gateway is db-less, once the API is created on one of the gateway replicas, say replica A, the other replicas are unaware of it. As a result, if a request is handled by one of the unaware replicas, the result is not what you want to see. However, your Gateway is deployed as a Deployment, so that is not the case this time.
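
A quick way to rule out the multi-replica case is to count the running gateway pods. This is just a sketch: count_gateway_pods is a hypothetical helper, and the tyk namespace and the gateway-tyk-headless name prefix are taken from my cluster above, so they are assumptions for any other setup.

```shell
#!/bin/bash
# count_gateway_pods PREFIX (hypothetical helper): read the output of
# `kubectl get pods --no-headers` on stdin and print how many pods whose
# name starts with PREFIX are in the Running state.
count_gateway_pods() {
  local prefix="$1"
  awk -v p="$prefix" 'index($1, p) == 1 && $3 == "Running" { n++ } END { print n + 0 }'
}
```

For example, `kubectl get pods -n tyk --no-headers | count_gateway_pods gateway-tyk-headless` prints the number of running gateway pods; anything above 1 means requests can land on different replicas.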

do you know that I should perform some additional operation after configuration is updated? Clear some cache, or do other operation?

With Tyk Operator, you do not need to perform any additional steps. All the requirements will be handled by Tyk Operator.

Could you please check your Gateway and operator logs for any errors that occurred on Friday? After APIs are created on the Gateway, the Gateway must perform a hot reload, as described here. Maybe there was an error during hot reloading.
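
To narrow the logs down, something like the following could help. It is a rough sketch only: gateway_log_problems is a hypothetical helper, the error and reload patterns are guesses at what the relevant log lines contain, and the namespace and deployment name in the usage line come from my cluster, not yours.

```shell
#!/bin/bash
# gateway_log_problems (hypothetical helper): read log lines on stdin and
# print only those mentioning an error or a reload, case-insensitively.
# The patterns are assumptions about the log wording, not a documented format.
gateway_log_problems() {
  grep -iE 'error|reload' || true   # no match is not a failure here
}
```

Usage, assuming the names from my cluster: `kubectl logs -n tyk deployment/gateway-tyk-headless --since=96h | gateway_log_problems`. The same filter can be applied to the operator pod's logs.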