Kong / kong

🦍 The Cloud-Native API Gateway and AI Gateway.
https://konghq.com/install/#kong-community
Apache License 2.0

0.10.0rc3 admin API throwing error #2092

Closed: s4mur4i closed this issue 7 years ago

s4mur4i commented 7 years ago

Summary

I provisioned a single-pod Kong setup in Kubernetes. After a successful deployment, when trying to add an API I get the following error: {"message":"An unexpected error occurred"}

When trying from command line:

curl -i -X POST --url http://x.x.x.x:30007/apis/  --data 'name=mockbin'   --data 'upstream_url=http://mockbin.com/'   --data 'hosts=mockbin.com'
HTTP/1.1 500 Internal Server Error
Date: Wed, 15 Feb 2017 14:16:44 GMT
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Access-Control-Allow-Origin: *
Server: kong/0.10.0rc3

{"message":"An unexpected error occurred"} 

In the logs I do not see anything, and KONG_LOG_LEVEL is set to debug:

2017/02/15 14:16:39 [error] 94#0: *4931 [lua] postgres.lua:158: [postgres] could not cleanup TTLs: [toip() name lookup failed]: dns server error; 2 server failure, context: ngx.timer
2017/02/15 14:16:44 [error] 94#0: *4953 [lua] responses.lua:101: handler(): [toip() name lookup failed]: dns server error; 2 server failure, client: 172.20.53.128, server: kong_admin, request: "POST /apis/ HTTP/1.1", host: "x:30007"
2017/02/15 14:16:44 [info] 94#0: *4953 client 172.20.53.128 closed keepalive connection

DNS resolution within Kong is working; other services are resolved correctly within Kubernetes. The first line, as I saw in another issue, is nothing to worry about, and lines 2 and 3 don't really show anything relevant to me.

Steps To Reproduce

  1. Create a Kubernetes secret for the PostgreSQL username, password and database (see the example after step 3).
  2. Create a Postgres instance in Kubernetes:
    apiVersion: v1
    kind: Service
    metadata:
      name: postgres
    spec:
      ports:
      - name: pgql
        port: 5432
        targetPort: 5432
        protocol: TCP
      selector:
        app: postgres
    ---
    apiVersion: v1
    kind: ReplicationController
    metadata:
      name: postgres
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            app: postgres
        spec:
          containers:
            - name: postgres
              image: postgres:9.4
              env:
                - name: POSTGRES_USER
                  valueFrom:
                    secretKeyRef:
                      key: USERNAME
                      name: postgresql
                - name: POSTGRES_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      key: PASSWORD
                      name: postgresql
                - name: POSTGRES_DB
                  valueFrom:
                    secretKeyRef:
                      key: DATABASE
                      name: postgresql
                - name: PGDATA
                  value: /var/lib/postgresql/data/pgdata
              ports:
                - containerPort: 5432
              volumeMounts:
                - mountPath: /var/lib/postgresql/data
                  name: pg-data
          volumes:
            - name: pg-data
              emptyDir: {}
  3. Create Kong in Kubernetes:
    
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: kong-http
      name: kong-http
    spec:
      type: NodePort
      ports:
      - port: 8000
        targetPort: 8000
        nodePort: 30008
      selector:
        app: kong
    ---
    apiVersion: v1
    kind: Service
    metadata:
      labels:
        app: kong-admin
      name: kong-admin
    spec:
      type: NodePort
      ports:
      # (remaining port entries were not included in the report)
    ---
    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      labels:
        app: kong
      name: kong
    spec:
      replicas: 1
      template:
        metadata:
          labels:
            name: kong
            app: kong
        spec:
          containers:
          # (container spec was not included in the report)
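
For step 1, a sketch of how the secret might be created. The key names match the secretKeyRef entries in the Postgres manifest above; the user and database values are the ones shown later in the Kong debug output, and the password is a placeholder:

kubectl create secret generic postgresql \
  --from-literal=USERNAME=kong \
  --from-literal=PASSWORD=<password> \
  --from-literal=DATABASE=kong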

I also tested setting KONG_ADMIN_SSL to false, but that doesn't help.

Additional Details & Logs

shashiranjan84 commented 7 years ago

@s4mur4i Just to test, try setting KONG_PG_HOST to the FQDN of the postgres Service, postgres.default.svc.cluster.local.
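
A quick way to sanity-check that FQDN from inside the Kong pod before changing the env var (the pod name is a placeholder, and getent is assumed to be present in the image; kong.yaml stands for whatever file holds your Deployment):

# check that the postgres Service FQDN resolves from inside the Kong pod
kubectl exec <kong-pod> -- getent hosts postgres.default.svc.cluster.local

# if it resolves, set KONG_PG_HOST to that FQDN in the Deployment env and re-apply
kubectl apply -f kong.yaml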

s4mur4i commented 7 years ago

@shashiranjan84 I tried it, but it didn't help; for some reason the admin interface could not even be reached:

2017/02/15 20:18:34 [verbose] Kong: 0.10.0rc3
2017/02/15 20:18:34 [debug] ngx_lua: 10007
2017/02/15 20:18:34 [debug] nginx: 1011002
2017/02/15 20:18:34 [debug] Lua: LuaJIT 2.1.0-beta2
2017/02/15 20:18:34 [debug] PRNG seed: 757556596022
2017/02/15 20:18:34 [verbose] no config file found at /etc/kong/kong.conf
2017/02/15 20:18:34 [verbose] no config file found at /etc/kong.conf
2017/02/15 20:18:34 [verbose] no config file, skipping loading
2017/02/15 20:18:34 [debug] KONG_PG_PASSWORD ENV found with "******"
2017/02/15 20:18:34 [debug] KONG_LOG_LEVEL ENV found with "debug"
2017/02/15 20:18:34 [debug] KONG_PG_DATABASE ENV found with "kong"
2017/02/15 20:18:34 [debug] KONG_PG_HOST ENV found with "postgres.default.svc.cluster.local"
2017/02/15 20:18:34 [debug] KONG_PG_USER ENV found with "kong"
2017/02/15 20:18:34 [debug] KONG_DATABASE ENV found with "postgres"
2017/02/15 20:18:34 [debug] admin_listen = "0.0.0.0:8001"
2017/02/15 20:18:34 [debug] admin_listen_ssl = "0.0.0.0:8444"
2017/02/15 20:18:34 [debug] admin_ssl = true
2017/02/15 20:18:34 [debug] anonymous_reports = true
2017/02/15 20:18:34 [debug] cassandra_consistency = "ONE"
2017/02/15 20:18:34 [debug] cassandra_contact_points = {"127.0.0.1"}
2017/02/15 20:18:34 [debug] cassandra_data_centers = {"dc1:2","dc2:3"}
2017/02/15 20:18:34 [debug] cassandra_keyspace = "kong"
2017/02/15 20:18:34 [debug] cassandra_lb_policy = "RoundRobin"
2017/02/15 20:18:34 [debug] cassandra_port = 9042
2017/02/15 20:18:34 [debug] cassandra_repl_factor = 1
2017/02/15 20:18:34 [debug] cassandra_repl_strategy = "SimpleStrategy"
2017/02/15 20:18:34 [debug] cassandra_ssl = false
2017/02/15 20:18:34 [debug] cassandra_ssl_verify = false
2017/02/15 20:18:34 [debug] cassandra_timeout = 5000
2017/02/15 20:18:34 [debug] cassandra_username = "kong"
2017/02/15 20:18:34 [debug] cluster_listen = "0.0.0.0:7946"
2017/02/15 20:18:34 [debug] cluster_listen_rpc = "127.0.0.1:7373"
2017/02/15 20:18:34 [debug] cluster_profile = "wan"
2017/02/15 20:18:34 [debug] cluster_ttl_on_failure = 3600
2017/02/15 20:18:34 [debug] custom_plugins = {}
2017/02/15 20:18:34 [debug] database = "postgres"
2017/02/15 20:18:34 [debug] dns_hostsfile = "/etc/hosts"
2017/02/15 20:18:34 [debug] dns_resolver = {}
2017/02/15 20:18:34 [debug] log_level = "debug"
2017/02/15 20:18:34 [debug] lua_code_cache = "on"
2017/02/15 20:18:34 [debug] lua_package_cpath = ""
2017/02/15 20:18:34 [debug] lua_package_path = "?/init.lua;./kong/?.lua"
2017/02/15 20:18:34 [debug] lua_ssl_verify_depth = 1
2017/02/15 20:18:34 [debug] mem_cache_size = "128m"
2017/02/15 20:18:34 [debug] nginx_daemon = "on"
2017/02/15 20:18:34 [debug] nginx_optimizations = true
2017/02/15 20:18:34 [debug] nginx_worker_processes = "auto"
2017/02/15 20:18:34 [debug] pg_database = "kong"
2017/02/15 20:18:34 [debug] pg_host = "postgres.default.svc.cluster.local"
2017/02/15 20:18:34 [debug] pg_password = "******"
2017/02/15 20:18:34 [debug] pg_port = 5432
2017/02/15 20:18:34 [debug] pg_ssl = false
2017/02/15 20:18:34 [debug] pg_ssl_verify = false
2017/02/15 20:18:34 [debug] pg_user = "kong"
2017/02/15 20:18:34 [debug] prefix = "/usr/local/kong/"
2017/02/15 20:18:34 [debug] proxy_listen = "0.0.0.0:8000"
2017/02/15 20:18:34 [debug] proxy_listen_ssl = "0.0.0.0:8443"
2017/02/15 20:18:34 [debug] serf_path = "serf"
2017/02/15 20:18:34 [debug] ssl = true
2017/02/15 20:18:34 [debug] upstream_keepalive = 60
2017/02/15 20:18:34 [verbose] prefix in use: /usr/local/kong
2017/02/15 20:18:34 [debug] sending signal to pid at: /usr/local/kong/pids/nginx.pid
2017/02/15 20:18:34 [debug] kill -0 `cat /usr/local/kong/pids/nginx.pid` >/dev/null 2>&1

and in the error logs:

2017/02/15 20:16:30 [debug] 82#0: *5 [lua] init.lua:126: No API endpoints loaded for plugin: udp-log
2017/02/15 20:16:30 [debug] 82#0: *5 [lua] init.lua:126: No API endpoints loaded for plugin: cors
2017/02/15 20:16:30 [debug] 82#0: *5 [lua] init.lua:126: No API endpoints loaded for plugin: file-log
2017/02/15 20:16:30 [debug] 82#0: *5 [lua] init.lua:126: No API endpoints loaded for plugin: ip-restriction
2017/02/15 20:16:30 [debug] 82#0: *5 [lua] init.lua:126: No API endpoints loaded for plugin: datadog
2017/02/15 20:16:30 [debug] 82#0: *5 [lua] init.lua:126: No API endpoints loaded for plugin: request-size-limiting
2017/02/15 20:16:30 [debug] 82#0: *5 [lua] init.lua:126: No API endpoints loaded for plugin: bot-detection
2017/02/15 20:16:30 [debug] 82#0: *5 [lua] init.lua:126: No API endpoints loaded for plugin: aws-lambda
2017/02/15 20:16:30 [debug] 82#0: *5 [lua] init.lua:126: No API endpoints loaded for plugin: statsd
2017/02/15 20:16:30 [debug] 82#0: *5 [lua] init.lua:126: No API endpoints loaded for plugin: response-ratelimiting
2017/02/15 20:16:30 [debug] 82#0: *5 [lua] init.lua:123: Loading API endpoints for plugin: hmac-auth
2017/02/15 20:16:30 [notice] 82#0: signal 17 (SIGCHLD) received
2017/02/15 20:16:30 [info] 82#0: waitpid() failed (10: No child processes)
2017/02/15 20:16:30 [notice] 80#0: signal 17 (SIGCHLD) received
2017/02/15 20:16:30 [info] 80#0: waitpid() failed (10: No child processes)
2017/02/15 20:16:30 [notice] 81#0: signal 17 (SIGCHLD) received
2017/02/15 20:16:30 [info] 81#0: waitpid() failed (10: No child processes)
2017/02/15 20:16:30 [notice] 83#0: signal 17 (SIGCHLD) received
2017/02/15 20:16:30 [info] 83#0: waitpid() failed (10: No child processes)
shashiranjan84 commented 7 years ago

@s4mur4i I don't see anything unusual in the error log. Can you log into the Kong pod and check whether Kong is running?

run kong health
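
If you are not already shelled into the pod, the same check can be run through kubectl (the pod name is a placeholder):

kubectl exec <kong-pod> -- kong health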

s4mur4i commented 7 years ago

@shashiranjan84 Of course. Just for testing I ran some other commands as well:

# kong health 
serf........running
nginx.......running

Kong is healthy at /usr/local/kong

# kong cluster members
kong-1471082879-smbth_0.0.0.0:7946_61f6a89b065d4d029809ffba75adca87 127.0.0.1:7946  alive

# kong check /usr/local/kong/kong.conf 
configuration at /usr/local/kong/kong.conf is valid
shashiranjan84 commented 7 years ago

@s4mur4i So Kong is starting. Try to add the API from inside the pod; if you still get the DNS error, try the Kubernetes DNS resolver instead of Kong's default setup.
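
A rough sketch of that, assuming the cluster DNS runs as the kube-dns Service in kube-system: look up its cluster IP and hand it to Kong via the KONG_DNS_RESOLVER environment variable (which maps to Kong's dns_resolver setting) in the Deployment:

# find the cluster DNS Service IP
kubectl get svc kube-dns -n kube-system -o jsonpath='{.spec.clusterIP}'

# then add it to the Kong container's env in the Deployment, for example:
#   - name: KONG_DNS_RESOLVER
#     value: "10.96.0.10:53"   # replace with the IP printed above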

s4mur4i commented 7 years ago

@shashiranjan84 When inside the container I hit the API:

[root@kong-1471082879-smbth /]# curl -v -i -X POST --url http://localhost:8001/apis/  --data 'name=mockbin'   --data 'upstream_url=http://mockbin.com/'   --data 'hosts=mockbin.com'    
* About to connect() to localhost port 8001 (#0)
*   Trying ::1...
* Connection refused
*   Trying 127.0.0.1...
* Connected to localhost (127.0.0.1) port 8001 (#0)
> POST /apis/ HTTP/1.1
> User-Agent: curl/7.29.0
> Host: localhost:8001
> Accept: */*
> Content-Length: 63
> Content-Type: application/x-www-form-urlencoded
> 
* upload completely sent off: 63 out of 63 bytes

And it freezes. I left it for about ten minutes and nothing happened. There are no logs.

tail from /usr/local/kong/logs/error.log
2017/02/15 20:16:30 [notice] 82#0: signal 17 (SIGCHLD) received
2017/02/15 20:16:30 [info] 82#0: waitpid() failed (10: No child processes)
2017/02/15 20:16:30 [notice] 80#0: signal 17 (SIGCHLD) received
2017/02/15 20:16:30 [info] 80#0: waitpid() failed (10: No child processes)
2017/02/15 20:16:30 [notice] 81#0: signal 17 (SIGCHLD) received
2017/02/15 20:16:30 [info] 81#0: waitpid() failed (10: No child processes)
2017/02/15 20:16:30 [notice] 83#0: signal 17 (SIGCHLD) received
2017/02/15 20:16:30 [info] 83#0: waitpid() failed (10: No child processes)

[root@kong-1471082879-smbth /]# ls -lrt /usr/local/kong/logs/
total 28
-rw-r--r-- 1 root root     0 Feb 15 20:16 admin_access.log
-rw-r--r-- 1 root root     0 Feb 15 20:16 access.log
-rw-r--r-- 1 root root   368 Feb 15 20:16 serf.log
-rw-r--r-- 1 root root 21628 Feb 15 20:16 error.log
[root@kong-1471082879-smbth /]# date
Thu Feb 16 10:20:01 UTC 2017

[root@kong-1471082879-smbth /]# kong health
serf........running
nginx.......running

Kong is healthy at /usr/local/kong

Looking at dmesg I see:

[3091575.713281] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[3092239.703837] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[3092239.744846] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[3092442.675838] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[3092442.713885] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[3095131.647453] clocksource: Override clocksource tsc is not HRT compatible - cannot switch while in HRT/NOHZ mode
[3095179.418552] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[3095179.442148] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

I do not know whether this might be related or not.

shashiranjan84 commented 7 years ago

@s4mur4i I ran the stack last night and experienced the same issue with Kong 0.10.0rc3. It works fine with older versions. I will debug it with my team; stay tuned for updates.

s4mur4i commented 7 years ago

@shashiranjan84 Thanks for the update. If you need any further info, please let me know.

shashiranjan84 commented 7 years ago

@s4mur4i In the meantime, while we figure out the DNS issue, try using the following YAML file to provision Kong:

apiVersion: v1
kind: Service
metadata:
  name: kong-proxy
spec:
  type: LoadBalancer
  loadBalancerSourceRanges:
  - 0.0.0.0/0
  ports:
  - name: kong-proxy
    port: 8000
    targetPort: 8000
    protocol: TCP
  - name: kong-proxy-ssl
    port: 8443
    targetPort: 8443
    protocol: TCP
  selector:
    app: kong

---
apiVersion: v1
kind: Service
metadata:
  name: kong-admin
spec:
  type: LoadBalancer
  loadBalancerSourceRanges:
  - 0.0.0.0/0
  ports:
  - name: kong-admin
    port: 8001
    targetPort: 8001
    protocol: TCP
  selector:
    app: kong

---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kong-rc
spec:
  replicas: 3
  template:
    metadata:
      labels:
        name: kong-rc
        app: kong
    spec:
      containers:
      - name: kong
        image: mashape/kong:0.10.0rc3
        env:
          - name: KONG_PG_PASSWORD
            value: kong
          - name: KONG_PG_HOST
            value: $(POSTGRES_SERVICE_HOST)
          - name: KONG_HOST_IP
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: status.podIP
        command: [ "/bin/sh", "-c", "KONG_CLUSTER_ADVERTISE=$(KONG_HOST_IP):7946 KONG_NGINX_DAEMON='off' kong start && env" ]
        ports:
        - name: admin
          containerPort: 8001
          protocol: TCP
        - name: proxy
          containerPort: 8000
          protocol: TCP
        - name: proxy-ssl
          containerPort: 8443
          protocol: TCP
        - name: surf-tcp
          containerPort: 7946
          protocol: TCP
        - name: surf-udp
          containerPort: 7946
          protocol: UDP
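
If it helps, this is roughly how I would apply and smoke-test it (the filename is a placeholder, and <kong-admin-lb> stands for the external address of the kong-admin LoadBalancer Service; the curl call mirrors the one from the top of this issue):

kubectl apply -f kong-0.10.0rc3.yaml   # placeholder filename for the manifest above
kubectl get pods -l app=kong           # wait until the kong pods are Running
curl -i -X POST --url http://<kong-admin-lb>:8001/apis/ \
  --data 'name=mockbin' \
  --data 'upstream_url=http://mockbin.com/' \
  --data 'hosts=mockbin.com'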
s4mur4i commented 7 years ago

@shashiranjan84 Thanks for the update. I tested it and it works. So, to recap (if I got it correctly), the errors and workarounds:

shashiranjan84 commented 7 years ago

@s4mur4i It's the Postgres Service IP, a virtual IP (cluster IP) mapped to a collection of pods.

https://kubernetes.io/docs/user-guide/services/#environment-variables
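
For example, from inside a pod that was created after the postgres Service, the injected variables look roughly like this (the pod name and the IP value are only illustrative):

kubectl exec <kong-pod> -- env | grep POSTGRES_SERVICE
# POSTGRES_SERVICE_HOST=10.3.245.12   <- the postgres Service cluster IP
# POSTGRES_SERVICE_PORT=5432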

aeneasr commented 7 years ago

I think this issue also affects DNS resolution when setting upstream URLs that point to Services, such as upstream_url=http://my-service:1234, where

apiVersion: v1
kind: Service
metadata:
  name: my-service
  labels:
    name: my-service
spec:
  ports:
    - port: 1234
  selector:
    name: my-service

Is that possible? If so, how can I fix it?

shashiranjan84 commented 7 years ago

@arekkas Please try the latest Kong RC release. It should be able to resolve internal services.
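
For instance, assuming my-service from your manifest lives in the default namespace, something along these lines should work once Kong can use the cluster DNS (the hosts value and <kong-admin> address are only examples):

curl -i -X POST --url http://<kong-admin>:8001/apis/ \
  --data 'name=my-service' \
  --data 'upstream_url=http://my-service.default.svc.cluster.local:1234' \
  --data 'hosts=my-service.example.com'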

aeneasr commented 7 years ago

Thanks for the hint. Unfortunately I was using a forked image, so I simply downgraded to 0.9, which didn't need the hacks from above. I'm still seeing

2017/03/07 23:25:48 [error] 83#0: *113 myservice could not be resolved (2: Server failure), client: 10.44.0.1, server: kong, request: "GET /oauth2 HTTP/1.1", host: "104.199.56.105"

but it might very well be some other issue...

shashiranjan84 commented 7 years ago

@arekkas You can log into the container and dig/nslookup the service endpoint to make sure it resolves to an IP and is reachable. Also try using the FQDN for your service.
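
If dig/nslookup are not available in the Kong image, a throwaway busybox pod works for the same check (the service name and namespace below are examples):

kubectl run -i --tty --rm dns-test --image=busybox --restart=Never \
  -- nslookup my-service.default.svc.cluster.local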

aeneasr commented 7 years ago

@shashiranjan84 Thanks for the tip! I tried ping (which didn't resolve): kubectl exec kong-415161982-ghc72 -- ping myservice, as neither dig nor nslookup seems to be installed on the Kong image. I'll try to figure out the FQDN, or in general why this isn't working properly. Interestingly enough, I'm seeing the env var

MYSERVICE_SERVICE_HOST=10.47.247.249

Edit: for anyone looking, this Stack Overflow question helped!