Kong / kong

🦍 The Cloud-Native API Gateway and AI Gateway.
https://konghq.com/install/#kong-community
Apache License 2.0

Memory leak due to reloading config in DB-less mode #6547

Closed Jorgevillada closed 3 years ago

Jorgevillada commented 3 years ago

Summary

I am currently running Kong in DB-less mode with kong-ingress-controller. Each Kong proxy has a memory limit of 1.5 GB of RAM (I am not sure whether that is enough). There are 3 replicas in total (I have also tried with 1 and with 2 replicas, with the same result).

I think it is related to https://github.com/Kong/kong/issues/6055. This change is not in master: https://github.com/Kong/kong/commit/0eb15e65e0290967341108e3052e69462f1f0b81

Memory usage always starts at 180-200 MB and increases steadily with every update from the ingress controller.

I have configured these parameters in kong-proxy.

- name: KONG_NGINX_MAIN_WORKER_RLIMIT_NOFILE
  value: '4096'
- name: KONG_NGINX_EVENTS_WORKER_CONNECTIONS
  value: '4096'
- name: KONG_WORKER_STATE_UPDATE_FREQUENCY
  value: '3600'
- name: KONG_WORKER_CONSISTENCY
  value: eventual

In ingress-controller I set the following. I don't know whether these settings take effect, but the behavior has improved:

- name: CONTROLLER_SYNC_PERIOD
  value: 3600s
- name: CONTROLLER_SYNC_RATE_LIMIT
  value: '0.9'

Now it takes 24 hours to reach the memory limit (before, it took 12 hours).

Steps To Reproduce

1. Run Kong in DB-less mode:

docker run --rm --name kong --net=host \
-e "KONG_ADMIN_ACCESS_LOG=/dev/stdout" \
-e "KONG_ADMIN_ERROR_LOG=/dev/stderr" \
-e "KONG_ADMIN_GUI_ACCESS_LOG=/dev/stdout" \
-e "KONG_ADMIN_GUI_ERROR_LOG=/dev/stderr" \
-e "KONG_ADMIN_LISTEN=127.0.0.1:8444 http2 ssl" \
-e "KONG_CLUSTER_LISTEN=off" \
-e "KONG_DATABASE=off" \
-e "KONG_KIC=on" \
-e "KONG_LUA_PACKAGE_PATH=/opt/?.lua;/opt/?/init.lua;;" \
-e "KONG_NGINX_WORKER_PROCESSES=1" \
-e "KONG_PLUGINS=bundled" \
-e "KONG_PORTAL_API_ACCESS_LOG=/dev/stdout" \
-e "KONG_PORTAL_API_ERROR_LOG=/dev/stderr" \
-e "KONG_PORT_MAPS=80:8000, 443:8443" \
-e "KONG_PROXY_ACCESS_LOG=/dev/stdout" \
-e "KONG_PROXY_ERROR_LOG=/dev/stderr" \
-e "KONG_PROXY_LISTEN=0.0.0.0:8000, 0.0.0.0:8443 http2 ssl" \
-e "KONG_STATUS_LISTEN=0.0.0.0:8100" \
-e "KONG_STREAM_LISTEN=off" \
-e "KONG_LOG_LEVEL=debug" \
-e "KONG_NGINX_DAEMON=off" \
-e "KONG_TRUSTED_IPS=0.0.0.0/0,::/0" \
kong:2.2.0 
  2. Create a kong.yaml with the content of this gist. (This configuration was taken from https://localhost:8444/config in production; only the usernames, API keys, namespaces, and domain were changed.)
  3. Run this command:
    for i in $(seq 1 1000)
    do
    curl -k --http1.1 -s -o /dev/null  -X POST 'https://localhost:8444/config' --form 'config=@kong.yaml'
    date && docker exec kong ps -o pid,rss,comm,args
    sleep 10
    done
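The loop above prints RSS in the compact form that the container's `ps` uses (`4`, `168m`, `1.6g`), which is awkward to compare or plot. The following is a minimal sketch of two hypothetical helpers, `rss_to_kib` and `worker_rss_kib` (neither is part of Kong or BusyBox), that normalise those values. It assumes a bare number is already in KiB, `m` means MiB, and `g` means GiB.

```sh
#!/bin/sh
# Convert a compact RSS value ("4", "168m", "1.6g") to KiB.
# Assumption: a bare number is KiB, "m" is MiB, "g" is GiB.
rss_to_kib() {
    awk -v v="$1" 'BEGIN {
        n = v + 0                      # numeric prefix, e.g. 1.6 from "1.6g"
        u = substr(v, length(v))       # last character, possibly a unit suffix
        if (u == "m")      n *= 1024
        else if (u == "g") n *= 1024 * 1024
        printf "%d\n", n + 0.5         # round to the nearest KiB
    }'
}

# Read `ps -o pid,rss,comm,args` output on stdin and print the RSS of
# every nginx worker process in KiB, one value per line.
worker_rss_kib() {
    awk '/nginx: worker process/ { print $2 }' | while read -r rss; do
        rss_to_kib "$rss"
    done
}
```

With these defined, the `ps` line in the loop could be piped through the helper, for example `docker exec kong ps -o pid,rss,comm,args | worker_rss_kib`, to log one number per reload.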

Result with an empty config:

date && docker exec kong ps -o pid,rss,comm,args

Thu 05 Nov 2020 02:37:41 PM -05
PID   RSS  COMMAND          COMMAND
    1  30m nginx            nginx: master process /usr/local/openresty/nginx/sbin/nginx -p /usr/local/kong -c nginx.conf
   21  86m nginx            nginx: worker process
   22    4 ps               ps -o pid,rss,comm,args

After the first request with the config:

Thu 05 Nov 2020 02:38:06 PM -05
PID   RSS  COMMAND          COMMAND
    1  30m nginx            nginx: master process /usr/local/openresty/nginx/sbin/nginx -p /usr/local/kong -c nginx.conf
   21 168m nginx            nginx: worker process
   27    4 ps               ps -o pid,rss,comm,args

One hour later:

Thu 05 Nov 2020 03:38:50 PM -05
PID   RSS  COMMAND          COMMAND
    1  29m nginx            nginx: master process /usr/local/openresty/nginx/sbin/nginx -p /usr/local/kong -c nginx.conf
   21 1.6g nginx            nginx: worker process
 1675    4 ps               ps -o pid,rss,comm,args
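
As a rough sanity check on the two samples above, the growth per reload can be estimated from the worker RSS values (168m and 1.6g) and the timestamps (02:38:06 and 03:38:50, i.e. 3644 seconds apart). This is only a sketch: `leak_estimate` is a hypothetical helper, `1.6g` is taken as 1.6 * 1024 MiB even though `ps` has rounded it, and each loop iteration is assumed to take at least the 10-second sleep, so the reload count is an upper bound and the per-reload figure a lower bound.

```sh
#!/bin/sh
# Usage: leak_estimate START_MIB END_MIB ELAPSED_SECONDS INTERVAL_SECONDS
# Prints an upper bound on the number of reloads and the implied
# lower bound on RSS growth per reload.
leak_estimate() {
    awk -v s="$1" -v e="$2" -v t="$3" -v i="$4" 'BEGIN {
        reloads = int(t / i)           # at most one reload per interval
        printf "reloads (upper bound): %d\n", reloads
        printf "growth: %.1f MiB, at least %.1f MiB per reload\n", e - s, (e - s) / reloads
    }'
}

# Values read from the two ps samples: 168 MiB, then 1.6 GiB, 3644 s later.
leak_estimate 168 1638.4 3644 10
```

Under these assumptions the worker gains on the order of 4 MiB per POST to /config, which is consistent with the limit being reached after a few hundred reloads.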

This is the only request I make; I do not send any requests to port 8000. Request to the status endpoint:

curl -k --http1.1 -vvv  'https://localhost:8444/status'

response: kong_status.json

"workers_lua_vms":[
         {
            "http_allocated_gc":"71.22 MiB",
            "pid":21
         }
      ]

Additional Details & Logs

bungle commented 3 years ago

Yes, I can reproduce this with latest next using:

for i in $(seq 1 1000)
do
  curl -k -s -o /dev/null --http1.1 -X POST 'https://localhost:8444/config' --form 'config=@kong_prod.yaml'
  ps -Ao pid,rss,comm,args | grep nginx
done

And I started my local Kong with:

KONG_NGINX_MAIN_WORKER_RLIMIT_NOFILE=4096 \
KONG_NGINX_EVENTS_WORKER_CONNECTIONS=4096 \
KONG_WORKER_STATE_UPDATE_FREQUENCY=3600 \
KONG_WORKER_CONSISTENCY=eventual \
KONG_ADMIN_LISTEN="127.0.0.1:8444 http2 ssl" \
KONG_PROXY_LISTEN="0.0.0.0:8000, 0.0.0.0:8443 http2 ssl" \
KONG_STATUS_LISTEN="0.0.0.0:8100" \
KONG_CLUSTER_LISTEN=off \
KONG_STREAM_LISTEN=off \
KONG_DATABASE=off \
KONG_NGINX_MAIN_WORKER_PROCESSES=1 \
KONG_TRUSTED_IPS="0.0.0.0/0,::/0" \
KONG_PLUGINS=bundled \
./bin/kong start

Initial investigation shows that the number of pending timers grows. This is probably a timer leak in the DNS client and/or the health checks.

bungle commented 3 years ago

The timer leak is gone if I remove the upstreams and targets from the YAML, but the memory leak still seems to be present even then.

bungle commented 3 years ago

I have found the place where this occurs, and this is the line: https://github.com/Kong/kong/blob/next/kong/db/schema/init.lua#L886

The starting point of the leak is this: https://github.com/Kong/kong/blob/next/kong/db/schema/others/declarative_config.lua#L628