Leonardo-Ferreira opened this issue 5 months ago
@JorTurFer / @wozniakjan do you have bandwidth to check this one?
Thank you for reporting this; from the graphs this indeed looks like a memory leak. I can investigate later today.
After a quick investigation, this might actually be related to HTTP connection overhead, GC, and request caching rather than an actual memory leak. I am able to reproduce rapid memory growth with concurrent requests. When idle on a tiny cluster with a single HTTPScaledObject, the interceptor uses around 21Mi:
$ kubectl top pods -nkeda --selector app.kubernetes.io/component=interceptor
NAME CPU(cores) MEMORY(bytes)
keda-add-ons-http-interceptor-6f6d78bd9-rkn2r 4m 21Mi
With requests sent sequentially, the memory usage barely increases and then drops back to idle levels. With a simple benchmark using wrk, with 100 concurrent threads and 100 connections,
$ wrk -t100 -c100 -d30s -H "Host: myhost2.com" http://172.19.0.2
the memory usage does increase, peaking at 54Mi:
$ kubectl top pods -nkeda --selector app.kubernetes.io/component=interceptor
NAME CPU(cores) MEMORY(bytes)
keda-add-ons-http-interceptor-6f6d78bd9-rkn2r 494m 54Mi
But after the wrk benchmark terminates, the CPU usage drops quickly, and while the memory usage decreases rather slowly, both eventually return to roughly the initial idle levels. With 100 threads and 500 connections, the interceptor gets OOMKilled pretty fast with the default limit of 64Mi; after bumping the memory limit to 1Gi, the memory again seems quite elastic, growing as well as shrinking across benchmark runs.
Also attached is trace output from net/http/pprof captured during the wrk benchmark. The memory growth is apparent, but so is the memory decrease, and there doesn't appear to be any place in the code where the memory only grows. I will run some longer benchmarks with profiling traces to try to spot something sneakier.
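For anyone who wants to reproduce the profiling step: Go's net/http/pprof package, imported for its side effects, exposes the /debug/pprof/* endpoints. The snippet below is a generic sketch of that wiring, not the interceptor's actual setup, and the listen address is an arbitrary assumption.

// Generic sketch of exposing Go's pprof endpoints for heap/trace collection;
// the interceptor may wire profiling differently (or not expose it by default).
package main

import (
	"log"
	"net/http"
	_ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
)

func main() {
	// localhost:6060 is an arbitrary choice for this example.
	log.Println(http.ListenAndServe("localhost:6060", nil))
}

Heap profiles and execution traces can then be pulled with go tool pprof and go tool trace pointed at /debug/pprof/heap and /debug/pprof/trace.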
I may have noticed something: with a higher error rate in the benchmark, the memory consumption doesn't go down. I just bumped the memory limit for the interceptor to 1Gi and ran a benchmark that may have been too much, 5000 connections across 200 threads for 10 minutes:
$ wrk -t200 -c5000 -d10m -H "Host: myhost2.com" http://172.19.0.2
Running 10m test @ http://172.19.0.2
200 threads and 5000 connections
Thread Stats Avg Stdev Max +/- Stdev
Latency 688.19ms 506.07ms 2.00s 69.41%
Req/Sec 6.19 5.99 101.00 83.53%
351661 requests in 10.00m, 277.64MB read
Socket errors: connect 0, read 0, write 0, timeout 323052
Non-2xx or 3xx responses: 16769
Requests/sec: 586.00
Transfer/sec: 473.75KB
Out of ~350000 requests, 91% were timeouts and 4.7% errors. The memory usage is still at 531Mi, which is close to its ~600Mi peak during the benchmark, even though the interceptor has been idle for 10 minutes, and it doesn't seem to go down.
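One way to tell whether memory that lingers like this is freed-but-not-yet-returned Go heap or genuinely retained allocations would be to force a GC plus scavenge inside the interceptor process and watch whether cgroup usage drops. The following is a diagnostic sketch under the assumption that such a call could be triggered, e.g. behind a debug endpoint; it is not an existing feature of the add-on.

// Diagnostic sketch only: debug.FreeOSMemory forces a garbage collection and
// returns as much memory to the OS as possible. If kubectl top drops sharply
// afterwards, the lingering memory was heap the runtime had not yet given
// back; if it barely moves, something is still holding references.
package main

import (
	"fmt"
	"runtime/debug"
)

func main() {
	debug.FreeOSMemory()
	fmt.Println("forced GC + scavenge; re-check kubectl top / cgroup memory now")
}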
@wozniakjan my current number is around 15k requests per minute. I currently have 3 interceptor pods at 640 MB each... in about an hour and a half that number is expected to peak at 25k... during the late night, between 2am and 5am, the number drops to 5k...
@Leonardo-Ferreira I played around with benchmarks a bit more, and afaict the interceptor seems to be very CPU intensive: for wrk-simulated load matching a request rate of 25k/min, I am able to saturate 12 CPU cores easily. With that amount of CPU it is able to maintain a relatively low error rate of 0.42%.
With a low error rate, the memory usage appears to eventually return to idle levels, but when the error rate is high, the memory usage doesn't seem to go down even after the interceptor is taken off traffic.
Go's pprof heap analysis doesn't point to any code that would accumulate memory and not release it, but the heap as counted by pprof also appears to be just a fraction of the memory reported by cgroup accounting: pprof recorded 44MB while the cgroup shows 318MB used.
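Part of that gap is expected: a pprof heap profile only covers live heap objects, while the cgroup also charges goroutine stacks, runtime metadata, and freed pages the runtime has not yet returned to the OS. Below is a rough, generic sketch of comparing the two views from inside a Go process; the cgroup v2 path is an assumption about the node setup, and none of this is interceptor code.

// Rough comparison of Go runtime accounting vs cgroup accounting.
// HeapInuse is roughly what a pprof heap profile reports as in-use space,
// while Sys minus HeapReleased is closer to what the kernel charges the pod.
package main

import (
	"fmt"
	"os"
	"runtime"
	"strings"
)

func mib(b uint64) uint64 { return b / (1 << 20) }

func main() {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("HeapInuse:    %d MiB\n", mib(m.HeapInuse))
	fmt.Printf("HeapIdle:     %d MiB\n", mib(m.HeapIdle))
	fmt.Printf("HeapReleased: %d MiB\n", mib(m.HeapReleased))
	fmt.Printf("StackInuse:   %d MiB\n", mib(m.StackInuse))
	fmt.Printf("Sys:          %d MiB\n", mib(m.Sys))

	// cgroup v2; on cgroup v1 the file would be
	// /sys/fs/cgroup/memory/memory.usage_in_bytes instead.
	if b, err := os.ReadFile("/sys/fs/cgroup/memory.current"); err == nil {
		fmt.Printf("cgroup memory.current: %s bytes\n", strings.TrimSpace(string(b)))
	}
}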
@wozniakjan after looking deeply at my data, I could not correlate the memory usage or its rate of increase with the number of errors. Luckily (or not), we had an issue this morning where a dependency was flaky, so there was a significant spike in timeout-related errors; on the memory consumption graph there was no significant change during the hour-long event...
> simulated load that would match the request rate of 25k/min I am able to saturate 12 CPU cores easily
well, that's weird... our pods have a limit of 0.7
this is the last 24 hrs looking at the 95th percentile of CPU usage
> well, that's weird... our pods have a limit of 0.7
good to hear that, perhaps my artificial benchmark setup just ends up being CPU-intensive.
The current hypothesis I am trying to either prove or disprove is that when the interceptor itself fails to process requests and the error rate is higher, it may leave some resources dangling, e.g. open connections or entries in routing table caches.
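To illustrate the class of leak being hypothesized (a generic sketch, not the add-on's implementation; the upstream address is borrowed from the error logs later in this thread): a proxy whose transport has no idle-connection limits, or whose error paths skip draining and closing response bodies, keeps connections and their buffers pinned in memory in proportion to the error rate.

// Illustrative sketch only, not the interceptor's code. Two things that
// commonly keep memory pinned in a proxy under a high error rate: a transport
// with no idle-connection limits, and error paths that skip draining/closing
// response bodies, which blocks connection reuse and release.
package main

import (
	"io"
	"net/http"
	"time"
)

// newTransport sets explicit limits; without IdleConnTimeout and
// MaxIdleConnsPerHost, idle connections to a flaky upstream can accumulate.
func newTransport() *http.Transport {
	return &http.Transport{
		MaxIdleConns:        100,
		MaxIdleConnsPerHost: 10,
		IdleConnTimeout:     90 * time.Second,
	}
}

// fetch shows the cleanup that must also run on failure paths: the body is
// drained and closed so the underlying connection can be reused or freed.
func fetch(client *http.Client, url string) error {
	resp, err := client.Get(url)
	if err != nil {
		return err // no body to close when the request itself failed
	}
	defer resp.Body.Close()
	_, err = io.Copy(io.Discard, resp.Body)
	return err
}

func main() {
	client := &http.Client{Transport: newTransport(), Timeout: 5 * time.Second}
	_ = fetch(client, "http://demo.keda:8080/") // upstream address taken from the logs below
}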
would that output a specific log message that I could query for?
I can see plenty of these access logs in the interceptor:
2024-07-01T14:26:38Z INFO LoggingMiddleware 10.42.0.8:35634 - - [01/Jul/2024:14:26:36 +0000] "GET / HTTP/1.1" 502 11 "" ""
despite the fact that the scaled app always returns 200 OK.
Also, I can observe these errors in the interceptor logs:
2024-07-01T14:27:13Z ERROR LoggingMiddleware.RoutingMiddleware.CountingMiddleware.UpstreamHandler.StaticHandler Bad Gateway {"routingKey": "//myhost2.com/", "namespacedName": {"name":"demo","namespace":"keda"}, "stream": "http://demo.keda:8080/", "error": "context canceled"}
github.com/kedacore/http-add-on/interceptor/handler.(*Static).ServeHTTP
github.com/kedacore/http-add-on/interceptor/handler/static.go:36
github.com/kedacore/http-add-on/interceptor/handler.(*Upstream).ServeHTTP.func2
github.com/kedacore/http-add-on/interceptor/handler/upstream.go:54
net/http/httputil.(*ReverseProxy).ServeHTTP
net/http/httputil/reverseproxy.go:472
github.com/kedacore/http-add-on/interceptor/handler.(*Upstream).ServeHTTP
github.com/kedacore/http-add-on/interceptor/handler/upstream.go:57
main.runProxyServer.newForwardingHandler.func2
github.com/kedacore/http-add-on/interceptor/proxy_handlers.go:87
net/http.HandlerFunc.ServeHTTP
net/http/server.go:2166
github.com/kedacore/http-add-on/interceptor/middleware.(*Counting).ServeHTTP
github.com/kedacore/http-add-on/interceptor/middleware/counting.go:39
github.com/kedacore/http-add-on/interceptor/middleware.(*Routing).ServeHTTP
github.com/kedacore/http-add-on/interceptor/middleware/routing.go:93
github.com/kedacore/http-add-on/interceptor/middleware.(*Logging).ServeHTTP
github.com/kedacore/http-add-on/interceptor/middleware/logging.go:42
github.com/kedacore/http-add-on/interceptor/middleware.(*Metrics).ServeHTTP
github.com/kedacore/http-add-on/interceptor/middleware/metrics.go:24
net/http.serverHandler.ServeHTTP
net/http/server.go:3137
net/http.(*conn).serve
net/http/server.go:2039
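For context on why the access log shows 502 while the scaled app itself returns 200: the stack trace above ends inside httputil.ReverseProxy, whose error path runs when the client's request context is canceled (e.g. wrk or the caller giving up before the upstream answers). Below is a minimal, generic sketch of that mechanism with a custom ErrorHandler similar in spirit to the add-on's static Bad Gateway handler; it is not the actual interceptor code.

// Minimal sketch of how a reverse proxy turns a canceled client context into
// a 502: when the client disconnects or times out, the request context is
// canceled, the upstream round trip is aborted, and ErrorHandler is invoked
// with "context canceled" even though the upstream would have answered 200.
package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	upstream, err := url.Parse("http://demo.keda:8080/") // upstream from the logs above
	if err != nil {
		log.Fatal(err)
	}
	proxy := httputil.NewSingleHostReverseProxy(upstream)
	proxy.ErrorHandler = func(w http.ResponseWriter, r *http.Request, proxyErr error) {
		// proxyErr is "context canceled" when the client gave up first.
		log.Printf("proxy error for %s: %v", r.URL.Path, proxyErr)
		w.WriteHeader(http.StatusBadGateway)
	}
	log.Fatal(http.ListenAndServe(":8080", proxy))
}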
Should we log some extra info and release a 0.8.1 version? Unfortunately my security department does not allow me to use "unofficial" images in production.
I'm still in the process of trying to figure out the root cause, but as soon as I have a valid hypothesis to test, I can distribute a build for your testing.
Any news, @wozniakjan? Can I help somehow?
I didn't get very far and now I had to put it on hold temporarily, but I will get back to this soon
Hey @wozniakjan, I'd like to contribute here. Would you be willing to connect for like 30-45 min so you can "boost" me? That way I could make significant contributions faster.
Sure, you can ping me on KEDA Slack: https://keda.sh/community/ https://kubernetes.slack.com/team/UF5R8HHB7
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.
This issue has been automatically closed due to inactivity.
@zroubalik reopen?
Report
I think we have a memory leak. Here is the memory usage over 2 days:
The memory drops are the pod getting OOM killed. At first we thought 64Mi was too little, so we decided to experiment with higher values over the last 24 hrs, but the symptom persisted:
Expected Behavior
The pods should release memory as traffic fluctuates down and not get OOM killed.
Actual Behavior
Memory usage never decreases
Steps to Reproduce the Problem
Logs from KEDA HTTP operator
HTTP Add-on Version
0.8.0
Kubernetes Version
< 1.28
Platform
Microsoft Azure
Anything else?
AKS v1.27.9