nats-io / nats-server

High-Performance server for NATS.io, the cloud and edge native messaging system.
https://nats.io
Apache License 2.0

MQTT Memory Leaks on consumer connections #5471

Closed: slice-srinidhis closed this issue 3 months ago

slice-srinidhis commented 4 months ago

Observed behavior

We are seeing a consistent memory leak when MQTT consumers merely make connections to NATS. The pattern shows up when we scale the system to 5K connections, even though few messages are produced or consumed. consumer_inactive_threshold: 0.2s is also set, but the memory is not released; the only way to clean it up is to restart the pods (memory-usage screenshots from 22 and 24 April attached). From the heap pprof, createInternalClient appears to account for a large share of heap memory (pprof attached for reference): [heap_pprof_24-04.pdf](https://github.com/nats-io/nats-server/files/15458528/heap_pprof_24-04.pdf). Kindly look into the issue and let us know if there is any parameter tweaking that would help resolve it.

Expected behavior

Stable memory usage with no leaks for MQTT connections.

Server and client version

NATS Server version 2.10.14

Host environment

Kubernetes v1.25

Steps to reproduce

Set up NATS with MQTT enabled; the leak is reproducible at scale when clients make MQTT connections.

neilalexander commented 4 months ago

Can you please attach the /debug/pprof/allocs?debug=0 file itself instead of the PDF extract?
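If useful, one way to capture it (a sketch assuming the profiling endpoint is enabled via `prof_port` in the server config; the port number is an example):

```sh
# in nats-server.conf (example port):
#   prof_port: 65432
# then fetch the binary allocation profile from the running server:
curl -o allocs.pb.gz 'http://localhost:65432/debug/pprof/allocs?debug=0'
```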

levb commented 4 months ago

(apologies, posted a comment that was incorrect, deleted)

slice-srinidhis commented 4 months ago

@neilalexander Please find the pprofs attached: 28-05-05-19 was taken during the first iteration and 28-05-07-30 during the second, by which point the older memory had accumulated. I have also attached the memory graph for reference. 28-05-07-30.pb.gz

28-05-05-19.pb.gz

(memory graph screenshot from 28 May attached)

levb commented 4 months ago

@slice-srinidhis Do you know the details of the MQTT connections: clean or stored? If stored, can you please provide information about how many subscriptions there are in the sessions? Do you use MQTT retained messages?
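(For reference, with the Paho Go client used later in this thread, clean vs. persistent is controlled by the clean-session flag; a minimal sketch with a placeholder broker address and client ID:)

```go
package main

import (
	mqtt "github.com/eclipse/paho.mqtt.golang"
)

func main() {
	opts := mqtt.NewClientOptions().
		AddBroker("tcp://localhost:1883"). // placeholder broker address
		SetClientID("example-client").     // placeholder client ID
		SetCleanSession(true)              // true = clean session; false = broker keeps session state

	c := mqtt.NewClient(opts)
	if token := c.Connect(); token.Wait() && token.Error() != nil {
		panic(token.Error())
	}
	defer c.Disconnect(250) // allow 250 ms for in-flight work before closing
}
```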

slice-arpitkhatri commented 4 months ago

@levb

derekcollison commented 4 months ago

Any updates here?

neilalexander commented 4 months ago

@slice-arpitkhatri When the memory usage is quite high, can you please also supply the output of /debug/pprof/goroutine?debug=1?
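For example (a sketch, again assuming `prof_port` is enabled and using an example port):

```sh
curl -o goroutines.txt 'http://localhost:65432/debug/pprof/goroutine?debug=1'
```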

That output should contain account/asset names etc., so if you would rather send it privately than post it here, please email it to neil@nats.io. Thanks!

slice-arpitkhatri commented 4 months ago

@levb @neilalexander: please find the attached files.

pprof.goroutine.004.pb.gz pprof.goroutine.005.pb.gz pprof.goroutine.003.pb.gz pprof.goroutine.002.pb.gz pprof.goroutine.001.pb.gz

Let us know if you need anything else. Thanks!

levb commented 4 months ago

@slice-arpitkhatri @neilalexander Is it possible to get on a Zoom call, with access to your cluster, so we can gather more data together?

slice-arpitkhatri commented 4 months ago

Yes, sure. Could you please let me know what times work best for you?

levb commented 4 months ago

@slice-arpitkhatri @neilalexander I can do any time tomorrow June 6 after 6AM Pacific, or Friday any time after 5AM PDT.

levb commented 4 months ago

@slice-arpitkhatri Let's do Friday, June 7, any time that works for you. Please keep in mind that @neilalexander is in GMT; I can make it work on my end. Let us know. You can email me at lev@synadia.com to set up the call.

slice-arpitkhatri commented 4 months ago

Hi @levb @neilalexander, we have tried out the following suggestions that you guys proposed in the last meeting:

  1. Changed consumer_inactive_threshold to 10s.
  2. Tested with QoS 1 instead of QoS 2.

We've conducted performance tests for both cases and have not observed any meaningful change in memory consumption.

Attaching the memory graphs for the same:

levb commented 4 months ago

@slice-arpitkhatri What is the motivation to set a "low" inactivity threshold? ("Low" relative to the frequency of messages coming through). Since your clients use clean sessions, the consumers will normally be deleted automatically when the sessions disconnect. If a server cold-restarts and consumers are left undeleted, a considerably longer value (24hrs?) may be acceptable for the cleanup?

I have read through the history of the config option and the code over the weekend, and I am halfway through testing/investigating what happens to an MQTT session when its consumers go away "under the hood". There is definitely potential for it getting "confused", but I am not through with the code yet.

Please let us know if setting a "long" inactivity threshold helps avoid (or slow down) the leak.
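(For readers following along, the threshold being discussed is set in the mqtt block of the server configuration; a sketch with an illustrative long value:)

```
mqtt {
  port: 1883
  # JetStream consumers backing MQTT sessions are deleted after this
  # period of inactivity; the value here is illustrative only
  consumer_inactive_threshold: 24h
}
```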

slice-arpitkhatri commented 4 months ago

Hi @levb, we have changed consumer_inactive_threshold to 24 hours. I have attached the memory graph after making this change. However, we are still facing the memory leak issue.

(memory graph screenshot from 11 June attached)

slice-arpitkhatri commented 4 months ago

We have run our tests with different values of consumer_inactive_threshold (0.2s, 10s, 24h), and also after removing consumer_inactive_threshold from the config entirely. However, we have not observed any change in memory consumption.

neilalexander commented 4 months ago

Thanks for confirming, Arpit. Can you please provide updated memory profiles from a period when memory usage is high? Thanks!

derekcollison commented 4 months ago

@slice-arpitkhatri Which MQTT library are you using, and could you provide us with a small sample MQTT app that shows the behavior? At this point we would like to have a sample app and watch its interactions with the NATS system to help us track down any issues.

Thanks.

slice-arpitkhatri commented 4 months ago

@derekcollison In production, we are using the HiveMQ library for Android and the CocoaMQTT library for iOS. For performance testing, we are using Paho MQTT in Go. I have attached the sample app that we are using for performance testing.

perfConsumer.go.zip

++ @neilalexander @levb

derekcollison commented 4 months ago

And you can see the issue using the Go client, correct?

slice-arpitkhatri commented 4 months ago

We're encountering it regardless of the client, both in production (HiveMQ Kotlin & CocoaMQTT Swift) and during performance testing (Paho MQTT in Golang).

During performance testing, we are only using Paho MQTT in Golang (sample app shared above).

slice-arpitkhatri commented 4 months ago

> And you can see the issue using the Go client, correct?

To answer your question clearly, yes, we can see the issue using the Go client.

derekcollison commented 4 months ago

Thanks, and how is that performance testing conducted?

slice-arpitkhatri commented 4 months ago

We have 15 pods running, each establishing ~340 connections (totaling 5k connections, all non-durable, with random client IDs). They subscribe to "topic/{i}" where 0 < i < 340. These connections are terminated after one minute, at which point the sample app mentioned above spins up another set of 5k connections with different client IDs.

We're implementing this to simulate the production traffic pattern. Locally, we're running a producer script which publishes messages at 10 TPS to "topic/{i}", where i is a random integer between 0 and 340. Note that this is not 10 TPS per topic; it's 10 TPS collectively, essentially 10 TPS at the broker.
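A rough sketch of what a load generator following this pattern might look like with Paho in Go (broker address, client-ID scheme, and counts are placeholders; the attached perfConsumer.go is the actual app used):

```go
package main

import (
	"fmt"
	"time"

	mqtt "github.com/eclipse/paho.mqtt.golang"
)

func main() {
	const connsPerPod = 340 // ~340 connections per pod; 15 pods gives ~5k total
	clients := make([]mqtt.Client, 0, connsPerPod)

	for i := 0; i < connsPerPod; i++ {
		// non-durable connection with a pseudo-random client ID (placeholder scheme)
		clientID := fmt.Sprintf("perf-%d-%d", i, time.Now().UnixNano())
		opts := mqtt.NewClientOptions().
			AddBroker("tcp://nats-mqtt:1883"). // placeholder broker address
			SetClientID(clientID).
			SetCleanSession(true)

		c := mqtt.NewClient(opts)
		if t := c.Connect(); t.Wait() && t.Error() != nil {
			fmt.Println("connect error:", t.Error())
			continue
		}
		// each connection subscribes to topic/{i}; QoS 2 here (QoS 1 was also tested)
		if t := c.Subscribe(fmt.Sprintf("topic/%d", i), 2, nil); t.Wait() && t.Error() != nil {
			fmt.Println("subscribe error:", t.Error())
		}
		clients = append(clients, c)
	}

	// hold the connections for a minute, then drop them; the real test then
	// spins up another batch of ~5k connections with new client IDs
	time.Sleep(time.Minute)
	for _, c := range clients {
		c.Disconnect(250)
	}
}
```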

Please let me know in case you have any further queries. Thanks.

derekcollison commented 4 months ago

Thanks for the information, much appreciated.

derekcollison commented 3 months ago

@slice-srinidhis Thank you for your patience. We finally tracked it down and fixed it. The fix is on main and will be part of the 2.10.17 release.

slice-srinidhis commented 3 months ago

Thanks for the fix @derekcollison @neilalexander. We have deployed the latest release in production and are seeing nats_core_mem_bytes release memory and stop growing (graph 1). However, the container/pod memory keeps growing and is not released back to the system, although the growth is not as rapid as before (graph 2). Can you help us with any parameter that can be tuned so that the pod doesn't go OOM? We currently have GOGC set to 50. (Attached: 1 - NATS memory, 2 - pod memory.)
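(For context on the GOGC question: Go runtime knobs are plain environment variables on the nats-server process. The snippet below is purely illustrative with assumed values; GOMEMLIMIT is shown only as the related soft-memory-limit knob in the Go runtime, not as a confirmed fix for this issue:)

```sh
# illustrative only; values are assumptions, not a tested recommendation
GOGC=50 GOMEMLIMIT=1GiB nats-server -c nats-server.conf
```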