imperugo / StackExchange.Redis.Extensions

MIT License
Possible memory (or connection) leaking #311

Open sradzyniak opened 4 years ago

sradzyniak commented 4 years ago

We had this library integrated into a Windows service that serves HTTP requests. We noticed that the service started to consume a lot of RAM and CPU. Another symptom was that the Redis cluster (and the machine that hosts the service) ran out of TCP ports: all of them were taken by the service. Together this made the entire machine inoperable, because all of its resources (RAM, CPU, TCP ports) were taken by a single service.

To Reproduce: All the symptoms above appeared while the OPS team was maintaining the Redis cluster. For maintenance they need to take one node out of the cluster. During that period the service consumed a lot of RAM and CPU, and TCP ports on the active Redis nodes (as well as on the service machine) were exhausted. When the Redis node came back into the cluster, CPU usage settled down but RAM usage did not, and with each subsequent maintenance the situation became worse. We reproduced this locally by putting some load on the service and taking a secondary node in the Redis cluster up and down multiple times.

Expected behavior: The service should not consume such a large amount of resources while nodes in the Redis cluster are under maintenance. The symptoms above point to some kind of leak, since resource usage accumulates with each subsequent Redis cluster maintenance.

Desktop:

Additional context: To check whether the problem is in the StackExchange.Redis.Extensions library, we wrote a simple wrapper on top of the StackExchange.Redis library and could not reproduce any of the symptoms above.

imperugo commented 4 years ago

Hi @sradzyniak, thanks for the feedback. Unfortunately I'm unable to reproduce the issue. If you have a unit test that reproduces it, that would be amazing. If you give me more info on how to reproduce it, I'll take a look.

jesuissur commented 4 years ago

@sradzyniak We had this problem, in case it is also what you are seeing.

sradzyniak commented 4 years ago

@jesuissur Thank you for your comment! It does in fact look very similar to the situation we observed, but in our case the SSL options didn't matter. @imperugo I forgot to mention one important detail: these issues appeared after upgrading from version 5.x to 6.3.3. This seems to be a common precondition for our issue and the issue described by @jesuissur.

@imperugo Unfortunately it's very difficult to write a unit test for this, as reproducing it requires preconditions that we create manually. Please let me know if any other details are needed and I'll try to provide them.

imperugo commented 4 years ago

Please take a look at 6.3.5, which is on NuGet now.

Basically, I've removed the invalidate/reconnect part because the multiplexer does it automatically. Connection management is now like in the 5.x version. Let me know.

sradzyniak commented 4 years ago

@imperugo Thank you for the update! I'll try to test this sometime later. Currently we have a solution that works fine, and there is a lot of other stuff to handle. @jesuissur Could you please try the version mentioned by @imperugo and share your feedback? It seems we are experiencing the same problem, so your feedback will be really valuable.

adamhathcock commented 3 years ago

I'm seeing connection-leak behavior with 6.3.5.

I saw that a fix was attempted from 6.3.4 but nothing has changed. Not sure if it's the same as the TLS issue or what. Not sure what workarounds to try other than removing the pool.

imperugo commented 3 years ago

Hi, have you tried 7.0.1?