afilipchik opened this issue 9 years ago (status: Open)
Hello Alex,
we are also affected by Astyanax occasionally failing to close connections, especially when dealing with larger payloads over Thrift. This results in large TFramedTransport instances on the heap that never go away.
Thomas
@afilipchik Adding reaping is a non-trivial change, since it introduces a thread pool to manage (making sure it plays nice with containers, stand-alone apps, etc.).
I don't think it's a bad idea, but is there a more immediate fix, such as handling the specific connection error states correctly or, conversely, validating payloads more tightly, that would be easier to drop in?
Regardless, if you submit a patch (or several), I'll be happy to try them out as well, given that @opuneet is usually doing like 26 things simultaneously over there :)
I can do a PR.

> but is there a more immediate fix of noodling the specific connection error states correctly
That is the thing: all connections are live and happy. The number grows when load spikes and then stays there forever. Another option (and probably an easier one) would be to make the RoundRobin and TokenAware pools honor maxOperationsPerConnection().
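Honoring maxOperationsPerConnection() would mean recycling a connection after it has served a configured number of operations, so buffers accumulated by large payloads are released. A minimal sketch of that counting logic, using hypothetical names rather than Astyanax's actual classes:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: count operations per connection and signal when the
// pool should close and replace it instead of reusing it. The class and
// method names here are illustrative, not Astyanax's real API.
class CountingConnection {
    private final long maxOperations;                 // e.g. the pool's maxOperationsPerConnection() setting
    private final AtomicLong operations = new AtomicLong();
    private volatile boolean closed = false;

    CountingConnection(long maxOperations) {
        this.maxOperations = maxOperations;
    }

    /** Record one completed operation; returns true once the cap is reached. */
    boolean recordOperationAndCheck() {
        return operations.incrementAndGet() >= maxOperations;
    }

    void close() {
        closed = true;                                // would close the underlying Thrift transport
    }

    boolean isClosed() {
        return closed;
    }
}
```

On check-in, the pool would call `recordOperationAndCheck()` and, if it returns true, close the connection rather than return it to the idle queue.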
It is already in our prod and looks good :). The number of live connections went down from 2000 to 600 in one case and from 800 to 200 in another.
Hi, has this issue been fixed in any version of the Astyanax client library?
Given that Astyanax doesn't shrink its connection pools, and given https://issues.apache.org/jira/browse/THRIFT-1457, there is the possibility of a nasty memory leak. In our apps I saw pools consuming up to 8 GB of heap while actually doing nothing.
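The THRIFT-1457 issue referenced above describes a frame buffer that grows to fit the largest payload ever read but is never shrunk. A simplified illustration of that grow-only pattern (not the actual Thrift source):

```java
// Illustration of the grow-only buffer behavior behind THRIFT-1457:
// the buffer is enlarged to fit the biggest frame seen so far, but a
// later small frame never triggers a shrink, so one large response pins
// memory for the connection's lifetime.
class GrowOnlyFrameBuffer {
    private byte[] buffer = new byte[0];

    /** Return a buffer big enough for the next frame, growing if needed. */
    byte[] readFrame(int frameSize) {
        if (buffer.length < frameSize) {
            buffer = new byte[frameSize];   // grows...
        }
        return buffer;                      // ...but is never replaced by a smaller array
    }

    int capacity() {
        return buffer.length;
    }
}
```

Combined with a pool that never closes idle connections, each connection can end up permanently holding a buffer sized for its worst-case payload.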
The solution could be to automatically shrink the connection pool by closing old and idle connections. A naive implementation would touch:
- `Connection`
- `ThriftConnection`
- `SimpleHostConnectionPool`
- `AbstractHostPartitionConnectionPool`
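A minimal sketch of such a reaper, assuming a hypothetical pool that tracks a last-used timestamp per connection and periodically closes anything idle too long (names are illustrative, not Astyanax's actual classes; the daemon thread addresses the "plays nice with containers" concern raised above since it won't block JVM shutdown):

```java
import java.util.Iterator;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical idle-connection reaper: each checked-in connection records
// when it was last used; a scheduled task walks the idle queue and closes
// anything idle longer than maxIdleMillis.
class IdleReaper {
    static class PooledConnection {
        volatile long lastUsedMillis = System.currentTimeMillis();
        volatile boolean open = true;

        void close() {
            open = false;               // would close the underlying Thrift transport
        }
    }

    private final Queue<PooledConnection> idle = new ConcurrentLinkedQueue<>();
    private final long maxIdleMillis;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "connection-reaper");
                t.setDaemon(true);      // daemon thread: never blocks shutdown
                return t;
            });

    IdleReaper(long maxIdleMillis, long periodMillis) {
        this.maxIdleMillis = maxIdleMillis;
        scheduler.scheduleWithFixedDelay(
                this::reap, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    /** Return a connection to the idle queue, stamping its last-used time. */
    void checkIn(PooledConnection c) {
        c.lastUsedMillis = System.currentTimeMillis();
        idle.add(c);
    }

    /** Close and drop every connection idle longer than maxIdleMillis. */
    void reap() {
        long now = System.currentTimeMillis();
        for (Iterator<PooledConnection> it = idle.iterator(); it.hasNext(); ) {
            PooledConnection c = it.next();
            if (now - c.lastUsedMillis > maxIdleMillis) {
                c.close();
                it.remove();
            }
        }
    }

    int idleCount() {
        return idle.size();
    }
}
```

In a real patch the reap step would also respect a configured minimum pool size, so the pool never shrinks below the floor needed to absorb a fresh load spike.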