djacobs / PyAPNs

Python library for interacting with the Apple Push Notification service (APNs)
http://pypi.python.org/pypi/apns/
MIT License
1.22k stars 377 forks source link

remove __del__ from APNsConnection to prevent memory leak #138

Closed korhner closed 7 years ago

korhner commented 8 years ago

__del__ should not be implemented with circular references as it prevents those objects to be garbage collected. In enhanced mode, this method is never called and it does not allows garbage collector to clean the old objects causing memory to grow in long process applications

jimhorng commented 8 years ago

just for cross-reference: fix #137

alarin commented 8 years ago

and fix #131

ExplodingCabbage commented 8 years ago

I share your concern (mentioned in https://github.com/djacobs/PyAPNs/issues/137) that this will cause connections not to be closed properly.

I think the right way to handle this kind of situation in Python is to make the class a context manager and have users of the library create it using the with statement, but that would be a change to PyAPNs's public API that it would be preferable to avoid. So, is there a way that we can resolve all our objectives simultaneously:

  1. Don't change the library's public API, and
  2. Don't break automatic connection closure when APNsConnection objects get cleaned up by the garbage collector
  3. Fix the memory leak
  4. Fix exceptions being thrown when all objects get garbage collected during termination of the Python process

I think we can solve point 3 without hurting concerns 1 and 2 by modifying the lines where we assign new ErrorResponseHandlerWorkers to self._error_response_handler_worker to pass a weakref.proxy of the GatewayConnection to the ErrorResponseHandlerWorker instead of the GatewayConnection itself, like this:

self._error_response_handler_worker = self.ErrorResponseHandlerWorker(
    apns_connection=weakref.proxy(self)
)

I think that should prevent the circular reference, thus letting the reference count of a GatewayConnection drop to zero even in enhanced-mode, permitting such GatewayConnection objects to be garbage collected, fixing https://github.com/djacobs/PyAPNs/issues/137 without the unwanted side effect of breaking automatic connection closure.

That leaves https://github.com/djacobs/PyAPNs/issues/131. Perhaps the right way to address https://github.com/djacobs/PyAPNs/issues/131 is just to wrap the entire body of the __del__ method in a try:/except: block? Or perhaps we should be setting the global _logger as a self._logger property of every GatewayConnection we ever instantiate to delay its garbage collection? (No idea if this would work; my knowledge of the details of __del__ and Python garbage collection is weak.) Does anybody know how other libraries (or indeed Python internals) handle this problem (beyond avoiding __del__ wherever possible?

jimhorng commented 8 years ago

agree with @ExplodingCabbage , Very solid and detailed analysis and solution to the problems :)

I agree with explicit "clean up" public API for user in either apns.gateway_server.close() or context manager form, instead of relying __del__ to clean up and make it act only as safety net. It might also implies making #131 the logger exception a user error. :)

For memory leak caused by circular reference, I think your solution work, or maybe just removing __del__ to make gc happy?

In this approach, new user should be responsible for explicit close, if existing user not updating to use new explicit close, originally __del__ is not run when process is running which is similar with no __del__ method exists :( the difference might happen at when user stopping the process that will cause the origin __del__ to run, but I think the system will clear up the connections even there's no connection clean up in code in most cases. (I know its not good thing to shift cleaning to system's behavior, just analysis the actual impacts)

just some not-well-tested thoughts.

korhner commented 8 years ago

I like the idea of avoiding circular reference as it seems to give the most benefits for the lowest cost. However, I would go towards avoiding using __del__ whenever possible because:

That being said, I think the solution with keeping __del__ and removing circular reference is our best bet as it keeps backwards compatibility. Perhaps in a new major release you can reconsider and use context manager? I am busy right now to try this fix, could play with it in few weeks if you guys agree about this solution.