Closed dapollak closed 2 years ago
Hi @dapollak
If I recall correctly how close and the different classes work, then I do think that this is a bug.
Why doesn't Python's garbage collector find the circular references, though?
@sfc-gh-mkeller, thanks for the quick response!
As I understand it, the gc doesn't release objects involved in a reference cycle if they have a __del__ method.
@dapollak I tried reproducing your issue. On top of master,
I applied this small patch that prints a line whenever one of these objects is freed:
diff --git a/src/snowflake/connector/connection.py b/src/snowflake/connector/connection.py
index 15dff139..56a6067f 100644
--- a/src/snowflake/connector/connection.py
+++ b/src/snowflake/connector/connection.py
@@ -287,6 +287,7 @@ class SnowflakeConnection(object):
         self.incident = IncidentAPI(self._rest)
     def __del__(self):  # pragma: no cover
+        print(f"freeing {self}")
         try:
             self.close(retry=False)
         except Exception:
diff --git a/src/snowflake/connector/incident.py b/src/snowflake/connector/incident.py
index a800a273..38837579 100644
--- a/src/snowflake/connector/incident.py
+++ b/src/snowflake/connector/incident.py
@@ -108,6 +108,9 @@ class IncidentAPI(object):
     def __init__(self, rest):
         self._rest = rest
+    def __del__(self):
+        print(f"freeing {self}")
+
     def report_incident(
         self, incident=None, job_id=None, request_id=None, session_parameters=None
     ):
diff --git a/src/snowflake/connector/telemetry.py b/src/snowflake/connector/telemetry.py
index 5d833f96..c4172545 100644
--- a/src/snowflake/connector/telemetry.py
+++ b/src/snowflake/connector/telemetry.py
@@ -73,6 +73,9 @@ class TelemetryClient(object):
         self._lock = Lock()
         self._enabled = True
+    def __del__(self):
+        print(f"freeing {self}")
+
     def add_log_to_batch(self, telemetry_data: "TelemetryData") -> None:
         if self._is_closed:
             raise Exception("Attempted to add log when TelemetryClient is closed")
Then I ran the following script:
import gc
import snowflake.connector
conn = snowflake.connector.connect(
    ...
)
# conn._telemetry._rest = None
del conn
num = gc.collect()
print(num)
In this case, gc.collect() correctly finds all 3 objects and frees them. Is it possible that you've disabled reference cycle detection? I'll try to dig deeper into this in the meantime.
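A quick way to check whether reference cycle detection is active in a given process is to query the standard gc module. This is only a minimal sketch; the helper name cycle_detection_status is made up for illustration and is not part of the connector.

```python
import gc


def cycle_detection_status():
    """Report whether the cyclic garbage collector is active (hypothetical helper)."""
    return {
        # False if something called gc.disable() in this process
        "enabled": gc.isenabled(),
        # Per-generation collection thresholds; a first value of 0 disables
        # automatic collection
        "thresholds": gc.get_threshold(),
        # Non-zero debug flags (for example gc.DEBUG_SAVEALL) change what
        # happens to unreachable objects
        "debug_flags": gc.get_debug(),
    }


if __name__ == "__main__":
    print(cycle_detection_status())
```

Running this inside the affected process (for example from a debug endpoint of the server) would show whether automatic collection has been switched off somewhere.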
"As I understand, the gc doesn't release objects involved in a circular referencing if they have __del__ method."
From what I could find, this used to be true but hasn't been since PEP 442 (implemented in Python 3.4).
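The post-PEP 442 behaviour can be checked with a small standalone script. This is a sketch with a made-up Node class, unrelated to the connector's own classes: two objects that both define __del__ reference each other, and an explicit gc.collect() is expected to finalize both.

```python
import gc


class Node:
    """Hypothetical class with a finalizer, used only for this demonstration."""

    freed = []  # names of instances whose __del__ has run

    def __init__(self, name):
        self.name = name
        self.other = None

    def __del__(self):
        Node.freed.append(self.name)


def make_cycle():
    # a -> b -> a: once the local names go away, only the cycle keeps them alive
    a, b = Node("a"), Node("b")
    a.other, b.other = b, a


def collect_cycle():
    Node.freed.clear()
    make_cycle()
    gc.collect()  # explicit pass of the cyclic collector
    # Finalization order within a cycle is not guaranteed, so sort the names
    return sorted(Node.freed)


if __name__ == "__main__":
    print(collect_cycle())
```

On CPython 3.4 or later, both finalizers should have run after the collection, and neither object should end up in gc.garbage.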
@sfc-gh-mkeller I see.
Another thing I noticed that may prevent the connection object from being released is the HeartBeatTimer.
In SnowflakeConnection, you create a timer (inside self._add_heartbeat) whose callback is a bound method of SnowflakeConnection:
self.heartbeat_thread = HeartBeatTimer(
    self.client_session_keep_alive_heartbeat_frequency, self._heartbeat_tick
)
If the timer isn't closed properly, the reference to self._heartbeat_tick will prevent the gc from releasing the connection.
In HeartBeatTimer, under time_util.py, you override the run method of the built-in Timer:
def run(self) -> None:
    while not self.finished.is_set():
        self.finished.wait(self.interval)
        if not self.finished.is_set():
            try:
                self.function()
            except Exception as e:
                logger.debug("failed to heartbeat: %s", e)
but it seems you don't call self.finished.set() the way Python's built-in Timer does. Its run method looks like this:
def run(self):
    self.finished.wait(self.interval)
    if not self.finished.is_set():
        self.function(*self.args, **self.kwargs)
    self.finished.set()
Try creating the connection with client_session_keep_alive set. I guess it was created without it in your reproduction script (correct me if I'm wrong).
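The effect described above can be sketched in isolation. RepeatingTimer and FakeConnection below are hypothetical stand-ins, not the connector's HeartBeatTimer or SnowflakeConnection; the sketch only assumes that the timer repeats until it is cancelled and that its callback is a bound method of the owning object.

```python
import gc
import threading
import weakref


class RepeatingTimer(threading.Timer):
    """Hypothetical stand-in for a heartbeat timer: repeats until cancelled."""

    def run(self):
        while not self.finished.is_set():
            self.finished.wait(self.interval)
            if not self.finished.is_set():
                self.function()


class FakeConnection:
    """Hypothetical stand-in for a connection that owns a heartbeat timer."""

    def __init__(self):
        # The bound method self._tick keeps a strong reference to self
        self.timer = RepeatingTimer(3600, self._tick)
        self.timer.daemon = True  # do not block interpreter exit
        self.timer.start()

    def _tick(self):
        pass

    def close(self):
        self.timer.cancel()  # sets the `finished` event, which ends run()
        self.timer.join()


def survives_gc(close_first):
    """Return True if the connection is still alive after del + gc.collect()."""
    conn = FakeConnection()
    ref = weakref.ref(conn)
    if close_first:
        conn.close()
    del conn
    gc.collect()
    leaked = ref()
    if leaked is not None:
        leaked.close()  # stop the background thread before returning
        return True
    return False


if __name__ == "__main__":
    print("timer still running:", survives_gc(close_first=False))
    print("timer cancelled:", survives_gc(close_first=True))
```

While the timer thread is running, the threading module keeps the thread object alive, and the thread in turn holds the bound method and therefore the owning object, so the cyclic collector cannot reclaim it. Once the timer has been cancelled and joined, only the ordinary reference cycle between the two objects remains, which gc.collect() can handle.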
Okay, that's it. The following script reproduces the issue:
import gc
import snowflake.connector
conn = snowflake.connector.connect(
    ...
    client_session_keep_alive=True,
)
print(f"gonna delete {conn}")
del conn
print("going to force garbage collection")
num = gc.collect()
I'm going to fix this issue.
@dapollak could you please check whether 98eb923 (#1031) fixes your issue? It fixes my reproducer script, so hopefully it'll work for you as well.
@sfc-gh-mkeller it fixes the issue for me as well. Good job!
Please answer these questions before submitting your issue. Thanks!
What version of Python are you using?
3.8.5
What operating system and processor architecture are you using?
macOS-10.16-x86_64-i386-64bit
What are the component versions in the environment (pip freeze)?
Hey, I have a Flask server which communicates with a Snowflake database through SQLAlchemy and the Snowflake connector. Recently I've been seeing a memory leak (memory increases constantly after every HTTP request that performs a database operation). After digging in, I found that the SnowflakeConnection object is not being released, and a new one is created every other request.
I looked at the SnowflakeConnection code and saw that:
1. In SnowflakeConnection.__init__ (https://github.com/snowflakedb/snowflake-connector-python/blob/1659ec6b78930d1f947b4eff985c891af614d86c/src/snowflake/connector/connection.py#L286), you assign a SnowflakeRestful object to self._telemetry._rest and self.incident._rest.
2. SnowflakeRestful points back to self (SnowflakeConnection).
3. In self.close(), we don't put None in self._telemetry._rest and self.incident._rest, so there are circular references left in the SnowflakeConnection object which make it stay in memory.
It looks like when I break this circularity manually, the object is released successfully. Am I missing something, or is it really a bug?
Thanks!
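The reference cycle described in this report can be sketched with simplified stand-ins. Rest, Telemetry and Connection below are hypothetical classes, not the connector's code, and the close() shown here is an assumption based on the description above: it always drops the connection's own _rest reference and only optionally clears the telemetry client's copy.

```python
import gc
import weakref


class Rest:
    """Hypothetical stand-in for SnowflakeRestful."""

    def __init__(self, connection):
        self.connection = connection  # points back at the connection


class Telemetry:
    """Hypothetical stand-in for TelemetryClient."""

    def __init__(self, rest):
        self._rest = rest


class Connection:
    """Hypothetical stand-in for SnowflakeConnection."""

    def __init__(self):
        self._rest = Rest(self)
        self._telemetry = Telemetry(self._rest)

    def close(self, clear_telemetry_rest):
        # Assumption: close() drops the connection's own reference to _rest
        self._rest = None
        if clear_telemetry_rest:
            # Manually breaking the cycle, as described in the report
            self._telemetry._rest = None


def freed_without_gc(clear_telemetry_rest):
    """Return True if reference counting alone frees the connection."""
    gc.disable()  # rule out the cyclic collector for this check
    try:
        conn = Connection()
        ref = weakref.ref(conn)
        conn.close(clear_telemetry_rest)
        del conn
        return ref() is None
    finally:
        gc.enable()
        gc.collect()  # clean up any cycle left behind


if __name__ == "__main__":
    print("cycle broken in close():", freed_without_gc(True))
    print("cycle left in place:", freed_without_gc(False))
```

With the cycle broken, the object goes away as soon as its last reference is dropped. With the cycle left in place, it stays until the cyclic collector runs, which by itself would normally delay the release rather than prevent it.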