bug in gc.get_referrers() #35945

Closed 182980c4-e4f4-4e83-8e3e-662d412bbf71 closed 22 years ago

182980c4-e4f4-4e83-8e3e-662d412bbf71 commented 22 years ago
BPO 505453
Nosy @loewis

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:
```python
assignee = 'https://github.com/loewis'
closed_at =
created_at =
labels = ['interpreter-core']
title = 'bug in gc.get_referrers()'
updated_at =
user = 'https://bugs.python.org/zooko'
```

bugs.python.org fields:
```python
activity =
actor = 'loewis'
assignee = 'loewis'
closed = True
closed_date = None
closer = None
components = ['Interpreter Core']
creation =
creator = 'zooko'
dependencies = []
files = []
hgrepos = []
issue_num = 505453
keywords = []
message_count = 7.0
messages = ['8861', '8862', '8863', '8864', '8865', '8866', '8867']
nosy_count = 2.0
nosy_names = ['loewis', 'zooko']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue505453'
versions = ['Python 2.2']
```

182980c4-e4f4-4e83-8e3e-662d412bbf71 commented 22 years ago

`get_referrers()' can return objects which are already garbage but which are in cycles and haven't been collected yet.

That in itself is a bug, but there is something weirder: in my experiments it only returns *some* of those objects. Perhaps this is indicative of a deeper bug in gc.get_referrers()?

Here is a transcript showing how get_referrers() returns not one referrer (which would be correct), nor three (which would be the behavior if it included all the cyclic garbage), but two:

Python 2.2+ (#6, Jan 17 2002, 12:47:08) 
[GCC 3.0.3] on linux2
Type "help", "copyright", "credits" or "license" for
more information.
>>> class A:
...  def __init__(self):
...   self.x = 0
...
>>> a1=A()
>>> a1.x=1
>>> a2=A()
>>> a2.x=2
>>> a1.a=a2
>>> a2.a=a1
>>> a3=A()
>>> a3.x=3
>>> a1.a3=a3
>>> del a1, a2; refs = gc.get_referrers(a3)
>>> len(refs)
2
>>> refs[0]
{'a': <__main__.A instance at 0x8155d54>, 'x': 1, 'a3':
<__main__.A instance at 0x8156d6c>}
>>> refs[1]
{'A': <class __main__.A at 0x815759c>, 'a3':
<__main__.A instance at 0x8156d6c>, 'gc': <module 'gc'
(built-in)>, '__builtins__': <module '__builtin__'
(built-in)>, '__name__': '__main__', 'refs': [{'a':
<__main__.A instance at 0x8155d54>, 'x': 1, 'a3':
<__main__.A instance at 0x8156d6c>}, {...}], '__doc__':
None}

Then in the same session I start again and this time call `gc.collect()' before calling `gc.get_referrers()', yielding the expected results:

>>> del a3
>>> gc.collect()
0
>>> a1=A()
>>> a1.x=1
>>> a2=A()
>>> a2.x=2
>>> a1.a=a2
>>> a2.a=a1
>>> a3=A()
>>> a3.x=3
>>> a1.a3=a3
>>> del a1, a2; gc.collect(); refs = gc.get_referrers(a3)
4
>>> len(refs)
1
>>> refs[0]
{'A': <class __main__.A at 0x815759c>, 'a3':
<__main__.A instance at 0x8155bec>, 'gc': <module 'gc'
(built-in)>, '__builtins__': <module '__builtin__'
(built-in)>, '__name__': '__main__', 'refs': [{...}],
'__doc__': None}
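
For readers on a current CPython, the two transcripts above can be approximated with the minimal sketch below. The class `A` mirrors the transcript; `make_target`, `stale_referrers`, and `demo` are hypothetical helper names introduced only for illustration and are not part of the `gc` module. Depending on the interpreter version, the stale referrer may appear as the dead instance itself or as its attribute dictionary, so the helper accepts either. Automatic collection is switched off during the experiment on the assumption that an automatic pass could otherwise reclaim the cycle early.

```python
import gc


class A:
    pass


def make_target():
    """Return a3, which is referenced by a1; a1 and a2 form a cycle."""
    a1, a2, a3 = A(), A(), A()
    a1.a, a2.a = a2, a1   # a1 <-> a2 reference cycle
    a1.a3 = a3            # only a1 points at a3
    return a3             # a1 and a2 become unreachable on return


def stale_referrers(obj):
    """Count referrers that look like the dead a1 or its attribute dict."""
    count = 0
    for ref in gc.get_referrers(obj):
        if isinstance(ref, A) and ref is not obj:
            count += 1
        elif isinstance(ref, dict) and "a3" in ref and "a" in ref:
            count += 1
    return count


def demo():
    """Return (stale count before collection, stale count after)."""
    was_enabled = gc.isenabled()
    gc.disable()          # keep an automatic pass from collecting the cycle
    try:
        a3 = make_target()
        before = stale_referrers(a3)
        gc.collect()      # explicit collection removes the dead cycle
        after = stale_referrers(a3)
        return before, after
    finally:
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(demo())
```

The first number should be nonzero, because the uncollected cycle still references `a3`; the second should be zero once `gc.collect()` has run.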

If nobody else is more inclined to do it, then please let me know and I will investigate this bug.

Also note that I am submitting a patch for gcmodule.c which calls `collect_generations()' at the beginning of `get_referrers()'. I do not believe that patch should be allowed to close this bug, unless someone can explain the above anomaly.

Regards,

Zooko

--- zooko.com Security and Distributed Systems Engineering ---

182980c4-e4f4-4e83-8e3e-662d412bbf71 commented 22 years ago

Logged In: YES user_id=52562

By the way, you are welcome to add my account `zooko' to the techs list and assign the bug to me, in order to indicate that I should look into it. :-)

182980c4-e4f4-4e83-8e3e-662d412bbf71 commented 22 years ago

I submitted a patch: "[ bpo-505464 ] fix (??) bug in `gc.get_referrers()'" which eliminates the symptoms.

182980c4-e4f4-4e83-8e3e-662d412bbf71 commented 22 years ago

Whoops. The unexplained behavior is actually perfectly well explained -- only one of those objects *does* reference a3.

I believe that patch "[ bpo-505464 ] fix (??) bug in `gc.get_referrers()'" fixes this bug.

61337411-43fc-4a9c-b8d5-4060aede66d0 commented 22 years ago

Logged In: YES user_id=21627

I fail to see the bug. You may argue that the object is conceptually dead already. However:

a) there are different ways to bring it back to life: activating DEBUG_SAVEALL may bring such objects back to life, or they may have references originating from an object with a __del__ method, which are always put onto gc.garbage.
b) tracing all objects is desirable, since it helps to explain the reference count, and may help to detect bugs in extension modules.
c) always collecting is expensive.
d) if the application wants to, it can always initiate a collection itself.
e) initiating a collection may call back into Python code, thus changing many references in unforeseeable ways; this is undesirable if you have managed to bring the interpreter into a state where you want to analyse it, and then the interpreter messes up everything by invoking a collection.

Unless you bring forward arguments why your application cannot initiate a collection itself before invoking get_referrers, I'm tempted to close this report.
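
Point (a) can be illustrated on a current CPython with the hedged sketch below. It assumes the documented behavior of `gc.DEBUG_SAVEALL`, under which unreachable objects are appended to `gc.garbage` instead of being freed. `Node` and `resurrect_with_saveall` are hypothetical names used only for this example. Note that the sketch empties `gc.garbage` when it finishes, which would also discard any entries that were already there.

```python
import gc


class Node:
    pass


def resurrect_with_saveall():
    """Count Node instances that a collection saved in gc.garbage."""
    old_flags = gc.get_debug()
    was_enabled = gc.isenabled()
    gc.disable()
    gc.collect()                      # drop unrelated garbage first
    gc.set_debug(gc.DEBUG_SAVEALL)    # save unreachable objects, do not free
    try:
        n1, n2 = Node(), Node()
        n1.other, n2.other = n2, n1   # reference cycle
        del n1, n2                    # cycle is now unreachable
        gc.collect()
        # The "dead" nodes are reachable again through gc.garbage.
        return sum(1 for obj in gc.garbage if isinstance(obj, Node))
    finally:
        gc.set_debug(old_flags)
        del gc.garbage[:]             # assumption: nothing else needs the list
        if was_enabled:
            gc.enable()


if __name__ == "__main__":
    print(resurrect_with_saveall())
```

Both nodes of the cycle end up in `gc.garbage`, so objects that looked dead before the collection are reachable again afterwards.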

182980c4-e4f4-4e83-8e3e-662d412bbf71 commented 22 years ago

loewis:

Those are good points.

Hm.

The reason I consider it a bug is that it bit me. I was trying to analyze memory usage in my application, and I got some confusing results, and then finally I recognized one of the objects as a temporary that I had created and destroyed earlier.

Of course, if I had thought explicitly about the question of "does `get_referrers()' include dead but not collected cyclical objects?" I would have known the answer correctly, but I didn't think about it and just acted as though things that get de-linked (given that they had no __del__ methods) are effectively gone forever.

But you made very good points, so I suggest changing the documentation, something like this:

Index: dist/src/Doc/lib/libgc.tex
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/lib/libgc.tex,v
retrieving revision 1.9
diff -u -d -r1.9 libgc.tex
--- dist/src/Doc/lib/libgc.tex  2001/12/14 21:19:08     1.9
+++ dist/src/Doc/lib/libgc.tex  2002/01/19 10:40:31
@@ -83,6 +83,11 @@
 function will only locate those containers which support garbage
 collection; extension types which do refer to other objects but do not
 support garbage collection will not be found.
+
+Note that objects which have already been dereferenced, but which
+live in cycles and have not yet been collected by the garbage collector
+can be listed among the resulting referrers.  To get only currently live
+objects, call \function{collect()} before calling \function{get_referrers()}.
 \versionadded{2.2}
 \end{funcdesc}

61337411-43fc-4a9c-b8d5-4060aede66d0 commented 22 years ago

Thanks for the patch. Committed as libgc.tex 1.10; ACKS 1.158.