mgedmin / objgraph

Visually explore Python object graphs
http://mg.pov.lt/objgraph/
MIT License
753 stars 72 forks source link

show_growth prints incorrect results if peak_stats not passed #48

Closed jessehersch closed 4 years ago

jessehersch commented 4 years ago

the problem

show_growth() relies on the strange behavior of taking a mutable as the default value of an arg. That leads to a bug where incorrect results are returned. At least they are incorrect to my understanding. Could be my understanding is wrong though :)

See repro below.

objgraph version: 3.4.1

repro:

import objgraph

class Something:
    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c

def main():
    works()
    broken()

def works():
    # if pass peak_stats it works better:
    print('\n------------this works-------------')
    peak_stats = {}
    objgraph.show_growth(shortnames=False, peak_stats=peak_stats)
    peak_stats = objgraph.typestats(shortnames=False)

    x = [Something(i, i ** 2, str(i)) for i in range(1000)]
    print(f'\nshould have 1000 somethings now. first one is {x[0]}')
    objgraph.show_growth(shortnames=False, peak_stats=peak_stats)
    peak_stats = objgraph.typestats(shortnames=False)

    del x
    print(f'\nsomethings should be gone now:')
    objgraph.show_growth(shortnames=False, peak_stats=peak_stats)
    peak_stats = objgraph.typestats(shortnames=False)

    x = [Something(i, i ** 2, str(i)) for i in range(1000)]
    print(f'\nshould have 1000 somethings again, and we do. first one is {x[0]}')
    objgraph.show_growth(shortnames=False, peak_stats=peak_stats)
    peak_stats = objgraph.typestats(shortnames=False)

def broken():
    print('\n------------this is broken-------------')
    objgraph.show_growth(shortnames=False)
    x = [Something(i, i ** 2, str(i)) for i in range(1000)]
    print(f'\nshould have 1000 somethings now. first one is {x[0]}')
    objgraph.show_growth(shortnames=False)

    del x
    print(f'\nsomethings should be gone now:')
    objgraph.show_growth(shortnames=False)

    x = [Something(i, i ** 2, str(i)) for i in range(1000)]
    print(f'\nshould have 1000 somethings again, but we do not. first one is {x[0]}')
    #
    # the bug is provoked on this call.
    #
    # it should print a line with "__main__.Something     1000     +1000"
    #
    # instead it prints nothing. I think this is due to relying on the strange behavior
    # of specifying a mutable as a default value for an arg, in this case the peak_stats arg of show_growth()
    #
    objgraph.show_growth(shortnames=False)
    print(x[0])

if __name__ == "__main__":
    main()

which produces something like this:

------------this works-------------
builtins.function                       2261     +2261
builtins.tuple                          1885     +1885
builtins.dict                           1425     +1425
builtins.wrapper_descriptor              998      +998
builtins.weakref                         894      +894
builtins.set                             805      +805
builtins.method_descriptor               732      +732
builtins.builtin_function_or_method      702      +702
builtins.list                            551      +551
builtins.getset_descriptor               408      +408

should have 1000 somethings now. first one is <__main__.Something object at 0x7fdcef2726a0>
__main__.Something     1000     +1000
builtins.frame            7        +2
builtins.list           552        +1
builtins.cell            42        +1

somethings should be gone now:
builtins.frame        7        +2
builtins.cell        42        +1

should have 1000 somethings again, and we do. first one is <__main__.Something object at 0x7fdcee6986a0>
__main__.Something     1000     +1000
builtins.frame            7        +2
builtins.list           552        +1
builtins.cell            42        +1

------------this is broken-------------
builtins.function                       2261     +2261
builtins.tuple                          1864     +1864
builtins.dict                           1425     +1425
builtins.wrapper_descriptor              998      +998
builtins.weakref                         894      +894
builtins.set                             805      +805
builtins.method_descriptor               732      +732
builtins.builtin_function_or_method      702      +702
builtins.list                            551      +551
builtins.getset_descriptor               408      +408

should have 1000 somethings now. first one is <__main__.Something object at 0x7fdcedfabbe0>
__main__.Something     1000     +1000
builtins.list           552        +1

somethings should be gone now:

should have 1000 somethings again, but we do not. first one is <__main__.Something object at 0x7fdcedf3b8d0>
<__main__.Something object at 0x7fdcedf3b8d0>
mgedmin commented 4 years ago

This is actually working broken working as designed, because show_growth() shows

Show the increase in peak object counts since last call.

(emphasis mine)

If you change your broken() function to create 1002 somethings in the second list, you'll see

...
------------this is broken-------------
builtins.function                       1793     +1793
builtins.wrapper_descriptor             1039     +1039
builtins.dict                            929      +929
builtins.builtin_function_or_method      792      +792
builtins.method_descriptor               751      +751
builtins.weakref                         581      +581
builtins.tuple                           544      +544
builtins.getset_descriptor               372      +372
builtins.member_descriptor               303      +303
builtins.type                            145      +145

should have 1000 somethings now. first one is <__main__.Something object at 0x7f3afbb50110>
__main__.Something     1000     +1000
builtins.list           126        +1

somethings should be gone now:

should have 1000 somethings again, but we do not. first one is <__main__.Something object at 0x7f3afb9c8750>
__main__.Something     1002        +2
<__main__.Something object at 0x7f3afb9c8750>

IIRC I chose to do this because I was mostly interested in hunting memory leaks (objects that never go away), and objects where the counts go up and down all the time but never go beyond a certain fixed peak level used to produce too much noise in the output.

I wish I were rich and could hire a technical writer to make the documentation clearer...

jessehersch commented 4 years ago

understood. from my perspective showing the current counts is more useful, but I can implement that myself easily enough.

I didn't realize until now that it was showing peak counts, although I should have realized it since the param in growth() that takes the mutable as a default is called peak_stats :)

jessehersch commented 4 years ago

actually not sure I agree that is working correctly because once 1000 somethings are created again after all have been destroyed, then delta should be +1000 again. but it's not. it's simply missing. If that's by design, ok. that behavior is strange to me though.

mgedmin commented 4 years ago

The delta shown is always relative to the last peak value.