gak / pycallgraph

pycallgraph is a Python module that creates call graphs for Python programs.
GNU General Public License v2.0

Handling of "anonymous" code (think lambdas, generator expressions, etc.) #156

Open daviddelaharpegolden opened 8 years ago

daviddelaharpegolden commented 8 years ago

Python code objects (at least in Python 2.7) use special, non-unique co_name values for "unnamed" code chunks such as lambdas (<lambda>) and generator expressions (<genexpr>). Because pycallgraph's TraceProcessor uses co_name to build full_name, the output is easily wrong wherever such language features are used more than once in a module.

So, given, say:

def alpha(arg):
    print(arg)

def beta(arg):
    (lambda x: alpha(x))(arg)

def gamma(arg):
    alpha(arg)

def delta(arg):
    # two different lambdas on the same source line
    (lambda x: beta(x))(arg) ; (lambda x: gamma(x))("World!")

if __name__ == '__main__':
    delta("Hello,")

running pycallgraph -ng graphviz anoncg.py produces a graph like:

[Image: pycallgraph_lambdaconfused]

Note how it misleadingly looks like they're all one lambda.
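The name collision behind this can be reproduced without pycallgraph. A minimal sketch (the variable names are made up for illustration) showing that distinct lambdas and generator expressions carry the same co_name even though their code objects are different:

```python
# Two distinct lambdas and two distinct generator expressions.
first = lambda x: x + 1
second = lambda x: x * 2
gen_a = (n for n in range(3))
gen_b = (n * n for n in range(3))

# co_name is a fixed placeholder for each kind of anonymous code.
lambda_names = (first.__code__.co_name, second.__code__.co_name)
genexpr_names = (gen_a.gi_code.co_name, gen_b.gi_code.co_name)

# The underlying code objects are nevertheless different objects.
distinct_code = first.__code__ is not second.__code__

print(lambda_names)
print(genexpr_names)
print(distinct_code)
```

Any name built only from the module name and co_name will therefore merge all of these into a single node.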

As an immediate workaround, I tried appending the id() of the code object to the end of full_name (co_firstlineno might be nice to show so that people can identify which lambda is which, but by itself it can't fully disambiguate either, since two lambdas can share a line), i.e.

full_name = full_name + '[' + hex(id(code)) + ']'

That appears to be enough to distinguish the identically named but distinct lambda code objects and gives a more accurate and helpful graph, but the output is not entirely aesthetically pleasing as-is:

[Image: pycallgraph_withcodeid]
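One possible way to tidy this up, sketched below, is to decorate only the compiler-generated names (those starting with "<") and to include co_firstlineno for readability alongside id() for uniqueness. The display_name helper and the 'mod.' prefix are hypothetical and not part of pycallgraph; this is an illustration of the idea rather than a proposed patch.

```python
def display_name(full_name, code):
    """Hypothetical helper: add a suffix only to anonymous code objects."""
    if code.co_name.startswith('<'):
        # The line number helps a reader; id() guarantees distinct labels
        # for code objects that are alive at the same time.
        return '%s[line %d, %s]' % (
            full_name, code.co_firstlineno, hex(id(code)))
    return full_name


def named():
    pass


# Two different lambdas deliberately placed on one source line.
f = lambda x: x + 1; g = lambda x: x * 2

name_f = display_name('mod.<lambda>', f.__code__)
name_g = display_name('mod.<lambda>', g.__code__)
name_named = display_name('mod.named', named.__code__)

print(name_f)
print(name_g)
print(name_named)
```

Ordinary functions keep their plain names, so only the nodes that actually need disambiguating get the noisier label. The id() values change between runs, so the labels are not stable across separate traces.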