scottrogowski / code2flow

Pretty good call graphs for dynamic languages
MIT License
3.98k stars, 295 forks

Request: distinguish functions with the same name but in different files #68

Open Darkbblue opened 2 years ago

Darkbblue commented 2 years ago

Hello! I tried code2flow on some Python projects and found that it cannot distinguish files with the same name in different directories, or functions with the same name in different files. It also becomes a mess when both cases occur at the same time.

I also looked into the source code. The first problem can be fixed by changing line 348 in engine.py to include the full path in the group name. The second one seems much more difficult. I guess that what you get from an ast Call node is only the function name, which is not enough to tell which function it is if the project has more than one function with that name. Maybe we could use import statements for this, but that would require relating the module name in each import statement to the full path of the corresponding file.
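To illustrate the idea, here is a minimal sketch using only Python's standard ast module. It is not code2flow's implementation. The caller source, the module names (secondary.file3, helpers), and the helper functions imported_names and resolve_calls are all hypothetical. It shows that a Call node carries only a bare name or an attribute chain, and that import statements can hint at which module a called name came from.

```python
import ast

# Hypothetical caller module; the module and function names are made up.
SOURCE = """
from secondary.file3 import func_same
import helpers

func_same()
helpers.run()
print("done")
"""


def imported_names(tree):
    """Map each locally bound name to the dotted module it was imported from."""
    origins = {}
    for node in ast.walk(tree):
        # Relative imports (level > 0) are ignored in this sketch.
        if isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            for alias in node.names:
                origins[alias.asname or alias.name] = node.module
        elif isinstance(node, ast.Import):
            for alias in node.names:
                if alias.asname:
                    # "import a.b as c" binds c to module a.b
                    origins[alias.asname] = alias.name
                else:
                    # "import a.b" binds only the top-level name a
                    top = alias.name.split(".")[0]
                    origins[top] = top
    return origins


def resolve_calls(source):
    """Return {call text: originating module, or None if no import explains it}."""
    tree = ast.parse(source)
    origins = imported_names(tree)
    resolved = {}
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if isinstance(func, ast.Name):
            # Bare call such as func_same(): only the name is available.
            resolved[func.id] = origins.get(func.id)
        elif isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
            # Call such as helpers.run(): look up the owner name instead.
            resolved[f"{func.value.id}.{func.attr}"] = origins.get(func.value.id)
    return resolved


if __name__ == "__main__":
    for call, module in resolve_calls(SOURCE).items():
        print(call, "->", module)
```

This only covers names bound by top-level absolute imports. Names rebound at runtime, star imports, and relative imports would need extra handling, and the dotted module name still has to be matched to a file path.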

Maybe you can check this problem and see if there's another solution? I think this is not a rare case so a solution is needed.

scottrogowski commented 2 years ago

Could you provide some example source? Ideally simplified?

My guess is that this is expected behavior and falls under

"Functions with identical names in different namespaces are (loudly) skipped. E.g. If you have two classes with identically named methods, code2flow cannot distinguish between these and skips them."

but willing to investigate

scottrogowski commented 2 years ago

The reasoning is that it is impossible to statically analyze a dynamic language and know a value's type before runtime. So it is theoretically impossible to determine, when you are calling a.func(), whether func is Foo.func() or Bar.func().
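A small sketch of the ambiguity described here, reusing the Foo, Bar, and a.func() names from the comment. The pick and call helpers are hypothetical and exist only for illustration. The call site a.func() is identical in both cases; which method runs depends on a value known only at runtime.

```python
class Foo:
    def func(self):
        return "Foo.func"


class Bar:
    def func(self):
        return "Bar.func"


def pick(flag):
    # The class is chosen from a runtime value, so the source text alone
    # does not say which type the caller will receive.
    return Foo() if flag else Bar()


def call(flag):
    a = pick(flag)
    # A static tool sees only "a.func()" here, with no type attached to a.
    return a.func()


if __name__ == "__main__":
    print(call(True))
    print(call(False))
```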

Darkbblue commented 2 years ago

I've uploaded two examples to my forked repo (test/ and test2/): https://github.com/Darkbblue/code2flow/tree/master/code2flow
I also tried appending the full path to groups (files), subgroups, and function definitions. It seemed to work, like this:
Code2Flow: Found groups ['File: code2flow/test2/file1', 'File: code2flow/test2/secondary/file2', 'File: code2flow/test2/secondary/file3'].
Code2Flow: Found nodes ['code2flow/test2/file1.(global)', 'code2flow/test2/file1.func_same', 'code2flow/test2/secondary/file2.(global)', 'code2flow/test2/secondary/file2.func', 'code2flow/test2/secondary/file3.(global)', 'code2flow/test2/secondary/file3.func_same'].
Code2Flow: Found calls ['func()', 'func_same()', 'print()'].
Code2Flow: Found variables ['func->UNKNOWN_MODULE', 'func_same->UNKNOWN_MODULE'].
Code2Flow: No functions found! Most likely, your file(s) do not have functions that call each other. Note that to generate a flowchart, you need to have both the function calls and the function definitions. Or, you might be excluding too many with --exclude-* / --include-* / --target-function arguments.

But as you can see, I didn't do the same thing for function calls, and I don't know how.
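One possible way to give calls the same path-style names as the nodes is sketched below. It assumes that each called name has already been traced to a dotted module name through an import statement, and that every module lives under a single project root. Both are simplifications. The helpers module_to_node_prefix and qualify_call are hypothetical and are not part of code2flow, and the import mapping passed in is an assumed example rather than the actual contents of the test files.

```python
from pathlib import PurePosixPath


def module_to_node_prefix(module, root):
    """Turn a dotted module name into a path-style prefix under root."""
    return str(PurePosixPath(root, *module.split(".")))


def qualify_call(name, origins, root):
    """Build a path-qualified name for a call, or None if its origin is unknown.

    origins maps a called name to the dotted module it was imported from.
    """
    module = origins.get(name)
    if module is None:
        # Builtins and names with no matching import stay unresolved.
        return None
    return f"{module_to_node_prefix(module, root)}.{name}"


if __name__ == "__main__":
    # Assumed mapping, as if the caller had "from secondary.file3 import func_same".
    origins = {"func_same": "secondary.file3"}
    print(qualify_call("func_same", origins, "code2flow/test2"))
    print(qualify_call("print", origins, "code2flow/test2"))
```

A qualified call name built this way has the same shape as the node names in the log, so the two could be compared directly. Real projects would also need to handle packages, relative imports, and sys.path entries.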

lofidevops commented 1 year ago

FWIW, in my testing pyan3 manages to distinguish functions with the same name. I can't comment on implementation details. The most reliable fork of pyan3 I've found is https://github.com/maciejczyzewski/pyan although the PyPI version is maintained here: https://github.com/maciejczyzewski/pyan