SoftwareUnderstanding / inspect4py

Static code analysis package for Python repositories
https://inspect4py.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Does it use Tree Sitter AST? #426

Open smith-co opened 1 year ago

smith-co commented 1 year ago

Does it use Tree Sitter AST?

https://github.com/tree-sitter/tree-sitter

dgarijo commented 1 year ago

We do use the tree_sitter Python library.

smith-co commented 1 year ago

But I am quite confused, as the README mentions that it uses the Python ast module: https://docs.python.org/3/library/ast.html

So which is it?

If I give inspect4py a Tree-sitter AST, will it be able to generate, for example, the control flow?

dgarijo commented 1 year ago

I do not think we support passing ASTs to the tool. We used to have a control flow option based on cdmcfparser, but we ended up removing it because it requires active maintenance for newer Python versions.

smith-co commented 1 year ago

I am quite confused about what is supported for Python 3. According to the paper, we can extract:

Also, what do you use the tree_sitter Python library for? It appears that the tool also uses the ast module.

dgarijo commented 1 year ago

@smith-co, the paper was written in 2021. In the paper, we added a control flow graph serialization based on https://pypi.org/project/cdmcfparser/. We also allowed inspecting the control flow graph (built using CFGBuilder) as a PNG image. We used the ast library for processing the abstract syntax trees.

Unfortunately, there is no support for cdmcfparser beyond Python 3.9, and we dropped this option in release https://github.com/SoftwareUnderstanding/inspect4py/releases/tag/v0.0.3. If you want a control-flow representation, I invite you to test the first two releases of the tool.

About the call graph: for each file in the repo, we return all its functions. For each function, we return a list of the functions/methods that are being invoked, as shown in the example below: [image]
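To make that output shape concrete, here is a minimal sketch built on the standard ast module. It is not inspect4py's actual code: the helper names `callee_name` and `list_calls_per_function` and the returned dictionary layout are assumptions made for illustration only.

```python
import ast


def callee_name(node: ast.AST) -> str:
    """Best-effort dotted name for the callee of a call.

    Returns '' when the callee is not a plain name or attribute
    (for example, the result of another call).
    """
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = callee_name(node.value)
        # Fall back to the bare attribute when the base has no simple name.
        return f"{base}.{node.attr}" if base else node.attr
    return ""


def list_calls_per_function(source: str) -> dict:
    """Map each function name in `source` to the names it invokes.

    Hypothetical helper: calls inside nested functions are also
    attributed to the enclosing function, and duplicates are dropped.
    """
    tree = ast.parse(source)
    result = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            calls = []
            for child in ast.walk(node):
                if isinstance(child, ast.Call):
                    name = callee_name(child.func)
                    if name and name not in calls:
                        calls.append(name)
            result[node.name] = calls
    return result
```

Because `ast.walk` visits nodes breadth-first, the names in each list are not guaranteed to follow source order; sort them if a stable order matters.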

We also return the AST representation for each function, which is a recent addition using Tree-sitter. Why do we have both Tree-sitter and ast? Well, having both may have been an oversight on my end, as I tasked one of my students with serializing the AST and he did not reuse what we had already done inside the library. I opened an issue: https://github.com/SoftwareUnderstanding/inspect4py/issues/427 to look into this.
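For comparison, a per-function AST serialization can also be produced with the standard ast module alone. The sketch below deliberately swaps Tree-sitter for the stdlib, so its output format differs from what inspect4py returns; `function_asts` is a hypothetical helper, not part of the tool.

```python
import ast


def function_asts(source: str) -> dict:
    """Map each function name in `source` to a textual dump of its AST.

    Uses the stdlib ast module (an assumption for this sketch), not Tree-sitter.
    """
    tree = ast.parse(source)
    return {
        node.name: ast.dump(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
```

One trade-off worth noting: the ast module only parses code that is valid for the running Python version, whereas Tree-sitter uses its own grammar and does not depend on the interpreter.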