bentsherman / codeflow

Flow graphs for your code

Add full Python language support #1

Closed bentsherman closed 2 years ago

bentsherman commented 2 years ago

The abstract grammar for Python is defined in the documentation for the ast module. We don't need to define a handler for every single node type, just the ones that are relevant to the flow graphs.

For example, we don't necessarily need to parse out every piece of a complex expression. It should be enough to put the entire expression in a single node. This raises a broader question of how fine-grained or coarse-grained the flow graphs should be.
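To illustrate the coarse-grained option, here is a minimal sketch that collapses a whole expression into a single label string. It assumes Python 3.9+ (for ast.unparse); the helper name expression_label and the sample source line are made up for this example and are not part of codeflow.

```python
import ast


def expression_label(expr):
    """Return the source text of an expression node as one label string.

    Hypothetical helper: the whole expression becomes a single graph
    node label instead of one node per sub-expression.
    """
    return ast.unparse(expr)


# Sample input, invented for illustration.
source = "total = price * (1 + tax_rate) + shipping(order)"
assign = ast.parse(source).body[0]

# The right-hand side is a nested BinOp/Call tree, but we keep it whole.
label = expression_label(assign.value)
print(label)
```

A finer-grained graph would instead descend into assign.value and create separate nodes for the call and each arithmetic operation.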

Fortunately, the AST walker can (for the most part) simply skip over nodes that it can't process, so we can gradually add support as we go.

bentsherman commented 2 years ago

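ControlFlowGraph is now a subclass of ast.NodeVisitor, so the default generic_visit traverses the entire AST. The exception is custom visitor methods: a custom method replaces the default for its node type, so it has to visit the child nodes itself (or call generic_visit) if traversal should continue beneath that node. We have implemented a visitor method for every statement type in the abstract grammar, and each one creates at least one control flow node for its statement.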
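A rough sketch of this pattern, not the actual ControlFlowGraph code: the class name StatementCollector, its node labels, and the sample function are all assumptions for illustration. It handles only a few statement types and relies on generic_visit for the rest, so statements nested under an unhandled type (While here) are still reached. Assumes Python 3.9+ for ast.unparse.

```python
import ast


class StatementCollector(ast.NodeVisitor):
    """Hypothetical stand-in that records one label per handled statement."""

    def __init__(self):
        self.labels = []

    def visit_FunctionDef(self, node):
        self.labels.append("def: " + node.name)
        # Custom visitor: we must descend into the body ourselves.
        for stmt in node.body:
            self.visit(stmt)

    def visit_If(self, node):
        # The condition is kept whole as part of the label.
        self.labels.append("if: " + ast.unparse(node.test))
        for stmt in node.body + node.orelse:
            self.visit(stmt)

    def visit_Assign(self, node):
        # No further descent: the expression stays embedded in the statement.
        self.labels.append("assign: " + ast.unparse(node))

    def visit_Return(self, node):
        value = ast.unparse(node.value) if node.value is not None else ""
        self.labels.append("return: " + value)


source = """
def f(x):
    if x > 0:
        y = 1
    else:
        y = 2
    while y < 10:
        y = y + 1
    return y
"""

collector = StatementCollector()
collector.visit(ast.parse(source))
for label in collector.labels:
    print(label)
```

There is no visit_While here, so the loop itself produces no label, but the assignment inside it is still recorded because generic_visit walks through the loop body.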

Most expression types don't need custom visitor methods because they are embedded in a statement one way or another. The only exception currently is Call, which adds caller/callee edges to the control flow graph. Any other custom visitor methods for expressions will serve a similar purpose, i.e. to add other types of edges.
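As a sketch of what a Call visitor could look like (the class CallEdgeCollector, the "<module>" scope name, and the edge representation are assumptions, not codeflow's implementation), the visitor below tracks the enclosing function and records a (caller, callee) pair for each call. Assumes Python 3.9+ for ast.unparse.

```python
import ast


class CallEdgeCollector(ast.NodeVisitor):
    """Hypothetical visitor that records (caller, callee) name pairs."""

    def __init__(self):
        self.edges = []
        self._scope = ["<module>"]  # stack of enclosing function names

    def visit_FunctionDef(self, node):
        self._scope.append(node.name)
        self.generic_visit(node)  # keep walking so calls in the body are seen
        self._scope.pop()

    def visit_Call(self, node):
        # Use the source text of the called expression as the callee name,
        # so attribute calls like obj.method are kept as written.
        callee = ast.unparse(node.func)
        self.edges.append((self._scope[-1], callee))
        self.generic_visit(node)  # calls nested in the arguments


source = """
def helper(x):
    return abs(x)

def main():
    return helper(len("abc"))

main()
"""

collector = CallEdgeCollector()
collector.visit(ast.parse(source))
for caller, callee in collector.edges:
    print(caller, "->", callee)
```

This resolves callees by name only; mapping a name back to the function definition it refers to (and handling async functions, methods, or aliases) would need extra logic.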

For the data flow graph, we will likely need much more custom logic in order to track inputs and outputs of each task.
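One possible starting point for inputs and outputs, offered only as a sketch: treat every name that a statement reads (Load context) as an input and every name it binds (Store context) as an output. The function statement_io and the sample statement are hypothetical and not taken from codeflow.

```python
import ast


def statement_io(stmt):
    """Return (inputs, outputs) as sets of names read and written by stmt.

    Hypothetical helper based only on ast.Name contexts.
    """
    inputs, outputs = set(), set()
    for node in ast.walk(stmt):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                outputs.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                inputs.add(node.id)
    return inputs, outputs


# Sample statement, invented for illustration.
stmt = ast.parse("total = price * qty + fee(order)").body[0]
inputs, outputs = statement_io(stmt)
print(sorted(inputs), sorted(outputs))
```

This is deliberately naive. Function names such as fee count as inputs, the target of an augmented assignment like x += 1 is seen only as an output, and attribute or subscript writes are not tracked at all, which is part of why the data flow graph needs more custom logic.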