Closed melvinw closed 1 month ago
Also it would be nice to have some technical design info about the control flow graph, either in some docstrings or a separate markdown file
Is there one "obvious" kind of control flow graph? Or are there are different ones that are appropriate for different analyses?
What properties does the graph obey?
I have not worked with them very much -- I only refactored the Python bytecode generator in opy/
. There I think it is mostly used to desugar into jumps ?? The bytecode only has jumps, but the source code has for while if ternary-if
, etc.
I am not sure there is any analysis or optimization there based on the graph, though I Haven't looked at that code in a long time
Hm my first thought is - can this be its own pass? like mycpp/control_flow_pass.py which is like const_pass.py
Ha 😅 The first rough version of this PR (I never pushed it) had this in a separate pass, so it definitely can be. I just moved it into cppgen pass in this version to avoid having to maintain two copies of the switch case stuff (which you called out below). We can just pull that logic into mycpp.util
so we can share it, though.
It doesn't seem like this needs to do anything with expressions? which suggests it might be a standalone pass --
This doesn't currently deal with expressions, but it will once we start adding the Fact
annotations. However, I think that will also be OK to have in a separate pass.
Although ternary expressions are technically control flow
Do we do anything about those, and is correctness affected if you don't handle
x() if b else y()
?
I think the only time these matter for control flow is when they make function calls. That case will be handled when we add the function call Fact
s. They'll also matter when we handle assignment statements for the dataflow analysis bits, but those will be covered when we add the Fact
s too.
Also it would be nice to have some technical design info about the control flow graph, either in some docstrings or a separate markdown file
Ah sure. I'll flesh out the doc string. If it starts to sprawl too much I'll move it to a separate file.
Is there one "obvious" kind of control flow graph? Or are there are different ones that are appropriate for different analyses?
What we're building here is a pretty fine-grained or "full flow graph" (according to Wikipedia). Every statement has a node. There are other flavors that group statements into blocks. Otherwise there isn't much variation I think? Anything interesting comes from either the semantics of the underlying language or the facts that you annotate the graph with when doing dataflow analysis (this is where you see lots of variation I imagine). I will include notes about this in the docs.
What properties does the graph obey?
Nothing special. It's just a directed graph with no islands. If you remove the back-edges generated by loops the remaining edges should form a DAG.
OK great, if we can have more pure functions in mycpp/util.py and fewer methods like _write_cases() then I think that will improve things
The visitor tends to "couple" everything together, when really there can be separate functions
I was thinking of mycpp.asdl
, and we probably still want that long term, but separate functions sounds like a good idea
to avoid duplication but also reuse the same structure
OK @andychu I split this back out into a separate pass and pulled the switch case collection into a shared util. I also added a few details to the docstring for ControlFlowGraph
. Have another look?
Thanks! Sorry you had to shuffle it around a lot, but I think this is much better
No sweat. I agree, it's much more obvious what this is doing now without all the other noise
Looks great, thanks!
This commit instruments cppgen_pass.py to emit a control flow graph and adds some tests for a selection of the trickier examples. It also refactors ControlFlowGraph block handling some more and adds better break/continue helpers for loops.