oilshell / oil

Oils is our upgrade path from bash to a better language and runtime. It's also for Python and JavaScript users who avoid shell!
http://www.oilshell.org/
Other
2.78k stars 150 forks source link

[mycpp] add some CFG tests for examples #1977

Closed melvinw closed 1 month ago

melvinw commented 1 month ago

This commit instruments cppgen_pass.py to emit a control flow graph and adds some tests for a selection of the trickier examples. It also refactors ControlFlowGraph block handling some more and adds better break/continue helpers for loops.

andychu commented 1 month ago

Also it would be nice to have some technical design info about the control flow graph, either in some docstrings or a separate markdown file

Is there one "obvious" kind of control flow graph? Or are there are different ones that are appropriate for different analyses?

What properties does the graph obey?

I have not worked with them very much -- I only refactored the Python bytecode generator in opy/. There I think it is mostly used to desugar into jumps ?? The bytecode only has jumps, but the source code has for while if ternary-if, etc.

I am not sure there is any analysis or optimization there based on the graph, though I Haven't looked at that code in a long time

melvinw commented 1 month ago

Hm my first thought is - can this be its own pass? like mycpp/control_flow_pass.py which is like const_pass.py

Ha 😅 The first rough version of this PR (I never pushed it) had this in a separate pass, so it definitely can be. I just moved it into cppgen pass in this version to avoid having to maintain two copies of the switch case stuff (which you called out below). We can just pull that logic into mycpp.util so we can share it, though.

It doesn't seem like this needs to do anything with expressions? which suggests it might be a standalone pass --

This doesn't currently deal with expressions, but it will once we start adding the Fact annotations. However, I think that will also be OK to have in a separate pass.

Although ternary expressions are technically control flow

Do we do anything about those, and is correctness affected if you don't handle x() if b else y() ?

I think the only time these matter for control flow is when they make function calls. That case will be handled when we add the function call Facts. They'll also matter when we handle assignment statements for the dataflow analysis bits, but those will be covered when we add the Facts too.

melvinw commented 1 month ago

Also it would be nice to have some technical design info about the control flow graph, either in some docstrings or a separate markdown file

Ah sure. I'll flesh out the doc string. If it starts to sprawl too much I'll move it to a separate file.

Is there one "obvious" kind of control flow graph? Or are there are different ones that are appropriate for different analyses?

What we're building here is a pretty fine-grained or "full flow graph" (according to Wikipedia). Every statement has a node. There are other flavors that group statements into blocks. Otherwise there isn't much variation I think? Anything interesting comes from either the semantics of the underlying language or the facts that you annotate the graph with when doing dataflow analysis (this is where you see lots of variation I imagine). I will include notes about this in the docs.

What properties does the graph obey?

Nothing special. It's just a directed graph with no islands. If you remove the back-edges generated by loops the remaining edges should form a DAG.

andychu commented 1 month ago

OK great, if we can have more pure functions in mycpp/util.py and fewer methods like _write_cases() then I think that will improve things

The visitor tends to "couple" everything together, when really there can be separate functions

I was thinking of mycpp.asdl, and we probably still want that long term, but separate functions sounds like a good idea

to avoid duplication but also reuse the same structure

melvinw commented 1 month ago

OK @andychu I split this back out into a separate pass and pulled the switch case collection into a shared util. I also added a few details to the docstring for ControlFlowGraph. Have another look?

melvinw commented 1 month ago

Thanks! Sorry you had to shuffle it around a lot, but I think this is much better

No sweat. I agree, it's much more obvious what this is doing now without all the other noise

andychu commented 1 month ago

Looks great, thanks!