usagitoneko97 / klara

Automatic test case generation for python and static analysis library
https://klara-py.readthedocs.io
Other
261 stars 15 forks source link

Multiple errors and confusions in the docs #6

Closed pfalcon closed 3 years ago

pfalcon commented 4 years ago

From formal, computer science point of view, docs included in the project contain multiple confusing, or just incorrect, statements.

Meta-issue trying to pinpoint the issues to help a novice reader, who may get misconceptions after reading these docs.

pfalcon commented 4 years ago

https://github.com/usagitoneko97/python-static-code-analysis/tree/master/cfg_and_ssa#1421-introduction states:

As the result B2 does not dominate any node.

By the definition, every node dominates itself. That's why the concept of "strict dominance" exist, which is explained in just the previous section: https://github.com/usagitoneko97/python-static-code-analysis/tree/master/cfg_and_ssa#141-terminology

pfalcon commented 4 years ago

https://github.com/usagitoneko97/python-static-code-analysis/tree/master/cfg_and_ssa#184-testing-for-complete-ssa-generation

                    a = 3           # 1st
                    if a > 3:       #  |
                        a = 3       # 2nd
                        a = 98
                    else:           # 3rd
                        z = 4       #  |
                    # expected phi func for 'a' here
                    y = a           # 4th
                    a = 4

The comment isn't exactly correct. If z is assigned in one branch of the if, but not the other, there should be phi for it too (in addition to a). The issue here is definitely related to the difference of strict vs non-strict SSA, and it seems that this project seems to be concerned with non-strict SSA, without explicitly stating though. Given that non-strict SSA is largely useless (without "a definition dominates every use", you can't perform many useful transformations), it's usually not fully defined in every aspect, so it's hard to say whether phi's for non-initialized variables should exist or may not exist. So, it's hard to say that it's outright error. It's definitely a noticeable confusion point. It's better to aim for strict SSA, where z is initialized with sentinel "undefined" value on entry, and all phi's are generated as required.

pfalcon commented 4 years ago

https://github.com/usagitoneko97/python-static-code-analysis/tree/master/cfg_and_ssa#144-dominance-frontier

Assume that DF of B5 needs to be found, it will iterate through both of the child, B6 and B8.

There's no B8 in that graph.

pfalcon commented 4 years ago

Ibidem:

Assume that DF of B5 needs to be found, it will iterate through both of the child, B6 and B8. Since B5 dominates both of them, they are not dominance frontier of B5

It's clear from the graph that B5 doesn't dominate B6. Thus, B6 will be the dominance frontier for B5.

pfalcon commented 4 years ago

https://github.com/usagitoneko97/python-static-code-analysis/tree/master/cfg_and_ssa#16-live-variable-analysis

Live variable analysis will shows the lifespan of each variable across multiple blocks in a network of complex block chain, that is, the variables that may live out of the block, or variables that may be potentially read before their next write, or variable that may get killed in a basic blocks.

Nice explanation. With "block chain" already in, can we get "big data" into this sentence too? ;-)

pfalcon commented 4 years ago

https://github.com/usagitoneko97/python-static-code-analysis/tree/master/cfg_and_ssa#171-trivial-ssa

1.7.1. Trivial SSA

Some call this "maximal SSA". Cooper/Torczon, p.497 (according to google books).