jendrikseipp / vulture

Find dead Python code
MIT License
3.38k stars 148 forks source link

Improve unreachable code analysis #302

Open kreathon opened 1 year ago

kreathon commented 1 year ago

Motivation

def f():
    try:
        return
    except:
        raise Exception()
    print("Unreachable")

At the moment, the print statement is not identified as unreachable.

Implementation

I added a data structure (no_fall_through_nodes) that stores ast nodes that do not allow a fall through. During the ast traversal continue, break, raise and return are added into the this data structure. For every control flow statement / object (for, try, while, with, but also module and functiondef), we check if any of the statement in the body is in no_fall_through_nodes. If that is the case, we report the next statement (if there is any) and add the current node also into no_fall_through_nodes. To make this work generic_visit (means visiting children) is now executed before visiting the current node.

The algorithm was added into the existing Vulture object with the generic_visit (which handles recursion). I think a cleaner implementation could implement a separate node visitor (that handles its own recursion), but I was not sure if the project is open for that.

The old error reporting message is reused (e.g. unreachable code after 'try'). I am not sure if this is the best message or if all unreachable messages should be simplified to something like unreachable code.

Limitations

Related Issue

This PR is addressing the issue https://github.com/jendrikseipp/vulture/issues/270.

Checklist:

codecov-commenter commented 1 year ago

Codecov Report

Merging #302 (a59eac4) into main (fc23f33) will increase coverage by 0.06%. Report is 13 commits behind head on main. The diff coverage is 100.00%.

:exclamation: Current head a59eac4 differs from pull request most recent head 8d9fcf4. Consider uploading reports for the commit 8d9fcf4 to get more accurate results

@@            Coverage Diff             @@
##             main     #302      +/-   ##
==========================================
+ Coverage   98.96%   99.03%   +0.06%     
==========================================
  Files          21       21              
  Lines         679      727      +48     
==========================================
+ Hits          672      720      +48     
  Misses          7        7              
Files Changed Coverage Δ
vulture/core.py 98.75% <100.00%> (+0.21%) :arrow_up:
vulture/whitelists/ast_whitelist.py 100.00% <100.00%> (ø)

... and 2 files with indirect coverage changes

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

kreathon commented 11 months ago

This goes in the direction of adding a second NodeVisitor class, but avoids traversing the AST twice. Or do you think a second NodeVisitor class (with all of these visitor_* functions) would be a better alternative?

I do not see a clear advantage of one of the solutions and it probably does not matter too much right now (if the reachability is in a second module it should be really easy to switch between them anyway).

I will update the code according to your proposed code organization.

kreathon commented 11 months ago

I am not sure if passing _define as report into Reachaiblity is the cleanest solution, but I did not want to create a new datastructure that passes the information back to the Vulture object. What do you think?

kreathon commented 11 months ago

I had a look at it again, and I think it is cleaner to also move the _handle_conditional_node into the reachablitly.py (such that the module handles the entire self.unreachable_code LoggingList data structure).

I will propose the change as soon as I find time for it.

kreathon commented 11 months ago

I updated

For example.

while True:
    raise Exception()
print(b)

I could handle this in another PR.