jendrikseipp / vulture

Find dead Python code
MIT License
3.38k stars 148 forks source link

False positive when enum is used in iteration #304

Open eltoder opened 1 year ago

eltoder commented 1 year ago

Enum classes allow iteration over the class, which returns all enum members. This is often convenient particularly when writing tests. Unfortunately, vulture does not understand this and reports enum members as unused. For example:

from enum import Enum

class E(Enum):
    A = 1
    B = 2

list(E)

produces these errors:

$ python -m vulture test_enum.py
test_enum.py:4: unused variable 'A' (60% confidence)
test_enum.py:5: unused variable 'B' (60% confidence)

This can be whitelisted, but it's a bit tedious when there are many enum members.

Some ideas on how this can be improved:

  1. It may be possible to detect some cases of iteration like list(E), tuple(E), a, b, c = E, for e in E and mark all members as used.
  2. For a work-around, adding a class-level comment or annotation that all members are used can make whitelisting much easier. This can help in other cases as well, for example this issue about dataclasses: #264.
  3. Maybe if the enum class is listed in __all__, all members can be considered used?
jendrikseipp commented 1 year ago

Thanks for the report! All proposed solutions have in common that they need to find out which variables are defined in the scope of the class. Since Vulture currently has no notion of scope, they are tricky to implement. If you see a solution, I'm all ears :-)

eltoder commented 1 year ago

@jendrikseipp That sounds like an even bigger problem :-) Adding scope information should improve the accuracy quite a bit. Python scoping rules are not very complicated.

jendrikseipp commented 1 year ago

There are some libraries that enrich the abstract syntax tree (on which Vulture operates) with scoping information, but I haven't looked deeper into them. I think that could be a good starting point.

anudaweerasinghe commented 1 year ago

Can I work on this?

jendrikseipp commented 1 year ago

Sure, happy to see progress here! It might be a good idea to discuss the high level approach though, before you spend time on implementing the details. If you have a high-level plan, I'm happy to discuss it.

anudaweerasinghe commented 1 year ago

@jendrikseipp looked into this a little more, and wanted to run the current implementation idea by you...

Progress thus far:

Based on these findings, here is a high level approach of the solution:

jendrikseipp commented 1 year ago

Yes, your explanation makes sense and the approach is correct, I'd say. (In the last step, you probably mean "used_variables".)

The only thing I'm unsure about is the AST library. Have you compared asttokens with https://github.com/kavigupta/ast_scope and https://github.com/pylint-dev/astroid ? We need to pick one that does the job, is well-maintained and doesn't have too many dependencies.