jendrikseipp / vulture

Find dead Python code
MIT License
3.38k stars 148 forks source link

Dataclass variables are tagged as unused variables #264

Closed jmmcclaineng closed 1 year ago

jmmcclaineng commented 2 years ago

We have a dataclass like the below we are using for data representation and vulture seems to treat these as unused variables.

Example dataclass:

@dataclass
class DHCPV6Response:
    total: int = 0
    solicit: int = 0
    advertise: int = 0
    request: int = 0
    confirm: int = 0
    renew: int = 0
    rebind: int = 0
    reply: int = 0
    release: int = 0
    decline: int = 0
    reconfigure: int = 0
    information_request: int = 0
    relay_forw: int = 0
    relay_reply: int = 0
    discards: int = 0

Example unused variables:

advertise  # unused variable (adtn_1u_olt/inspect/interface/virtual/modeled_inspect/subscriber/dhcpv6_response.py:17)
confirm  # unused variable (adtn_1u_olt/inspect/interface/virtual/modeled_inspect/subscriber/dhcpv6_response.py:19)
renew  # unused variable (adtn_1u_olt/inspect/interface/virtual/modeled_inspect/subscriber/dhcpv6_response.py:20)
jendrikseipp commented 2 years ago

Thanks for the report! Could you also post the code where the three variables are used?

volans- commented 2 years ago

@jendrikseipp FYI this is happening to me too. In my case the variables are used only in a format() call using self. So something like:

SOME_TEMPLATE = """...{s.foo}...{s.bar}..."""

@dataclass
class MyDataclass:

    foo: str
    bar: str

    def __str__(self):
        return SOME_TEMPLATE.format(s=self)
hughhan1 commented 2 years ago

I also have this same issue. Is it possible to add something similar to the list of ignored decorators, like a list of superclasses such that all subclasses extending those superclasses should be ignored by vulture?

In my specific use case, I have something like as follows.

from abc import ABC
from dataclasses import dataclass, fields
from typing import Sequence, Type

@dataclass(frozen=True)
class Superclass(ABC):
    @classmethod
    def ordered_field_names(cls) -> Sequence[str]:
        return [field.name for field in fields(cls)]

@dataclass(frozen=True)
class Foo(Superclass):
    x: int

@dataclass(frozen=True)
class Bar(Superclass):
    y: int

class Utils:
    @staticmethod
    def get_values(data: Superclass, typ: Type[Superclass]) -> Sequence[int]:
        return [getattr(data, field_name) for field_name in typ.ordered_field_names()]

Utils.get_values(Foo(x=1), Foo)
Utils.get_values(Bar(y=2), Bar)

When I run vulture, I get the following errors.

13: unused variable 'x' (60% confidence, 1 line)
17: unused variable 'y' (60% confidence, 1 line)

This is a result of using getattr, which is dynamic. To get around this, I have been annotating my entire codebase with # noqa for all attributes of all classes that subclass Superclass. This makes the code quite messy. It seems that using whitelists introduces unnecessary complexity into the codebase as well.

Ideally, I would like some configuration such that I can tell vulture to ignore all attributes of all class definitions that extend Superclass. Is this straightforward to implement?

jendrikseipp commented 1 year ago

I think the best solution for these types of situations is to use a whitelist module such as the following:

# mywhitelist.py
from myfile import MyDataclass
MyDataclass.foo
MyDataclass.bar

Then add this module to the list of paths that Vulture scans. You may also want to call python mywhitelist.py in your CI tests to check that all whitelisted items still exist in the code base.

Apakottur commented 1 year ago

The whitelist approach works, but I think that having a --ignore-classes option would be really nice, for dataclasses, pydantic models, named tuples and other dataclass-like classes.

jendrikseipp commented 1 year ago

Why would dataclass-like classes benefit more from a --ignore-classes option than regular classes? In both cases the member variables can be used explicitly or only implicitly. Am I missing something?

Apakottur commented 1 year ago

At least in my use case, dataclasses (and pydantic BaseModel) are often used to declare the attributes of a JSON object coming from an external API. Usually I construct the dataclass attributes (based on a JSON blob, or an API documentation) and then use some/all of them for my logic. Often I am not using all of them in the final result, but I want to keep the whole list of attributes for future reference and to avoid having to parse the JSON blob again.

In regular classes I usually only declare attributes that I myself end up setting and using, so in that case an unused attribute really is unused and vulture is great at pointing that out.

Another good example for external APIs is enums. I might have an enum of strings, which might only be constructed based on a string coming from an external API, in which case I might never reference any of the actual enum values directly.

I might be missing something but this feels like a good use case to ignore classes subclassing BaseModel or decorated by dataclass, for example.

jendrikseipp commented 1 year ago

Thanks for your explanation! I can see the utility of being able to ignore members of all classes that inherit from a given class. However, this is difficult to implement in Vulture, which has no notion of variable scopes. If somebody sees a solution for this, I'm happy to discuss it or review a pull request, though.