astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.
https://docs.astral.sh/ruff
MIT License
31.42k stars 1.05k forks source link

C408 recomendations sometimes end in hard to find bugs #13039

Open klausmyrseth opened 1 month ago

klausmyrseth commented 1 month ago

I checked for C408 which is several issues on but not this perticular issue which the recommendation potentialy introduse grave bugs in your code.

C408 asks you to use the literal hence {} instead of dict() when creating a dict. This can work very well but for instance when using this together with attrs or as default values for method parameters this is a critical and hard to track bug.

I can describe this with some code:

import attrs

@attrs.define(kw_only=True)
class BigBug:
  struct_1: dict[str, str] | None = field(init=True, default={})
  struct_2: dict[str, str] | None = field(init=True, default={})

  def horrid_buggy_method(self, struct_3: dict = {}, struct_4: dict = {}) -> tuple[dict, dict]:
    struct_3['foo'] = 'bar'
    struct_4['bar'] = 'foo'
    # magically struct 3 and 4 will contain both keys :P
    return (struct_3, struct_4)

foo = BigBug()
foo.struct_1['bar'] = 'bas'
dict_3, dict_4 = foo.horrid_buggy_method()

In this scenarion both struct_1 and struct_2 will contain {'bar': 'bas'} as the literal will imply same class reference as default.

It's not always the case but in general you quickly learn to just use dict() instead as you avoid those edge cases, but we started using ruff in our projects now and our jrs is flooding the code with this bugs as ruff is "correcting" their code.

Edit: added the method too to illustrate the most common facepalm when it comes to this.

klausmyrseth commented 1 month ago

Just adding some explanation of what im trying to indicate and why C408 is not a simple case Comprehensive explanation of the potential bugs that can arise from using {} to introduce dictionaries in Python, along with insights into why this occurs:

Shared Instance Issues

Reasons for Shared Instances

Avoiding Shared Instance Bugs

Edit: Yes I been stuck because of this before 🤣

hauntsaninja commented 1 month ago

Both default=dict() and default={} do exactly the same thing and are equally buggy. You want to use default=attr.Factory(dict) or something instead.

klausmyrseth commented 1 month ago

Did a full history walkthrough of the issue: First of all @hauntsaninja is wrong dict() is not the same as {} atleast pre 3.7 you would get the following result with the folowing code:

# Using dict()
dict1 = dict(default_value=0)
dict2 = dict(default_value=0)

dict1["key"] = 10

print(dict1)  # Output: {'default_value': 0, 'key': 10}
print(dict2)  # Output: {'default_value': 0}

# Using {}
dict3 = {"default_value": 0}
dict4 = {"default_value": 0}

dict3["key"] = 10

print(dict3)  # Output: {'default_value': 0, 'key': 10}
print(dict4)  # Output: {'default_value': 0, 'key': 10}  (Both dictionaries are modified)

so {} instances was cached within same scope, while dict() forces a new instance. But 3.7 did some significant changes to this so the two is more similar, and 3.8 more or less made them equal, so it works the way C408 recomends, in version pre 3.7 this is still a huge issue.

There has also been done some minor changes post 3.8 so now in 3.11 it works very very similar and does not have the old issues we have in our older code repos where this is a screaming issue. But there is no need to do attrs factory handling, you can just use {} now in newer python but remember to use dict() in pre 3.7.

Sorry for not doing the full history check on the feature but its still something that you might want to consider in the ruff test as pre 3.8 atleast it might be a very unwise recomendation.