PyCQA / flake8-bugbear

A plugin for Flake8 finding likely bugs and design problems in your program. Contains warnings that don't belong in pyflakes and pycodestyle.
MIT License
1.06k stars 104 forks source link

New check: duplicate types in type hint, or duplicate literals in set #371

Open jakkdl opened 1 year ago

jakkdl commented 1 year ago
a: int | int | float = ...
b: Union[List[int, int]] = ...
c: Optional[int] | int | None = ... # can probably rely on other linters to not have to deal with this one
d: set[int] = {1, 2, 3, 3, 5}

e: Tuple[int, int] = ... # not an error

these should pretty much always be typos and/or bad copy-pastes, with very few to no false alarms. Don't think it's super common, but definitely not rare or super esoteric.

The check shouldn't have to be too complicated to implement, for the type hint probably need a whitelist of what types you look in (List/list, Set/set, Dict/dict (with special logic), OrderedDict, Optional, Union, Literal...).

For sets you only need to look inside set() (ast.Call) and inside {} (ast.Set), and probably only include literal int/float/strings.

cooperlees commented 1 year ago

This seems super uncontroversial if we can AST parse this out nicely covering the edge cases you've outlined and we can improve on reports ... I'd be happy to add this.

We can run this over some large type checked repos before releasing too to be extra safe.

AlexWaygood commented 1 year ago

FYI we already have a check for duplicate items in a union over at flake8-pyi (error code Y016). But we only support .pyi files currently. Feel free to take inspiration from our implementation :)

Our tests for Y016 are here, along with our tests for our various other union-related lints: https://github.com/PyCQA/flake8-pyi/blob/main/tests/union_duplicates.pyi

FozzieHi commented 1 year ago

I think these should be two different checks, but I'm happy to work on a duplicate items in a set check.

For sets you only need to look inside set() (ast.Call) and inside {} (ast.Set), and probably only include literal int/float/strings.

What do you mean by checking ast.Call? Do we also want to catch these? set((1, 2, 3, 3, 5)) set([1, 2, 3, 3, 5])

I'm not entirely sure how common these would be, and it would increase the complexity of the check, but I could be missing another syntax here!

jakkdl commented 1 year ago

I think these should be two different checks, but I'm happy to work on a duplicate items in a set check.

For sets you only need to look inside set() (ast.Call) and inside {} (ast.Set), and probably only include literal int/float/strings.

What do you mean by checking ast.Call? Do we also want to catch these? set((1, 2, 3, 3, 5)) set([1, 2, 3, 3, 5])

I'm not entirely sure how common these would be, and it would increase the complexity of the check, but I could be missing another syntax here!

oops, I misremembered and thought one could write set(1, 2, 3) when writing the OP - but that's very much not valid code - so it should only be for ast.Set.

JelleZijlstra commented 1 year ago

I'm supportive of doing this for sets and also for dict keys, where it's likely to mean the code silently does something different from what the user wants.

Less sure about the other examples brought up here. List[int, int] is invalid to a type checker, and if you're not running a type checker annotations are unlikely to be correct anyway, so we wouldn't add much value. Duplicate values in unions (int | int | float, Optional[int] | int | None) aren't that bad; I think that's better left to type checkers too.