astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.
https://docs.astral.sh/ruff
MIT License
31.41k stars 1.05k forks source link

C409 now makes code slower #12912

Open hauntsaninja opened 1 month ago

hauntsaninja commented 1 month ago

Introduced in https://github.com/astral-sh/ruff/pull/12657

tuple(i for i in range(10000)) is like 50% slower than tuple([i for i in range(10000)]). I think it's still fine to have this as a lint, but it's now impossible to configure ruff to distinguish between the two.

This is a thematically similar complaint to https://github.com/astral-sh/ruff/issues/8884

hauntsaninja commented 1 month ago

Oh this is also similar to https://github.com/astral-sh/ruff/issues/10838

charliermarsh commented 1 month ago

These changes are all in preview and so we can still decide to change them before stabilizing. If we revert that change, though, we should do the same for the list rule variants.

I don’t feel strongly about it. I would also be open to making it configurable. Perhaps the setting should state whether the user prefers a comprehension or a generator, and enforce it one way or another? Or would the setting just ignore comprehensions?

hauntsaninja commented 1 month ago

By list rule variant do you mean C410? I think that one is fine because it never introduces a generator comprehension.

It's maybe more dangerous to enforce list comprehensions over generator comprehensions. Also for probably mostly theoretical memory reasons, despite the perf impact, people typically seem more eager to go from list -> generator than generator -> list. So maybe I lean towards a setting to leave list comprehensions alone.

Or actually, maybe the slowening parts of C409 and C419 should be split into separate rules. The new rule that applies to comprehensions could then have a setting for whether you prefer list comprehensions or generator comprehensions (defaulting to generator). I think this would let everyone configure their ideal behaviour.

charliermarsh commented 1 month ago

Sorry, I misspoke. I thought there was a rule that changed:

list([f(x) for x in foo])

To:

list(f(x) for x in foo)

But no, the fix there is [f(x) for x in foo], so that's fine.

charliermarsh commented 1 month ago

I generally agree that this behavior should be configurable (or even removed). What do you think, @AlexWaygood?

AlexWaygood commented 1 month ago

I agree that these should either be two rules (one that makes your code slower and one that doesn't), or one configurable rule (where the default is to only emit the recommendations that do not make your code slower).

Some people prefer to always remove inner parentheses in calls like tuple([x for x in foo]) because they find it annoying when they're not necessary for semantics, and/or they feel that the inner parentheses are a "micro-optimisation" that you should only worry about in very performance-sensitive code. Other people prefer to always include them, because they want their code to be as fast as possible. I think both are reasonable points of view, but we shouldn't mix recommendations that have no negative performance implications with ones that could seriously slow down your code in the same rule. (At least, not with our default settings.)

Avasam commented 1 month ago

tuple(i for i in range(10000)) is like 50% slower than tuple([i for i in range(10000)]).

Oh this is also similar to #10838

And similar to the exact kind of example I requested there: https://github.com/astral-sh/ruff/issues/11839 I wouldn't mind merging my Feature Request with an already open issue.