Closed gdub closed 1 year ago
Possibly not respecting overrides, does look like a bug.
Just noticed this too!
def f(y):
x = 1
if y:
x = 2
return x
$ ruff tmp.py --isolated --select RET
tmp.py:6:12: RET504 Unnecessary variable assignment before `return` statement
Found 1 error.
It seems the RET checks don't handle branches very well (the "else after" ones also are broken when you have a chain).
It may wants you to do this, even though it's awkward:
def f(y):
if y:
return 2
return 1
the "else after" ones also are broken when you have a chain
Is this on main
? I removed enforcement for elif
chains in the most recent release.
(I find these rules really tedious lol. Maybe they should just be removed.)
They will come back again when adding pylint rules. :) I like these in pylint a lot, which I is why I'm happy to see them showing up in a faster pre-commit friendly form, but the pylint version is solid and doesn't trigger falsely, while RET seems to not properly handle branching (I expect it's a problem with the flake8 plugin).
It may wants you to do this, even though it's awkward
It's not practical in the actual code that caused me to look into this. The fix should be to just only enforce x = value; return value
and not if the assignment is in an if statement (with statement is 90% of the time fine, just not if statements)
I haven't rerun with the most recent release yet for the RET 507 and similar rules, quite possibly fixed!
Ah yeah, I mostly meant that I have issues with the implementation. It's a direct port of the flake8 plugin, and it just has a bunch of limitations (in addition to being hard to reason about).
Ahh, I see what you mean.
def f(y):
x = 1
if y:
return 2
return x
Would pass. I could live with that in some/most cases.
I think it will still complain about x=1
:(
In a realistic example (where other things are happening to the x before the final if), it seems pretty happy. I didn't end up with any false positives that I couldn't handle knowing that it was fine just changing the assignment in the if to a return. There were several when converting Plumbum and https://github.com/scikit-hep/particle/pull/472/commits/10caa90bb139f6b754bdad1ba577c832e4e3b465 was the one that prompted me to search for this issue.
Just to add one more (almost) real life example: I have this pattern that seems reasonable in Django to build a queryset based on different inputs
def get_queryset(option_1, option_2):
queryset: Any = None
queryset = queryset.filter(a=1)
if option_1:
queryset = queryset.annotate(b=Value(2))
if option_2:
queryset = queryset.filter(c=3)
return queryset
main.py:15:12: RET504 Unnecessary variable assignment before return statement
This would be how you’d make it happy:
def get_queryset(option_1, option_2):
queryset: Any = None
queryset = queryset.filter(a=1)
if option_1:
queryset = queryset.filter(b=2)
if option_2:
return queryset.filter(c=3)
return queryset
Not quite as symmetric, but not really “worse” from a readers standpoint either. I’d probably disable the check if this was a lot of very symmetric conditions (like you have described), though in practice most of them are not that symmetric and are fine written with the extra return. Overriding a variable is a bit ugly, so avoiding an unnecessary one isn’t that bad.
Personally, I dislike the resulting work-around because:
One idea would be to only flag the single-assignment and return scenarios, e.g.:
def get_queryset():
queryset = Model.filter(a=1)
return queryset
...and not flag scenarios where there are multiple assignments to the same variable before returning, e.g.:
def get_queryset(option_1, option_2):
queryset = Model.filter(a=1)
if option_1:
queryset = queryset.filter(b=2)
if option_2:
queryset = queryset.filter(c=3)
return queryset
Though, I guess that would also mean that something like this would pass not get flagged:
def get_queryset():
queryset = Model.filter(a=1)
queryset = queryset.filter(c=3)
return queryset
Which raises the question of what is it that this rule is really meant to be targeting? Cases where would save a line of code due to an unnecessary assignment? Should check be simplified to only look for the simple assign/return case that happens at the same nesting level, and ignore if there is more complex branching/looping involved?
Though, I guess that would also mean that something like this would pass not get flagged:
If the rule should only be ignored if the assignment is under if
clause.
I agree with @gdub . In cases similar to the one mentioned by him, I usually do # noqa: RET504
, but it would be very good if that kind of usage got exempted from the rule.
Don't know if it is simple for ruff to detect it though =P
The multiple assignments heuristic seems like it could work... pretty well? In my personal opinion I actually prefer this to inlining return queryset.filter(c=3)
:
def get_queryset():
queryset = Model.filter(a=1)
queryset = queryset.filter(c=3)
return queryset
It also has the nice property that if you add another operation to the fluent chain, the diff much clearer. But, it's all a bit subjective of course.
I think the intent of the rule is to capture "trivial" errors like:
x = 1
return x
Or:
x = 1
print("Got 1")
return x
Part of me is like... is this even a rule worth having? I wish I had data on how many people are using it and how often it's # noqa
'd or ignored, etc.
I wish I had data on how many people are using it and how often it's
# noqa
'd or ignored, etc.
Can you run Ruff on a corpus of projects and gather rule hit counts and a random sampling of examples? MyPy does something like that with its MyPy Primer, which they use to evaluate pull requests. It might be nice to see how various changes make rules more or less sensitive, and examples of lines that would be newly flagged, and also examples of lines would no longer be flagged.
I'd like to do something like this: index projects using Ruff, and get some data on their configurations (and on how often they noqa
various rules).
Interesting looking through some results from: https://github.com/search?q=noqa%3A+RET504&type=code https://github.com/search?q=RET504&type=Code
Shows up in some ignores with comments, e.g.:
# RET504: Unnecessary variable before return (ugly)
# suppress what appears to be false-positives with these checks
'RET504',
'RET505',
'RET506',
'RET507',
'RET508',
See many examples with similar chaining-style assignment with conditionals that we've touched on already.
Here's an interesting case where variable is assigned to, referenced, and then returned: https://github.com/KyleKing/pytest_cache_assert/blob/47d0b538528a9c9cdbc182d2b276462a2aa0fd4d/tests/test_flask.py#L20
def _create_app() -> Flask:
app = Flask(__name__)
@app.route('/hello')
def hello() -> str:
"""Hello endpoint."""
return 'Hello, World!'
return app # noqa: RET504
Seems like something that shouldn't be flagged.
Perhaps instead of an assignment count, heuristic could be reference count. In other words, excluding return statement, flag if the only reference was assignment.
This PR productionizes that suggestion (avoid RET504 for variables with multiple assignments): https://github.com/charliermarsh/ruff/pull/3393.
Feedback welcome, I won't be merging tonight.
(Fixed the decorator thing in #3395, separate bug!)
Ok, I looked through the GitHub Code Search, and I do see a lot of examples of this kind of "fluent interface"-like assignment style, e.g.:
command = self.extract_command(text)
self.validate_prefix(command=command)
await self.validate_mention(bot=bot, command=command)
command = self.validate_command(command)
command = self.do_magic(command=command)
return command # noqa: RET504
This strikes me as very reasonable code, so I'm gonna go ahead and merge that PR.
Personally, like moving the return up one, reducing the number of variable overwrites there (and that sort of code isn't very good for static typing unless every command returns the same thing!)
I don't see any problem with:
command = self.extract_command(text)
self.validate_prefix(command=command)
await self.validate_mention(bot=bot, command=command)
command = self.validate_command(command)
return self.do_magic(command=command)
In fact, there's only one overwrite now, so I'd actually do:
command = self.extract_command(text)
self.validate_prefix(command=command)
await self.validate_mention(bot=bot, command=command)
validated_command = self.validate_command(command)
return self.do_magic(command=validated_command)
IMO, this now reads much better. The point of style checks is to force better style, not to try not to bother anyone or conform to what everyone is doing if it's wrong. I think the biggest problem with this check was that it wasn't clear how to "fix" it from the message (and also some false positives, like the function false positive above). And it's totally fine to disable a style check you don't agree with! Whenever I talk about style checking, I emphasize this (especially WRT PyLint).
Also, one of the points of this sort of check is to help two people writing the same logic get the same code. Returning without assigning is perfectly valid, so it's reasonable to force this rather than force an assignment always before return.
For example, I think this would have read much better without an extra assignment: https://github.com/scikit-build/scikit-build-core/pull/197/files#diff-5a9598893dbb4007601522cfb26b27c92d790c18bf35d7bfe86205ae1955fa0bR47-R48
I'm not sure why Ruff didn't find that, as I'd guess #3393 wasn't in the version of Ruff that ran on that, but I wish it did.
Example:
Output:
In this case, there is no unnecessary variable assignment that could be removed and cannot do an earlier
return
as need to get through all the conditionals.That said, the error does not trigger if one were to instead use an in-place operator such as
+=
:Version: