Open spaceby opened 1 month ago
This make sense, but is just a style rule right? Is there a performance diff since you're skipping creating a dictionary?
I like this issue, I am a new her btw [I'm also new to OSC],
But I did a little research and its variable , so I am not sure its safe to change based off only on aesthetics .
Basically at smaller and mid sized dict updates its ok to just assign but on larger updates creating a secondary dict as update
does , would greatly increase the bulk update speed.
While its arguable , since the performance effects is variable maybe it should be optional.
This make sense, but is just a style rule right? Is there a performance diff since you're skipping creating a dictionary?
Even for small dictionaries, direct assignment is indeed quite a bit faster according to a very naive benchmark:
~/dev % python -m timeit -s "d = {}" "d.update({'key': 'value'})"
5000000 loops, best of 5: 61.2 nsec per loop
~/dev % python -m timeit -s "d = {}" "d['key'] = 'value'"
20000000 loops, best of 5: 12.2 nsec per loop
And this is what you'd expect, since the bytecode created for direct assignment is a lot simpler:
~/dev/cpython % ./python.exe
Python 3.14.0a0 (heads/main:986a4e1b6fc, Sep 26 2024, 12:02:18) [Clang 15.0.0 (clang-1500.3.9.4)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> def x():
... d = {}
... d.update({'key': 'value'})
...
>>> def y():
... d = {}
... d['key'] = 'value'
...
>>> import dis
>>> dis.dis(x, adaptive=True)
1 RESUME 0
2 BUILD_MAP 0
STORE_FAST 0 (d)
3 LOAD_FAST 0 (d)
LOAD_ATTR 1 (update + NULL|self)
LOAD_CONST 1 ('key')
LOAD_CONST 2 ('value')
BUILD_MAP 1
CALL 1
POP_TOP
RETURN_CONST 0 (None)
>>> dis.dis(y, adaptive=True)
1 RESUME 0
2 BUILD_MAP 0
STORE_FAST 0 (d)
3 LOAD_CONST 1 ('value')
LOAD_FAST 0 (d)
LOAD_CONST 2 ('key')
STORE_SUBSCR
RETURN_CONST 0 (None)
But as I said, this benchmark is very naive:
.update()
when in most real-world uses of dict.update
you'd generally pass in a dictionary that has multiple items.@spaceby would your proposed rule only apply to dictionaries with a single item being passed to .update
? Or do you also think it should suggest this?
-d.update({
- "x": 1,
- "y": 2,
- "z": 3,
-})
+d["x"] = 1
+d["y"] = 2
+d["z"] = 3
I think the rule should only apply to a dictionary with single item. A rule similar in meaning to B009 - rewrite the code to something simpler.
Makes sense... though I think we should explicitly state this in the docs for the rule, or we'll inevitably get a PR "improving" the rule so that it also covers dictionaries with multiple items 😄
For the update call with multiple entries, maybe the |=
operator would be an idea? It seems to be slightly faster :thinking:
❯ python -m timeit -s "d = {}" "d |= {'x': 1,'y': 2,'z': 3}"
2000000 loops, best of 5: 141 nsec per loop
❯ python -m timeit -s "d = {}" "d.update({'x': 1,'y': 2,'z': 3})"
2000000 loops, best of 5: 179 nsec per loop
And of course with single items, |=
should also be replaced with directly assigning the value to the key.
For the update call with multiple entries, maybe the
|=
operator would be an idea? It seems to be slightly faster 🤔
Possibly, but I think that would be a different rule
This make sense, but is just a style rule right? Is there a performance diff since you're skipping creating a dictionary?
I still think it's worthwhile to frame the motivation properly because, IMO, the performance difference seems too marginal to justify making it a rule.
Maybe this would be a separate rule, but perhaps this rule could be generalized to avoid creating unnecessary intermediate representations. For example, these:
l += ["value"]
l += ["value 1", "value 2"]
s |= {"value"}
s |= {"value 1", "value 2"}
d |= {"key": "value"}
d.update({"key": "value"})
# ect.
could potentially be rewritten as:
l.append("value")
l.extend(("value 1", "value 2"))
s.add("value")
s.update(("value 1", "value 2"))
d["key"] = "value"
d["key"] = "value"
# ect.
This was first proposed in https://github.com/astral-sh/ruff/issues/8877#issuecomment-1835998576.
This make sense, but is just a style rule right? Is there a performance diff since you're skipping creating a dictionary?
I still think it's worthwhile to frame the motivation properly because, IMO, the performance difference seems too marginal to justify making it a rule.
I think the performance difference is significant enough that it could easily make a difference in a hot loop somewhere... but I agree that the motivation is primarily stylistic here, and that this is the kind of performance difference that could easily change as performance improvements are made to CPython itself. So I agree that this should be classified as a stylistic rule.
Maybe this would be a separate rule, but perhaps this rule could be generalized to avoid creating unnecessary intermediate representations. For example, these:
Yes, this makes sense to me; all of these are essentially different manifestations of the same antipattern.
At this point I think I've seen enough support for this that I'd be happy to accept a PR implementing this rule (in the RUF
category, I suppose, unless there have been any other linters to already implement this?).
I suggest adding a rule that detects this
and replacing it
As a result, the code becomes simpler and more understandable.