Open Fidget-Spinner opened 6 months ago
We should prioritize this.
Why? do you have evidence that this will make a large enough difference to prioritize over other work? I'm not saying we shouldn't do this. I'm just a bit skeptical that it will make much of a difference.
We should prioritize this.
Why? do you have evidence that this will make a large enough difference to prioritize over other work?
I'm not saying we shouldn't do this. I'm just a bit skeptical that it will make much of a difference.
I don't. I meant that we should prioritize ints over floats initially because my past experiments with the tier 2 interpreter suggest float operations are quite cheap. I dont think any of this will show up on a benchmark necessarily.
Currently we're not doing constant and float propagation on the tier 2 optimizer in main because we don't want to create new objects or hold owned references in the trace.
I propose a way to circumvent this and do constant and float propagation. I will outline the design here. The idea is to pack the raw values into the operand itself.
We then introduce two new instructions:
Which constructs a value from an immediate in the operand by calling
PyLong_FromLong
andPyFloat_FromDouble
.To keep the optimizer peepholes simple, we then introduce variants of the above instructions
Thus the binary operations get constant propagated to the following instructions:
Which will then be cleaned up in a peephole pass as follows: Before:
After
Here's what must happen for this to be a win: For floats:
Py_REFCNT(lhs) == 1
trick that we use to reuse floats, I am not super optimistic about this being a big win for small runs of constants. So for now, this should not be prioritised. My own experiments with the tier 2 interpreter back then when I did the lazy block versioning experiment suggests that floats are not that expensive due to the float reuse. In fact, we might slow things down becausePyFloat_FromDouble
always has to allocate.For ints:
_PyLong_Add
subroutine, which is extremely expensive as it assumes an array oflong
. Note that in the specialized interpreter, float addition is the machinedouble
addition, while int addition is an expensive subroutine. We should prioritize this.