faster-cpython / ideas

1.67k stars 49 forks source link

Convert `DELETE_FAST` to `LOAD_FAST_CHECK POP_TOP DELETE_FAST` as a size optimization #490

Open markshannon opened 1 year ago

markshannon commented 1 year ago

Both LOAD_FAST_CHECK and DELETE_FAST check for NULL and raise. Replacing DELETE_FAST with LOAD_FAST_CHECK; POP_TOP DELETE_FAST could save some space.

How does replacing one instruction with three save space? Although DELETE_FAST is dynamically rare, DELETE_FAST is statically much more common as it is used in every except Exception as name: block. Currently at the end of named exception block we emit

LOAD_CONST None
STORE_FAST n
DELETE_FAST n

In the vast majority of these cases we could safely emit just the DELETE_FAST, using the same data flow analysis we use to convert almost all LOAD_FAST_CHECKs into LOAD_FAST

We can implement this by adding DELETE_FAST_CHECKED (for del) and DELETE_FAST_MAYBE_NULL (for the end of named exception blocks) virtual instructions to the compiler. The data flow analysis would convert them to DELETE_FAST if it can prove that the local is defined. The backend would convert the remaining DELETE_FAST_CHECKED to LOAD_FAST_CHECK; POP_TOP DELETE_FAST and the remaining DELETE_FAST_MAYBE_NULL to LOAD_CONST None STORE_FAST DELETE_FAST.

@sweeneyde interested?

sweeneyde commented 1 year ago

Sure, I can work on this.

Another thing to look at, as you had suggested at one point, is that there could be a NULL_CHECK instruction, and that would work for both LOAD_FAST and DELETE_FAST. That should become possible once https://github.com/python/cpython/pull/99075 is merged.

markshannon commented 1 year ago

NULL_CHECK; DELETE_FAST(unchecked) and LOAD_FAST_CHECK; DELETE_FAST(unchecked) are equivalent. The question is which is faster. Are there more loads that need checking, or more deletes that need checking? I suspect that there are very few of each.