faster-cpython / ideas

1.68k stars 48 forks source link

Conditional macro-op expansion? #595

Closed gvanrossum closed 1 year ago

gvanrossum commented 1 year ago

In #592 (and https://github.com/python/cpython/issues/104909) we're trying to split bytecode ops into micro ops (uops) with the constraint of one "data item" per uop, where a data item is either oparg or a cache item.

This runs into some issues with conditional stack effects. For example, take the specializations for LOAD_GLOBAL (LOAD_GLOBAL_MODULE and LOAD_GLOBAL_BUILTIN) and for LOAD_ATTR (e.g. LOAD_ATTR_INSTANCE_VALUE). These opcodes must end in something that pushes either a NULL and a result, or just a result, depending on the low bit of oparg. But the result being pushed also depends on a cached value. We also can't push the NULL before the last DEOPT_IF or ERROR_IF call.

Maybe we can add the conditionality to the macro itself, e.g.

macro(FOO) = (oparg & 1) ? A + B + C : A + C;  // Syntax to be bikeshedded

Here B would be the uop that pushes NULL.

Or maybe the null if (oparg & 1) phrasing of the output stack effect in the uop definition is enough?

For reference:

gvanrossum commented 1 year ago

Tentative decision reached offline: expand uops with conditional stack effects in the generator.


Currently the generator disallows conditional stack effects in uops (it fails an assert not effect.cond, effect in get_stack_effect_info). The plan is to just support this when generating the bytecode interpreter (possibly only in the final uop of a macro?).

But for the superblock generator we need to generate two separate uops, let's give them _0 and _1 suffixes, corresponding to the condition being 0 or 1 -- we can just replace oparg with a constant (relying on the C compiler to elide the dead code). The metadata for macros will encode the condition so that the superblock generator can decide which of the two (similar) uops to put in the superblock. The rest of the optimization and machine code generation pipeline then doesn't need to worry about uops with conditional stack effects.


The failing assert is only part of the issue -- that code is only used to generate the pushed/popped metadata. There's significant additional tech debt around the actual STACK_GROW (etc.) macros generated at the end of a macro instruction -- this currently only supports unconditional effects. This is a TODO in wrap_super_or_macro.

gvanrossum commented 1 year ago

This is done. Conditional output stack effects in the last op composing a macro are now supported.