GH-118095: Unify the behavior of tier 2 `FOR_ITER` branch micro-ops

markshannon commented 3 weeks ago

Simplifies and unifies the behavior of:

* `_GUARD_NOT_EXHAUSTED_RANGE`
* `_GUARD_NOT_EXHAUSTED_LIST`
* `_GUARD_NOT_EXHAUSTED_TUPLE`
* `_FOR_ITER_TIER_TWO`

so that each of them leaves just the iterator on the stack and exits to the `POP_TOP` immediately after the associated `END_FOR`.

This fixes a bug in the tier 2 handling of `_FOR_ITER_TIER_TWO` where errors were treated as occurring at the jump target, not at the original instruction.
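To illustrate the unified exit behavior described above, here is a toy Python model. It is only a sketch, not CPython code: `ToyFrame`, `SideExit`, `ListIter`, `guard_not_exhausted`, `iter_next` and `run_loop` are hypothetical names, and the exit target is represented as a plain string. The point it tries to show is that when the guard fails, only the iterator is left on the stack and control leaves the trace for the `POP_TOP` after `END_FOR`.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFrame:
    """Hypothetical stand-in for an interpreter frame: just a value stack."""
    stack: list = field(default_factory=list)


class SideExit(Exception):
    """Hypothetical signal that a guard failed and the trace is exited."""

    def __init__(self, target):
        super().__init__(target)
        self.target = target


class ListIter:
    """Toy list iterator with an explicit index, so exhaustion can be tested."""

    def __init__(self, items):
        self.items = items
        self.index = 0

    def exhausted(self):
        return self.index >= len(self.items)


def guard_not_exhausted(frame, exit_target):
    # The guard only inspects the iterator on top of the stack.
    # On failure it exits without pushing or popping anything.
    if frame.stack[-1].exhausted():
        raise SideExit(exit_target)


def iter_next(frame):
    # Push the next value above the iterator and advance the iterator.
    it = frame.stack[-1]
    frame.stack.append(it.items[it.index])
    it.index += 1


def run_loop(items):
    """Run a toy 'for' loop; return (values seen, exit target, stack depth)."""
    frame = ToyFrame()
    frame.stack.append(ListIter(items))
    seen = []
    try:
        while True:
            guard_not_exhausted(frame, "POP_TOP after END_FOR")
            iter_next(frame)
            seen.append(frame.stack.pop())  # loop body consumes the value
    except SideExit as exit_:
        return seen, exit_.target, len(frame.stack)
```

Calling `run_loop([1, 2, 3])` in this model returns the three values, the exit target, and a stack depth of 1: the iterator is the only item left for the `POP_TOP` to remove.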

Issue: gh-118095

markshannon commented 2 weeks ago

The stats are a bit confusing.

The number of traces executed goes up, but the number of uops executed goes down, as we would expect. However, there is a large increase in the number of tier 1 FOR_ITER_TUPLE and FOR_ITER_LIST instructions executed.

Looking at the number of instructions executed for the various tier 1 and tier 2 FOR_ITER variants, what's happening becomes clearer:

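Specialized

| Instruction | Change |
| --- | --- |
| `FOR_ITER_TUPLE` | +118M |
| `FOR_ITER_LIST` | +236M |
| `FOR_ITER_RANGE` | +22M |
| `_ITER_CHECK_TUPLE` | +165M |
| `_ITER_CHECK_LIST` | +87M |
| `_ITER_CHECK_RANGE` | +65M |

Unspecialized

| Instruction | Change |
| --- | --- |
| `FOR_ITER` | -16M |
| `_FOR_ITER_TIER_TWO` | -1081M |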

Specialization is being improved: we are executing more of the specialized T1 and T2 variants and far fewer unspecialized `FOR_ITER` and `_FOR_ITER_TIER_TWO` instructions.
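For a quick sanity check of the figures quoted above, the deltas can be summed per group. This is only a sketch: the groupings and variable names are my own, and the numbers are copied from the comment, in millions of executions.

```python
# Deltas in millions of executions, copied from the figures quoted above.
specialized_for_iter = {
    "FOR_ITER_TUPLE": 118,
    "FOR_ITER_LIST": 236,
    "FOR_ITER_RANGE": 22,
}
iter_checks = {
    "_ITER_CHECK_TUPLE": 165,
    "_ITER_CHECK_LIST": 87,
    "_ITER_CHECK_RANGE": 65,
}
unspecialized = {
    "FOR_ITER": -16,
    "_FOR_ITER_TIER_TWO": -1081,
}

specialized_total = sum(specialized_for_iter.values())
check_total = sum(iter_checks.values())
unspecialized_total = sum(unspecialized.values())

print(f"specialized FOR_ITER_*: {specialized_total:+d}M")    # +376M
print(f"_ITER_CHECK_*:          {check_total:+d}M")          # +317M
print(f"unspecialized:          {unspecialized_total:+d}M")  # -1097M
```

The drop in unspecialized executions is dominated by `_FOR_ITER_TIER_TWO`, which is consistent with the reading that work has moved to the specialized variants.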

brandtbucher commented 2 weeks ago

The test hangs for tier two seem to be in test_capi.test_misc.TestPendingCalls. I have a hunch the culprit is GH-117442... so that should probably be fixed first?

python / cpython

GH-118095: Unify the behavior of tier 2 `FOR_ITER` branch micro-ops #118420
