faster-cpython / ideas

1.67k stars 49 forks source link

Making conversions to booleans in jumps explicit. #568

Closed markshannon closed 12 months ago

markshannon commented 1 year ago

POP_JUMP_IF_TRUE and POP_JUMP_IF_FALSE implicitly convert their argument to a boolean before choosing which way to branch. "Explicit is better than implicit"

If we make the conversion explicit: POP_JUMP_IF_TRUE becomes TO_BOOL; POP_JUMP_IF_IS_TRUE

Where TO_BOOL converts TOS to a boolean and POP_JUMP_IF_IS_TRUE performs the jump:

inst(TO_BOOL, (x -- b)) {
     if (PyBool_Check(x)) {
         b = x;
     }
     else {
         int err = PyObject_IsTrue(x);
         DECREF(x);
         ERROR_IF(err < 0, error);
         b = err ? Py_True : Py_False;
         Py_INCREF(b);
    }
}

TO_BOOL should specialize nicely.

inst(POP_JUMP_IF_IS_TRUE, (cond -- )) {
    _Py_DECREF_NO_DEALLOC(cond);
    if (cond == Py_True) {
        JUMPBY(oparg);
    }
}

Compare-and-branch

Now consider how this combines with compares: Branch? No compare Compare
No --- COMPARE_OP
Yes TO_BOOL ; POP_JUMP_IF_IS_TRUE COMPARE_OP; TO_BOOL; POP_JUMP_IF_IS_TRUE

Now, all of the current specializations, and almost all of the possible specializations of COMPARE_OP produce a boolean value. We take advantage of that by combining COMPARE_OP and TO_BOOL by using remaining bit of the oparg to indicate whether we should convert the result to a boolean:

inst(COMPARE_OP, (unused/1, left, right -- res)) {
    res = PyObject_RichCompare(left, right, oparg>>4);
    DECREF_INPUTS();
    ERROR_IF(res == NULL, error);

    if (oparg & TO_BOOL_BIT) {
        /* convert to boolean, as TO_BOOL above */
    }
}

The above version of COMPARE_OP specializes as effectively as before, removes the bool conversion from POP_JUMP_IF_TRUE and makes the conversion to bool explicit, which should help specialization.

markshannon commented 1 year ago

@brandtbucher I think we discussed adding something like TO_BOOL in the context of specializing UNARY_NOT. Was there an issue or discussion for that?

brandtbucher commented 1 year ago

Update: I don't know if there was a discussion, but I'm experimenting with this now at the sprints. Note that we probably get an even bigger win from this now, since True and False are no longer refcounted.

My approach is going to be to prefix all conditional jumps (with a non-guaranteed-bool) with UNARY_NOT, invert the condition of the jump, and only handle bool in the actual branches.

Then we can specialize UNARY_NOT, and experiment with using a free bit in COMPARE_OP to convert the arg to bool.

(We could also specialize the two jumps themselves if the extra overhead from UNARY_NOT proves too great.)

gvanrossum commented 1 year ago

Did this ever produce any results? @brandtbucher

brandtbucher commented 1 year ago

Still in progress. So no, but maybe.

I have a branch where all of the conditional jumps require an exact bool, and are prefixed by UNARY_NOT for explicit conversion. Next step is to try specializing UNARY_NOT. It's just sort of on the backburner right now while I help with 3.12 stuff.

markshannon commented 1 year ago

We could change TO_BOOL to TO_BOOL_OR_INT as bools are laid out the same as ints in memory. The test for truth in the following jump can use _PyLong_IsZero.

inst(POP_JUMP_IF_FALSE, (cond -- )) {
    JUMPBY(oparg * _PyLong_IsZero(cond));
}

Should save a specialization, and speedup tests on ints.

brandtbucher commented 1 year ago

Should save a specialization, and speedup tests on ints.

The stats suggest this probably isn't worth it. TO_BOOL_BOOL is 84% of all TO_BOOL instructions, while TO_BOOL_INT is only 3%. Even TO_BOOL_NONE is more effective, at around 5%.

markshannon commented 12 months ago

This is now complete. Thanks @brandtbucher