JIT: eliminate tests for completely determined expressions

scalablecory commented 6 years ago

I'd like the JIT to detect when one branch leads to another branch, and fold them into a single branch.

I'm sure this optimization has a specific name in compiler theory, but I don't know it. If you'll excuse a dumb example:

int InitialBranching(string x)
{
    if (x == "foo") return 0;
    if (x == "bar") return 0;
    if (x == "baz") return 1;
    return -1;
}

string SubsequentBranching(string x)
{
    int y = InitialBranching(x);

    if (y == 0) return x + "0";
    if (y == 1) return x + "1";
    return "other";
}

Here, a sufficiently advanced compiler should be able to transform this into something like:

string Folded(string x)
{
    if (x == "foo") goto zero;
    if (x == "bar") goto zero;
    if (x == "baz") goto one;
    goto other;

    zero:
    return x + "0";

    one:
    return x + "1";

    other:
    return "other";
}

Why?

Obviously, the dumb example is not what I'm actually doing. For performance purposes I'm updating a text parser from depending on char to be encoding-agnostic. In some places, it needs to allow injecting inlineable sub-parsers that handles encoding-specific things:

// returns length of newline if matched, 0 if not matched, -1 if more data is needed.
int IsNewLine<T>(ReadOnlySpan<T> data);

This sits right in a hotspot in my code: optimizing this bit makes a huge difference. I've found the extra tests needed to check the return value are causing the code to perform at about 1/4th the speed when compared to a non-agnostic version with IsNewLine manually inlined. Having the compiler optimize away these tests would make a huge difference to me.

category:cq theme:optimization skill-level:expert cost:large

RussKeldorph commented 6 years ago

@dotnet/jit-contrib

AndyAyersMS commented 6 years ago

There are actually two optimizations here.

The first is having the jit recognize that the value of x completely determines the value of y. This should remove the need to materialize y and to test its value. One common technique is to detect when this happens and replicate the blocks testing y for each known reaching definition and then let constant propagation simplify things. See for instance Avoiding conditional branches by code replication. It is generally useful in straightening out control flow after inlining into predicates. Figuring out when this is profitable can be a little tricky as there is often some unrelated code in the test blocks and one must decide whether the cost of duplicating this is reasonable. Or the value of the input may only be known on some paths and be unknown on others. This is something we possibly should consider adding to the jit (see for instance dotnet/runtime#4207). I have a prototype lying around somewhere.
The second is to realize that some of the subsequent blocks are identical and can be merged. This is usually called "tail merging" or "cross jumping". Tail merging doesn't usually shorten path lengths and so for the most part only offers indirect performance benefits. And it can be costly in terms of compile time as one must carefully manage cases with large fan-out (say a switch where many cases execute the exact same code), decide how much wiggle room to allow in duplicate detection (eg can we reorder otherwise independent instructions ...?), and evaluate the merge candidates in proper order (since one merge can enable another). It doesn't tend to help throughput much either -- though in pathological cases it can work wonders.

BruceForstall commented 4 years ago

Another tail merge issue: https://github.com/dotnet/runtime/issues/8795

AndyAyersMS commented 2 years ago

@scalablecory realize it's been a while, but do you have example code we can try?

The jit may be getting close to doing what you expect...

dotnet / runtime

JIT: eliminate tests for completely determined expressions #10828