konsoletyper / teavm

Compiles Java bytecode to JavaScript, WebAssembly and C
https://teavm.org
Apache License 2.0
2.55k stars 262 forks source link

Jdk 21 switch #764

Closed Ihromant closed 9 months ago

Ihromant commented 9 months ago

Hi @konsoletyper Here is initial implementation for JDK 21 switch. It works for null check now. The problem I stuck is that we need to emulate loop which works as commented code block. I suppose that it should be basic block which will be executed with specific value of restartIdx. The main issue here is that in this block RuntimeConstants (labels) should be executed together with ValueEmitters (target and restartIdx). In this block we need to convert somehow RuntimeConstant to Emitter (which one I currently don't know). It should be something like

SomeEmitter currentPattern = labels.get(restartIdx.getValueSomehow()); // here is main trouble for now

Could you please advice how to achieve this?

konsoletyper commented 9 months ago

Ok, here's my hint. Consider you have labels like these:

then the generated code would be:

switch (restartIndex) {
  case 0:
    if (value instanceof A) {
      label = 0;
      break;
    }
  case 1:
    if (value instanceof B) {
      label = 1;
      break;
    }
  case 2:
    if (value instanceof C) {
      label = 2;
      break;
    }
}
Ihromant commented 9 months ago

Tried to implement proposed switch. It fails strangely with ClassCastException in JS tests.

konsoletyper commented 9 months ago

@Ihromant what's the problem to debug and identify this CCE?

konsoletyper commented 9 months ago

@Ihromant you are not using choose properly. It does not support fall-through. You should create SwitchInstruction manually.

Ihromant commented 9 months ago

Tried to play with manual instruction creation, currently unable to produce compilable program. Therefore not committing them. Main issue:

java.lang.RuntimeException: java.lang.NullPointerException: Cannot invoke "org.teavm.model.Instruction.acceptVisitor(org.teavm.model.instructions.InstructionVisitor)" because the return value of "org.teavm.model.BasicBlock.getLastInstruction()" is null

despite all blocks have at least one instruction. Once it was compiled, but it just substituted -1 to switch (obtained from isNull check) and ignoded other. I suppose that I don't understand well how ProgramEmitter accepts instructions and is combined with emitters.

konsoletyper commented 9 months ago

Can you please show latest commit which produces this exception?

Ihromant commented 9 months ago
@Override
    public ValueEmitter substitute(DynamicCallSite callSite, ProgramEmitter pe) {
        List<RuntimeConstant> labels = callSite.getBootstrapArguments();
        ValueEmitter target = callSite.getArguments().get(0);
        ValueEmitter restartIdx = callSite.getArguments().get(1);
        BasicBlock joint = pe.prepareBlock();
        PhiEmitter result = pe.phi(ValueType.INTEGER, joint);
        BranchingInstruction bi = new BranchingInstruction(BranchingCondition.NULL);
        bi.setOperand(target.getVariable());
        BasicBlock isNull = pe.prepareBlock();
        AssignInstruction nullAssign = new AssignInstruction();
        nullAssign.setAssignee(pe.constant(-1).getVariable());
        nullAssign.setReceiver(result.getValue().getVariable());
        isNull.add(nullAssign);
        bi.setConsequent(isNull);

        BasicBlock sw = pe.prepareBlock();
        SwitchInstruction insn = new SwitchInstruction();
        insn.setCondition(restartIdx.getVariable());
        BasicBlock def = pe.prepareBlock();
        AssignInstruction defAssign = new AssignInstruction();
        defAssign.setAssignee(pe.constant(labels.size()).getVariable());
        defAssign.setReceiver(result.getValue().getVariable());
        def.add(defAssign);
        JumpInstruction jump = new JumpInstruction();
        jump.setTarget(joint);
        def.add(jump);
        insn.setDefaultTarget(def);
        sw.add(insn);
        bi.setAlternative(sw);
        pe.addInstruction(bi);

        pe.enter(joint);
        return result.getValue();
    }
konsoletyper commented 9 months ago

There was no need to rewrite everything and to throw away emitter DSL completely. Just add little SwtichInstruction manually:

        List<RuntimeConstant> labels = callSite.getBootstrapArguments();
        ValueEmitter target = callSite.getArguments().get(0);
        ValueEmitter restartIdx = callSite.getArguments().get(1);
        BasicBlock joint = pe.prepareBlock();
        PhiEmitter result = pe.phi(ValueType.INTEGER, joint);
        pe.when(() -> target.isNull()).thenDo(() -> {
            pe.constant(-1).propagateTo(result);
            pe.jump(joint);
        });

        var switchInsn = new SwitchInstruction();
        switchInsn.setCondition(restartIdx.getVariable());
        pe.addInstruction(switchInsn);

        var block = pe.prepareBlock();
        pe.enter(block);
        for (var i = 0; i < labels.size(); ++i) {
            var entry = new SwitchTableEntry();
            entry.setCondition(i);
            entry.setTarget(block);
            switchInsn.getEntries().add(entry);

            var label = labels.get(i);
            emitFragment(target, i, label, pe, result, joint);

            block = pe.prepareBlock();
            pe.jump(block);
            pe.enter(block);
        }

        switchInsn.setDefaultTarget(block);

        pe.constant(callSite.getBootstrapArguments().size()).propagateTo(result);
        pe.jump(joint);
        pe.enter(joint);
        return result.getValue();
konsoletyper commented 9 months ago

Frankly speaking, did not understand your code at all. In general, programs should satisfy following constraints:

  1. Each block must be finished with terminating instruction, i.e. instruction that terminates flow: JumpInstruction, SwitchInstruction, ReturnInstruction, etc.
  2. Variable can only be assigned once.
  3. Variable V can be used in block Q if it was declared in block P and P dominates Q, which means that there are no paths from start block (index 0) to Q that does not contain P.
  4. Inputs to phis match actual incoming blocks in control flow graph.
Ihromant commented 9 months ago

Great, thanks. Now tests is passed.

Ihromant commented 9 months ago

@konsoletyper JDK 21 switch is working now. Interestingly, produced code is very multilevel:

function otv_SwitchTest_switchWithLogic($this, $o) {
        var var$2, $b, $c, $te, $a, var$7, var$8, var$9, var$10, $$je;
        var$2 = 0;
        a: {
            b: {
                c: while (true) {
                    d: {
                        if ($o === null)
                            var$2 = (-1);
                        else {
                            e: {
                                f: {
                                    g: {
                                        h: {
                                            i: {
                                                j: {
                                                    switch (var$2) {
                                                        case 0:
                                                            $o = $rt_nullCheck($o);
                                                            if (jl_Object_getClass($o) === $rt_cls(otv_SwitchTest$A)) {
                                                                var$2 = 0;
                                                                break d;
                                                            }
                                                            break j;
                                                        case 1:
                                                            break j;
                                                        case 2:
                                                            break i;
                                                        case 3:
                                                            break h;
                                                        case 4:
                                                            break g;
                                                        case 5:
                                                            break f;
                                                        default:
                                                    }
                                                    break e;
                                                }
                                                $o = $rt_nullCheck($o);
                                                if (jl_Object_getClass($o) === $rt_cls(otv_SwitchTest$A)) {
                                                    var$2 = 1;
                                                    break d;
                                                }
                                            }
                                            $o = $rt_nullCheck($o);
                                            if (jl_Object_getClass($o) === $rt_cls(otv_SwitchTest$B)) {
                                                var$2 = 2;
                                                break d;
                                            }
                                        }
                                        $o = $rt_nullCheck($o);
                                        if (jl_Object_getClass($o) === $rt_cls(otv_SwitchTest$C)) {
                                            var$2 = 3;
                                            break d;
                                        }
                                    }
                                    $o = $rt_nullCheck($o);
                                    if (jl_Object_getClass($o) === $rt_cls(otv_SwitchTest$D)) {
                                        var$2 = 4;
                                        break d;
                                    }
                                }
                                $o = $rt_nullCheck($o);
                                if (jl_Object_getClass($o) === $rt_cls(otv_SwitchTest$TestEnum)) {
                                    var$2 = 5;
                                    break d;
                                }
                            }
                            var$2 = 6;
                        }
                    }
                    k: {
                        switch (var$2) {
                            case -1:
                                var$2 = (-1);
                                break a;
                            case 0:
                                break;
                            case 1:
                                var$2 = 1;
                                break a;
                            case 2:
                                $b = $rt_castToClass($o, otv_SwitchTest$B);
                                $b = $rt_nullCheck($b);
                                $b.$bbf = 21;
                                var$2 = $rt_nullCheck($b.$f).$length();
                                break a;
                            case 3:
                                $c = $rt_castToClass($o, otv_SwitchTest$C);
                                $c = $rt_nullCheck($c);
                                var$2 = $rt_nullCheck($c.$cf).$af;
                                break a;
                            case 4:
                                break k;
                            case 5:
                                $te = $rt_castToClass($o, otv_SwitchTest$TestEnum);
                                $te = $rt_nullCheck($te);
                                var$2 = jl_Enum_ordinal($te);
                                break a;
                            default:
                                $rt_throw(jl_IllegalArgumentException__init_());
                        }
                        $a = $rt_castToClass($o, otv_SwitchTest$A);
                        $a = $rt_nullCheck($a);
                        if (($a.$af & 31) == 5)
                            break c;
                        var$2 = 1;
                        continue c;
                    }
                    var$7 = $rt_castToClass($o, otv_SwitchTest$D);
                    try {
                        var$8 = $rt_nullCheck(var$7);
                        var$9 = otv_SwitchTest$D_a(var$8);
                    } catch ($$e) {
                        $$je = $rt_wrapException($$e);
                        if ($$je instanceof jl_Throwable) {
                            var$7 = $$je;
                            break b;
                        } else {
                            throw $$e;
                        }
                    }
                    var$2 = 0;
                    l: while (true) {
                        m: {
                            var$7 = jl_Byte_valueOf(var$9);
                            if (var$7 === null)
                                var$2 = (-1);
                            else {
                                n: {
                                    switch (var$2) {
                                        case 0:
                                            break;
                                        default:
                                            break n;
                                    }
                                    if (jl_Object_getClass($rt_nullCheck(var$7)) === $rt_cls(jl_Byte)) {
                                        var$2 = 0;
                                        break m;
                                    }
                                }
                                var$2 = 1;
                            }
                        }
                        switch (var$2) {
                            case -1:
                                break l;
                            case 0:
                                break;
                            default:
                                break l;
                        }
                        try {
                            var$2 = otv_SwitchTest$D_b(var$8);
                        } catch ($$e) {
                            $$je = $rt_wrapException($$e);
                            if ($$je instanceof jl_Throwable) {
                                var$7 = $$je;
                                break b;
                            } else {
                                throw $$e;
                            }
                        }
                        var$10 = 0;
                        o: while (true) {
                            p: {
                                var$7 = jl_Short_valueOf(var$2);
                                if (var$7 === null)
                                    var$10 = (-1);
                                else {
                                    q: {
                                        r: {
                                            switch (var$10) {
                                                case 0:
                                                    var$7 = $rt_nullCheck(var$7);
                                                    if (jl_Object_getClass(var$7) === $rt_cls(jl_Short)) {
                                                        var$10 = 0;
                                                        break p;
                                                    }
                                                    break r;
                                                case 1:
                                                    break;
                                                default:
                                                    break q;
                                            }
                                        }
                                        if (jl_Object_getClass($rt_nullCheck(var$7)) === $rt_cls(jl_Short)) {
                                            var$10 = 1;
                                            break p;
                                        }
                                    }
                                    var$10 = 2;
                                }
                            }
                            switch (var$10) {
                                case -1:
                                    break o;
                                case 0:
                                    break;
                                case 1:
                                    break a;
                                default:
                                    break o;
                            }
                            if ((var$2 & 31) == 31) {
                                var$2 = var$9;
                                break a;
                            }
                            var$10 = 1;
                        }
                        var$2 = 1;
                    }
                    var$2 = 5;
                }
                var$2 = 0;
                break a;
            }
            $rt_throw(jl_MatchException__init_(var$7.$toString(), var$7));
        }
        return var$2;
    }
konsoletyper commented 9 months ago

Sure, it is. When I implemented decompiler 10 years ago, I simplified a little bit this switch construct to not support fall-through, because I thought it's a rarely used feature. Sure, it is when writing in Java, but code generators may utilize this quite often. Anyway, this old decompiler should be re-written at some point, I even have an unfinished branch, but it's a tough work with quite low output (say, ~3% of size decrease), so still had no chance to finish it. Anyway, I believe it's a non-issue, especially when you produce obfuscated code for production use.

Are you sure that the work complete? Did you support both methods of SwitchBootstraps? Did you support all the cases? I say there can be numbers, strings, etc, as parameters of bootstrap method.

Ihromant commented 9 months ago

I tried to implement case when there is int/enum and everything works fine. It seems that compiler for this cases generates old-fashioned switch. Code can be added for sure, but without proper tests it would be just guessing.

konsoletyper commented 9 months ago

Ok, I'll try reproducing enum case on my side as well

konsoletyper commented 9 months ago

this sample does not work:

    private int intSwitchWithLogic(Integer o) {
        return switch (o) {
            case 23 -> 1;
            case Integer i when i < 10 -> 2;
            case 42 -> 3;
            default -> 4;
        };
    }

and also this bootstrap method supports string values, so please try similar test with strings.

konsoletyper commented 9 months ago

Following test also fails:

    @Test
    public void hierarchySwitch() {
        assertEquals(1, switchWithHierarchy(new SubclassA(23)));
        assertEquals(2, switchWithHierarchy(new SubclassA(24)));
        assertEquals(2, switchWithHierarchy(new SubclassA(1)));
        assertEquals(1, switchWithHierarchy(new SubclassB(23)));
        assertEquals(3, switchWithHierarchy(new SubclassB(24)));
        assertEquals(4, switchWithHierarchy(new SubclassB(1)));
        assertEquals(5, switchWithHierarchy("foo"));
        assertEquals(5, switchWithHierarchy(new Superclass(1)));
        assertEquals(1, switchWithHierarchy(new Superclass(23)));
    }

    private int switchWithHierarchy(Object o) {
        return switch (o) {
            case Superclass s when s.x == 23 -> 1;
            case SubclassA a -> 2;
            case Superclass s when s.x == 24 -> 3;
            case SubclassB b -> 4;
            default -> 5;
        };
    }

    private static class Superclass {
        final int x;

        public Superclass(int x) {
            this.x = x;
        }
    }

    private static class SubclassA extends Superclass {
        public SubclassA(int x) {
            super(x);
        }
    }

    private static class SubclassB extends Superclass {
        public SubclassB(int x) {
            super(x);
        }
    }
Ihromant commented 9 months ago

@konsoletyper now I suppose that all cases are covered.