usethesource / flybytes

Flybytes is an intermediate language between JVM bytecode and software languages (DSLs, PLs), for compilation and decompilation.
BSD 2-Clause "Simplified" License

NullPointerException: Cannot read field "inputLocals" because "dstFrame" is null #20

Status: Closed (bys1 closed 1 year ago)

bys1 commented 1 year ago

I am trying to compile the following:

Method getAdd() {
    return staticMethod(\public(), string(), "add", [
        var(string(), "x")
    ], [
        \asm([LABEL("LALALA"), GOTO("STERF")]),
        \block([\return(load("x"))], label = "STERF"),
        \return(load("x"))
    ]);
}

I am messing around with labels (final goal: making a string switch). Compiling the above works fine if I remove the GOTO from the asm part. However, with the GOTO, I get the following error:

|jar+file:///Users/bys1/.m2/repository/org/rascalmpl/flybytes/0.1.7/flybytes-0.1.7.jar!/src/lang/flybytes/Compiler.rsc|(1637,272,<25,0>,<27,123>): Java(
  "RuntimeException",
  "java.lang.NullPointerException: Cannot read field \"inputLocals\" because \"dstFrame\" is null",
  Java("NullPointerException","Cannot read field \"inputLocals\" because \"dstFrame\" is null"))
    at lang.flybytes.internal.ClassCompiler.compileClass(|unknown:///ClassCompiler.java|(0,0,<109,0>,<109,0>))
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(|unknown:///NativeMethodAccessorImpl.java|(0,0,<0,0>,<0,0>))
    at compileClass(|jar+file:///Users/bys1/.m2/repository/org/rascalmpl/flybytes/0.1.7/flybytes-0.1.7.jar!/src/lang/flybytes/Compiler.rsc|(1902,5,<27,116>,<27,121>))
    at $shell$(|prompt:///|(0,10,<1,0>,<1,10>)ok
rascal>

Am I doing something wrong, or is this a bug in Flybytes? The error is not very descriptive to me.

jurgenvinju commented 1 year ago

What is going wrong here is that asm([]) instruction lists may be produced by the decompiler and the disassembler, but the compiler cannot compile them back to bytecode because the FRAME instructions are missing.

See https://asm.ow2.io/javadoc/org/objectweb/asm/tree/FrameNode.html. From that page:

These nodes are pseudo instruction nodes in order to be inserted in an instruction list. In fact, these nodes must(*) be inserted just before any instruction node i that follows an unconditional branch instruction such as GOTO or THROW, that is the target of a jump instruction, or that starts an exception handler block.

So, for Flybytes to support any kind of code generation from asm blocks, we either have to add an inference stage for the FRAME instructions or ask the user to put them in for us. I don't know yet what the best solution is.
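To make the quoted rule concrete, here is a minimal Java sketch of the first half of such an inference stage: finding the positions where a frame would be needed, without computing the frame contents. The `Insn` class, the opcode strings, and `framePositions` are hypothetical names for illustration only; they are not Flybytes or ASM types. Labels are treated as entries in the instruction list, and treating the `*RETURN` opcodes like GOTO and ATHROW is an assumption that goes beyond the two opcodes named in the quote.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class FramePositionsSketch {
    // Hypothetical, minimal instruction model (not the Flybytes or ASM types).
    static final class Insn {
        final String opcode; // e.g. "LABEL", "GOTO", "IFEQ", "ALOAD", "ARETURN"
        final String label;  // label name for LABEL and jump instructions, else null

        Insn(String opcode, String label) {
            this.opcode = opcode;
            this.label = label;
        }
    }

    // Instructions after which control never falls through.
    // Assumption: returns are treated like GOTO and ATHROW.
    static boolean endsFlow(Insn insn) {
        return insn.opcode.equals("GOTO")
            || insn.opcode.equals("ATHROW")
            || insn.opcode.endsWith("RETURN");
    }

    // Jump instructions in this simplified model: GOTO and the IF* family.
    static boolean isJump(Insn insn) {
        return insn.opcode.equals("GOTO") || insn.opcode.startsWith("IF");
    }

    // Returns the list indices before which a frame would be required:
    // entries that follow an unconditional transfer, labels that are jump
    // targets, and labels that start an exception handler.
    static Set<Integer> framePositions(List<Insn> code, Set<String> handlerLabels) {
        Set<String> targets = new HashSet<>(handlerLabels);
        for (Insn insn : code) {
            if (isJump(insn)) {
                targets.add(insn.label);
            }
        }
        Set<Integer> positions = new TreeSet<>();
        for (int k = 0; k < code.size(); k++) {
            Insn current = code.get(k);
            if (k > 0 && endsFlow(code.get(k - 1))) {
                positions.add(k);
            }
            if (current.opcode.equals("LABEL") && targets.contains(current.label)) {
                positions.add(k);
            }
        }
        return positions;
    }
}
```

This only locates candidate positions. A real inference stage would also have to compute the local variable and operand stack types at each position, which is the hard part that ASM's automatic frame computation normally handles.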

If you want to generate a switch on strings, I recommend using the Flybytes switch construct directly.

bys1 commented 1 year ago

The flybytes switch only supports integer keys. But I solved it by creating a switch on the hashCode and then performing equals checks in each case, which is similar to how a Java string switch is compiled to bytecode.
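For readers unfamiliar with the technique, the following is a rough Java sketch of the hashCode-then-equals scheme described above, written as source code rather than as Flybytes or bytecode. The class and method names are hypothetical. The hard-coded integers are the values of `String.hashCode()` for the strings in the comments; "Aa" and "BB" share a hash code, which is why the equals checks are needed.

```java
final class StringSwitchSketch {
    // Returns 0 for "add", 1 for "Aa", 2 for "BB", and -1 for anything else.
    // Like a real Java string switch, this throws NullPointerException for null.
    static int dispatch(String s) {
        int index = -1;
        // Stage 1: integer switch on the hash code, then equals() to
        // confirm the match and to separate colliding keys.
        switch (s.hashCode()) {
            case 96417: // "add".hashCode()
                if (s.equals("add")) {
                    index = 0;
                }
                break;
            case 2112: // "Aa".hashCode() and "BB".hashCode() collide
                if (s.equals("Aa")) {
                    index = 1;
                } else if (s.equals("BB")) {
                    index = 2;
                }
                break;
            default:
                break;
        }
        // Stage 2 (not shown): a second integer switch on `index` would
        // select the body of the matching case, or the default body for -1.
        return index;
    }
}
```

Since the first stage only switches on integers, it maps onto an integer-only switch construct such as the one mentioned above.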

jurgenvinju commented 1 year ago

Yes, that's the way it's done. Nice.

I'm considering adding frame instructions to the disassembler and the compiler, so that we can study how and why they work. Currently we use ASM's feature to infer them automatically while serializing the bytecode. However, not every instruction list that the Flybytes asm instruction allows syntactically is also allowed semantically. See for example this discussion on Stack Overflow: https://stackoverflow.com/questions/41208039/classwriter-compute-frames-in-asm

jurgenvinju commented 1 year ago

The experiments show that the ASM library, even in its latest versions, is not able to recover the Frame nodes without throwing exceptions.

So after a little more digging I found out what was breaking the pre-conditions for inferring the proper FrameNodes during writing of the example class (provided above by @bys1). The block implementation did not cater for manually added label fields, and skipped generation of the LABEL instruction. That way there was a GOTO to a non-existent label, and ASM simply crashed on that.

We still have to consider whether we can detect this general situation without too much overhead and provide a better error message. In the meantime we can fix this particular cause of a missing label.
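As an illustration of what such a detection could look like, here is a minimal Java sketch of a single-pass check for jumps to labels that are never defined. It is not the actual Flybytes fix: the `Insn` class, the opcode strings, and `undefinedJumpTargets` are hypothetical names, and only GOTO and the IF* family are treated as jumps.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class LabelCheckSketch {
    // Hypothetical, minimal instruction model (not the Flybytes or ASM types).
    static final class Insn {
        final String opcode; // e.g. "LABEL", "GOTO", "IFEQ", "ALOAD", "ARETURN"
        final String label;  // label name for LABEL and jump instructions, else null

        Insn(String opcode, String label) {
            this.opcode = opcode;
            this.label = label;
        }
    }

    // Returns the jump targets that have no matching LABEL entry,
    // in order of first use and without duplicates.
    static List<String> undefinedJumpTargets(List<Insn> code) {
        Set<String> defined = new HashSet<>();
        for (Insn insn : code) {
            if (insn.opcode.equals("LABEL")) {
                defined.add(insn.label);
            }
        }
        List<String> missing = new ArrayList<>();
        for (Insn insn : code) {
            boolean isJump = insn.opcode.equals("GOTO") || insn.opcode.startsWith("IF");
            if (isJump && !defined.contains(insn.label) && !missing.contains(insn.label)) {
                missing.add(insn.label);
            }
        }
        return missing;
    }
}
```

Running a check like this before handing the instruction list to ASM would let the compiler report something like "GOTO to undefined label STERF" instead of the NullPointerException above, at the cost of two linear passes over each method body.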