[Shrike] pop2 opcode not translated to Pop(2) instruction

RalfHerzog commented 4 years ago

Hello, I currently got stuck understanding the PopInstruction size. It seems there are two possible values: 1 (single word) and 2 (double word) which are representing pop and pop2 in the jvm bytecode respectively. When there is a method like this:

public void twoWordPop() {
    Thread.activeCount();       // Pushes an int on the stack (single word)
    System.currentTimeMillis(); // Pushes a long on the stack (double word)
}

the following bytecode is compiled (output from eclipse internal class file viewer):

public void twoWordPop();
  0  invokestatic java.lang.Thread.activeCount() : int [31]
  3  pop
  4  invokestatic java.lang.System.currentTimeMillis() : long [15]
  7  pop2
  8  return

When I load the compiled class file with an OfflineInstrumenter, the MethodData holds the following instructions:

0: Invoke(STATIC,Ljava/lang/Thread;,activeCount,()I)
1: Pop(1)
2: Invoke(STATIC,Ljava/lang/System;,currentTimeMillis,()J)
3: Pop(1)
4: Return(V)

I am wondering why there is a Pop(1) for the pushed long value at index 3, since there is a pop2 bytecode instruction in the compiled file. Is this correct or am I do something wrong?

msridhar commented 4 years ago

Sorry for my slow response. This looks like a clear bug to me. Any chance you could submit a PR with a fix?

RalfHerzog commented 4 years ago

Alright, I will look into it in the next days

RalfHerzog commented 4 years ago

There seems to be a lot to do to support handling two-word elements on the stack. I just read about the implementation of dup2 (should be somehow handled similar). Have a look at this :stuck_out_tongue: https://github.com/wala/WALA/blob/c3589680ee7403455ead9cd34b76251b31106340/com.ibm.wala.shrike/src/main/java/com/ibm/wala/shrikeBT/DupInstruction.java#L40-L46 The solution by now is not using double or long at all??? DupInstructions seem to be affected as well for this issue and there might be more.

Either way, I started to fix the issue for pop2 at first. There are many places to fix, especially in the Decoder.java and Analyzer.java. Handling two-word elements will affect the overall stack processing and validation. That's what I found out so far

msridhar commented 4 years ago

@RalfHerzog what is the scope of this problem? Is Shrike’s bytecode instrumentation broken in cases where the input bytecodes have pop2 and dup2? Is the WALA SSA IR affected? Thanks for looking into this!

RalfHerzog commented 4 years ago

"Broken" seem to be a bit too hard for this problem. In fact, if there is a pop2 instruction in the bytecode, the read shrike method does contain a pop (single word). It is interesting that, if you did not modify the index where the pop2 is, the instruction is copied and correctly saved to the class file as pop2. On the other hand, lets say the index of pop2 is 3 as shown in my example above. If I replace the instruction at index 3 with an equal pop2 instruction (-> PopInstruction.make(2)), the result should be the same, but leads to: java.lang.IllegalArgumentException: Stack underflow in intermediate code, at offset 3 If I replace it with the previously (falsely) read pop instruction, the validation is passed but the stack is not empty at the end of the method.

I think there is a discrepancy in the implementation of the two different stacks as described in the jvm specification here (java 11). Cite of $4.10.1.4 (Stack Map Frames and Type Transitions): "Types of size 2 (long and double) are represented by two stack entries, with the first entry being top and the second entry being the type itself. For example, a stack with a double value, an int value, and a long value is represented in a type state as a stack with five entries: top and double entries for the double value, an int entry for the int value, and top and long entries for the long value. Accordingly, OperandStack is the list [top, double, int, top, long]."

Cite of §4.10.2.3 (Values of Types long and double): "Dealing with values of types long or double on the operand stack is simpler; the verifier treats them as single values on the stack. For example, the verification code for the dadd opcode (add two double values) checks that the top two items on the stack are both of type double. When calculating operand stack length, values of type long and double have length two.".

Both places in the jvm spec describe a processing behavior I cannot see in the implementation of shrike, especially the use of an top-element extending the underlying type to a two-word stack element. Currently, double and long values are treated as single-word elements in the internal stack processing, yielding the "Stack underflow" error above when instrumenting pop2 explicitly. I think I will need some help to implement this in a correct way.

The WALA SSA IR seems not to be affected since pop- and dup-Instructions are omitted during analysis. I checked it with the PDF-examples.

msridhar commented 4 years ago

Thanks again for digging into this! Unfortunately I do not have deep expertise in the implementation of Shrike. Further, I don't think we have a lot of users making use of Shrike's bytecode instrumentation functionality. (I think most folks use ASM these days.) That said, if you get stuck trying to fix this bug, I will try to either help myself or track down someone who can help.

RalfHerzog commented 4 years ago

Hi again, I recently looked deeper into the code for the necessary places to fix the long/double-support. Unfortunately, I couldn't just replace the stack element counting with stack size counting. Some of the methods are used by the SSA IR and thus not working properly anymore. Another big issue is the dup-Instruction. At first I thought the size parameter indicates the stack element size to duplicate but it is not. That means, that the dup-Instruction is context dependent on the topmost stack element just before their execution. This requires the simulation of the stack elements as well. For example, it is totally fine having such a method:

Constant(I,0)  <= Pushes an int with value 0 on the stack (single stack element)
Dup(1,0)       <= Duplicates the int value on the stack (single stack element)
{Do something with 2 times int on the stack (2 elements with size >2<)}
...
Constant(J,1)  <= Pushes a long with value 1 on the stack (double-sized stack element)
Dup(1,0)       <= Duplicates the long value on the stack (double-sized stack element)
{Do something with 2 times long on the stack (2 elements with size >4<)}

I will keep going working on that but any help implementing the functionality in a performant way would be appreciated.

msridhar commented 4 years ago

@RalfHerzog thanks for digging further into this. As I mentioned before I’m not an expert on these issues. Maybe you could look at the ASM code to see how they handle these situations?

RalfHerzog commented 4 years ago

I am currently struggling with the correct implementation of the dup-instruction for delta grater than 0. A regular dup(1,0) pops off a single element and pushes it back twice, so far so good. I found an explanation at https://www.jrebel.com/blog/java-bytecode-tutorial under "Crunching the Local Stack" for all other dup variants. What makes me wonder is, that the size information (1 or 2 stack elements) must not necessarily met the top stack element size (see the example below in step 6). It seems to be legal to duplicate with size 2 (dup2_x*) even if the top stack elements are single-sized. duplicating-values-stack Further, as far as I understand, the delta information is the offset for the duplicated elements "under" the elements which should be duplicated, right? So dup2_x1 means duplicating a single two-sized element or two single-sized elements in front of the third element. The correct processing must be especially implemented here and here (scroll down if your browser does not jump to the place in the code. It is marked). The main difference to ASM is the object orientated way of simulating the stack which makes the code much easier to understand. For performance reasons I did not consider using objects here. Is the focus of WALA still to be memory efficient so that I should stick to work with arrays of bytes and length?

msridhar commented 4 years ago

@RalfHerzog How often do these bytecodes show up? If they are rare, I would be open to an implementation approach that is less performant. But I'm guessing they are not rare, and I don't want to compromise performance of the main static analysis use cases. If you like you can go ahead with a possibly-less-efficient approach, but we should run benchmarks before and after on use cases like pointer analysis before landing it.

Thanks again for putting in so much work on this!

RalfHerzog commented 4 years ago

There is some time passed on this issue and I fixed it as far as I need it for the two-word size shrike processing. Currently, 2 test failures are failing (to be exact 3 but one is occurring twice). Here are the two failed tests: first and second

com.ibm.wala.core.tests.callGraph.Java7CallGraphTest > testOcamlHelloHash FAILED
    com.ibm.wala.shrikeBT.analysis.Analyzer$FailureException: Expected type [Ljava/lang/Object; at stack 2, got Lorg/ocamljava/runtime/kernel/ValueStack; at offset 27

com.ibm.wala.core.tests.callGraph.KawaCallGraphTest > testKawaChess FAILED
    java.lang.Error: OP_swap must always be swapping the same size, 1

I think the first failure is harder to fix than the second one since the object type detection is somehow broken. The second failure is somewhat hard to reproduce for me. I created test classes for my own and the javac compiler seems to prevent generating/compiling swap-Instructions. It always uses store- and load-Instructions instead of swapping.

I would like to offer to continue maintaining this issue but fixing the very last edge case is too time-consuming for me now.

msridhar commented 4 years ago

@RalfHerzog we really appreciate all the work you've put in on this issue! Those failing test cases correspond to languages other than Java getting compiled to JVML. So there may be bytecode patterns that are never observed when using javac to compile Java. We would like the WALA JVML frontend to support any valid bytecode, so we do need to keep those tests passing. Assuming the relevant bytecodes causing the failures pass the JVM bytecode verifier, we should continue to handle them.

That said I completely understand if you don't have time to fix these cases. I do not expect much churn in the code you are modifying. So I think you should be able to maintain your patches in a branch pretty easily and merge in relevant changes from WALA master branch as needed, until we can find a better solution.

wala / WALA

[Shrike] pop2 opcode not translated to Pop(2) instruction #753