Open RalfHerzog opened 4 years ago
Sorry for my slow response. This looks like a clear bug to me. Any chance you could submit a PR with a fix?
Alright, I will look into it in the next days
There seems to be a lot to do to support handling two-word elements on the stack. I just read about the implementation of dup2
(should be somehow handled similar). Have a look at this :stuck_out_tongue: https://github.com/wala/WALA/blob/c3589680ee7403455ead9cd34b76251b31106340/com.ibm.wala.shrike/src/main/java/com/ibm/wala/shrikeBT/DupInstruction.java#L40-L46
The solution by now is not using double
or long
at all??? DupInstruction
s seem to be affected as well for this issue and there might be more.
Either way, I started to fix the issue for pop2
at first. There are many places to fix, especially in the Decoder.java
and Analyzer.java
. Handling two-word elements will affect the overall stack processing and validation. That's what I found out so far
@RalfHerzog what is the scope of this problem? Is Shrike’s bytecode instrumentation broken in cases where the input bytecodes have pop2
and dup2
? Is the WALA SSA IR affected? Thanks for looking into this!
"Broken" seem to be a bit too hard for this problem. In fact, if there is a pop2
instruction in the bytecode, the read shrike method does contain a pop
(single word). It is interesting that, if you did not modify the index where the pop2
is, the instruction is copied and correctly saved to the class file as pop2
. On the other hand, lets say the index of pop2
is 3 as shown in my example above. If I replace the instruction at index 3 with an equal pop2
instruction (-> PopInstruction.make(2)
), the result should be the same, but leads to:
java.lang.IllegalArgumentException: Stack underflow in intermediate code, at offset 3
If I replace it with the previously (falsely) read pop
instruction, the validation is passed but the stack is not empty at the end of the method.
I think there is a discrepancy in the implementation of the two different stacks as described in the jvm specification here (java 11). Cite of $4.10.1.4 (Stack Map Frames and Type Transitions): "Types of size 2 (long and double) are represented by two stack entries, with the first entry being top and the second entry being the type itself. For example, a stack with a double value, an int value, and a long value is represented in a type state as a stack with five entries: top and double entries for the double value, an int entry for the int value, and top and long entries for the long value. Accordingly, OperandStack is the list [top, double, int, top, long]."
Cite of §4.10.2.3 (Values of Types long and double): "Dealing with values of types long or double on the operand stack is simpler; the verifier treats them as single values on the stack. For example, the verification code for the dadd opcode (add two double values) checks that the top two items on the stack are both of type double. When calculating operand stack length, values of type long and double have length two.".
Both places in the jvm spec describe a processing behavior I cannot see in the implementation of shrike, especially the use of an top
-element extending the underlying type to a two-word stack element. Currently, double and long values are treated as single-word elements in the internal stack processing, yielding the "Stack underflow" error above when instrumenting pop2
explicitly. I think I will need some help to implement this in a correct way.
The WALA SSA IR seems not to be affected since pop- and dup-Instructions are omitted during analysis. I checked it with the PDF-examples.
Thanks again for digging into this! Unfortunately I do not have deep expertise in the implementation of Shrike. Further, I don't think we have a lot of users making use of Shrike's bytecode instrumentation functionality. (I think most folks use ASM these days.) That said, if you get stuck trying to fix this bug, I will try to either help myself or track down someone who can help.
Hi again, I recently looked deeper into the code for the necessary places to fix the long/double-support. Unfortunately, I couldn't just replace the stack element counting with stack size counting. Some of the methods are used by the SSA IR and thus not working properly anymore. Another big issue is the dup-Instruction. At first I thought the size parameter indicates the stack element size to duplicate but it is not. That means, that the dup-Instruction is context dependent on the topmost stack element just before their execution. This requires the simulation of the stack elements as well. For example, it is totally fine having such a method:
Constant(I,0) <= Pushes an int with value 0 on the stack (single stack element)
Dup(1,0) <= Duplicates the int value on the stack (single stack element)
{Do something with 2 times int on the stack (2 elements with size >2<)}
...
Constant(J,1) <= Pushes a long with value 1 on the stack (double-sized stack element)
Dup(1,0) <= Duplicates the long value on the stack (double-sized stack element)
{Do something with 2 times long on the stack (2 elements with size >4<)}
I will keep going working on that but any help implementing the functionality in a performant way would be appreciated.
@RalfHerzog thanks for digging further into this. As I mentioned before I’m not an expert on these issues. Maybe you could look at the ASM code to see how they handle these situations?
I am currently struggling with the correct implementation of the dup-instruction for delta grater than 0. A regular dup(1,0) pops off a single element and pushes it back twice, so far so good. I found an explanation at https://www.jrebel.com/blog/java-bytecode-tutorial under "Crunching the Local Stack" for all other dup variants. What makes me wonder is, that the size information (1 or 2 stack elements) must not necessarily met the top stack element size (see the example below in step 6). It seems to be legal to duplicate with size 2 (dup2_x*) even if the top stack elements are single-sized. Further, as far as I understand, the delta information is the offset for the duplicated elements "under" the elements which should be duplicated, right? So dup2_x1 means duplicating a single two-sized element or two single-sized elements in front of the third element. The correct processing must be especially implemented here and here (scroll down if your browser does not jump to the place in the code. It is marked). The main difference to ASM is the object orientated way of simulating the stack which makes the code much easier to understand. For performance reasons I did not consider using objects here. Is the focus of WALA still to be memory efficient so that I should stick to work with arrays of bytes and length?
@RalfHerzog How often do these bytecodes show up? If they are rare, I would be open to an implementation approach that is less performant. But I'm guessing they are not rare, and I don't want to compromise performance of the main static analysis use cases. If you like you can go ahead with a possibly-less-efficient approach, but we should run benchmarks before and after on use cases like pointer analysis before landing it.
Thanks again for putting in so much work on this!
There is some time passed on this issue and I fixed it as far as I need it for the two-word size shrike processing. Currently, 2 test failures are failing (to be exact 3 but one is occurring twice). Here are the two failed tests: first and second
com.ibm.wala.core.tests.callGraph.Java7CallGraphTest > testOcamlHelloHash FAILED
com.ibm.wala.shrikeBT.analysis.Analyzer$FailureException: Expected type [Ljava/lang/Object; at stack 2, got Lorg/ocamljava/runtime/kernel/ValueStack; at offset 27
com.ibm.wala.core.tests.callGraph.KawaCallGraphTest > testKawaChess FAILED
java.lang.Error: OP_swap must always be swapping the same size, 1
I think the first failure is harder to fix than the second one since the object type detection is somehow broken. The second failure is somewhat hard to reproduce for me. I created test classes for my own and the javac compiler seems to prevent generating/compiling swap-Instructions. It always uses store- and load-Instructions instead of swapping.
I would like to offer to continue maintaining this issue but fixing the very last edge case is too time-consuming for me now.
@RalfHerzog we really appreciate all the work you've put in on this issue! Those failing test cases correspond to languages other than Java getting compiled to JVML. So there may be bytecode patterns that are never observed when using javac
to compile Java. We would like the WALA JVML frontend to support any valid bytecode, so we do need to keep those tests passing. Assuming the relevant bytecodes causing the failures pass the JVM bytecode verifier, we should continue to handle them.
That said I completely understand if you don't have time to fix these cases. I do not expect much churn in the code you are modifying. So I think you should be able to maintain your patches in a branch pretty easily and merge in relevant changes from WALA master branch as needed, until we can find a better solution.
Hello, I currently got stuck understanding the PopInstruction size. It seems there are two possible values: 1 (single word) and 2 (double word) which are representing
pop
andpop2
in the jvm bytecode respectively. When there is a method like this:the following bytecode is compiled (output from eclipse internal class file viewer):
When I load the compiled class file with an OfflineInstrumenter, the MethodData holds the following instructions:
I am wondering why there is a Pop(1) for the pushed long value at index 3, since there is a
pop2
bytecode instruction in the compiled file. Is this correct or am I do something wrong?