SpongePowered / Mixin

Mixin is a trait/mixin and bytecode weaving framework for Java using ASM
MIT License
1.4k stars 190 forks source link

Using ModifyConstant with constant 0 on method that returns boolean may lead to JVM-dependent behavior (undefined behavior?) #229

Closed Barteks2x closed 6 years ago

Barteks2x commented 6 years ago

I know this is all a consequence of a feature I requested myself... I can deal with it but this may be something mixin could try to at least detect and warn about.

Consider the following source bytecode (asm format as shown by intellij bytecode outline plugin):

  public function()Z
    // some other code before that
    ICONST_0
    IRETURN

(you can find example of such bytecode in Chunk#isEmptyBetween in Minecraft)

and consider the following @ModifyConstant:

    @ModifyConstant(method = "function", constant = @Constant(intValue = 0, expandZeroConditions = Constant.Condition.LESS_THAN_ZERO))
    private int getAreLevelsEmpty_getMinY(int origY) {
        return -12345;
    }

Turns out, all the specification says about interpreting boolean values is that 1 means true, and 0 means false and that compilers must follow this.

So interpretations of any other values are not defined, and JVMs are free to interpret them in any way.

I recently had to debug an issue caused by this: I accidentally replaced the false in isEmptyBetween with world.getMinHeight() which is negative value in my mod. My JVM, as almost all other JVMs, apparently interpreted this value as false (because the code worked). But I got a report on discord from a user that had an issue where no blocks would render. Apparently his JVM (Oracle 1.8.0_25) on his machine interpreted it as true and rendering code thought everything is empty.

I realize that this is something that can't be detected in general. In this simple case it could easily be detected but as soon as local variables are involved it would become very complex if not impossible to reliably detect.

I really don't know if there is anything mixin can do to fix it. Aside of detecting and warning about the simple cases I think the only thing mixin could (maybe) do is try to make it defined behavior by making such boolean method return value != 0. But it may have it's own performance cost and would make that little part of mixin potentially even more confusing.

Pokechu22 commented 6 years ago

Shouldn't it be possible to just refuse changing a Boolean to an int? I think the localvariabletable knows the actual type (or, maybe it's in the stackmap?)

I'll have to look this up -- I don't know for sure.

Barteks2x commented 6 years ago

it is technically valid for the bytecode to return an int local variable as boolean, as long as at runtime it will always be either 0 or 1. javac may never generate such code, but other compilers, or even other coremods may do that.

Pokechu22 commented 6 years ago

Ah, and it turns out that the verifier doesn't actually care about the difference between Booleans and ints:

  • The primitive types byte, char, short, and boolean (field descriptors B, C, S, Z) all correspond to the verification type int. (jvms 4.10.1.2)

So that wouldn't work :/

JonathanxD commented 6 years ago

I recently had to debug an issue caused by this: I accidentally replaced the false in isEmptyBetween with world.getMinHeight() which is negative value in my mod. My JVM, as almost all other JVMs, apparently interpreted this value as false (because the code worked). But I got a report on discord from a user that had an issue where no blocks would render. Apparently his JVM (Oracle 1.8.0_25) on his machine interpreted it as true and rendering code thought everything is empty.

This is weird statement because JVM does not interpret boolean values, javac compiles boolean checks (if (boolean) or boolean ?:) to a check against zero values (ifeq to check ==0 and ifne to check !=0 and branch to ouside of if. ref). When a method return a boolean value, JVM only puts the value on stack as-is without any interpretation/modification (same for field access).

I think that world.getMinHeight() may return 0 or 1 (naturally or another coremod modified the method).

Its only a note

Barteks2x commented 6 years ago

world.getMinHeight is a methoid I add and I know for a fact that it returns Integer.MIN_VALUE/2. And what I said may have something to do with JVM optimizer assuming a boolean is always either 0 or 1.

I forgot it's javac actually generating the comparisons but it's also a fact that I actually saw one user who had it behave differently. So the only explanation that I can see is JVM optimizations messing with it.

Mumfrey commented 6 years ago

Yeah this behaviour being undefined kind of sucks, I wish Java treated booleans like C where 0 means false and everything else is true because at least that would be consistent.

There are little tips of the hat to the fact that everything shorter than an int is still an int internally sprinkled throughout Mixin, eg. in features like @Coerce which lets consumers deal with the LVT being missing because frames only contain I for all integer-based types.

The problem really is that an Injection Point is basically a query against bytecode instructions, and only a very small number of the Injection Points are actually contextual in that they pay attention to their neighbouring instructions. The ModifyConstant obviously has the "treat comparisons as zero" feature and some others have limited statefulness, but there are none which currently do any static analysis.

At the moment, all a ModifyConstant Injector cares about is that the incoming type is the same as the outgoing type, in this case int, it doesn't know what the return value is "used for" and some static analysis of the surrounding code would be required in order to obtain the information. There are some obvious cases:

But that's about it without kicking off a full-on static analysis of the method to see where the value goes.

I realize that this is something that can't be detected in general. In this simple case it could easily be detected but as soon as local variables are involved it would become very complex if not impossible to reliably detect.

You are exactly right. I feel like having the injection point not match these instructions isn't going to work because it creates inconsistent behaviour unless the detection is 100% reliable, which it likely won't be. It certainly seems like doing some validation for the obvious cases and reporting them would be worthwhile though.

I forgot it's javac actually generating the comparisons but it's also a fact that I actually saw one user who had it behave differently. So the only explanation that I can see is JVM optimizations messing with it.

The instruction used to examine the return value is IFNE so one would assume that this would result in C-like behaviour. For the behaviour to be nondeterministic I would agree that optimisations are the most likely cause.

Other than detecting the simple cases mentioned above and warning about them, I don't know how else I could handle this since it's not really a "bug" per se, it's a side effect of JVM design.

Mumfrey commented 6 years ago

The injector now does a very basic static analysis of the code and attempts to find the obvious scenarious where a narrowing conversion is occurring, static analysis is only performed and warning emitted when debug verbose is enabled, for performance reasons. This should allow developers to find these issues but won't impact end users.