janino-compiler / janino

Janino is a super-small, super-fast Java™ compiler.
http://janino-compiler.github.io/janino
Other
1.21k stars 205 forks source link

Byte code is 350 times larger due to StackMapTable, JVM crashes when loading the class #174

Closed dgloeckner closed 7 months ago

dgloeckner commented 1 year ago

Problem

If a method has a very large body, the byte code produced by Janino is so huge that it kills JVMs >= 16.

Test case

I created a test case to easily reproduce the behaviour. Just run it on any recent OpenJDK and see the JVM crash. https://github.com/janino-compiler/janino/commit/3bd4559ffb87d1d0b326e53270ae1967c5dc6952

Janino version 3.0.11 works just fine for us. The problem appears after this commit: https://github.com/janino-compiler/janino/commit/91cfcb90869db5efeb53c1fd13b9d6cfa6374165#diff-9b50ca96aaf59d113be01aeb317c57e189548be8d8d234a08202a6168dd64c7e

Suggestion to move forward

My suggestion would be to make the generation of StackMapTables optional.

How it crashes

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (metaspaceArena.cpp:88), pid=13664, tid=6659
#  guarantee(requested_word_size <= chunklevel::MAX_CHUNK_WORD_SIZE) failed: Requested size too large (756627) - max allowed size per allocation is 524288.
#
# JRE version: OpenJDK Runtime Environment (18.0.1.1+2) (build 18.0.1.1+2-6)
# Java VM: OpenJDK 64-Bit Server VM (18.0.1.1+2-6, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, bsd-amd64)
# No core dump will be written. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#

JEP 387: Elastic Metaspace

aunkrig commented 1 year ago

Wow, I've never seen that JVM error message before.

Well, the problem with the StackMapTable attribute is that it came straight out of hell - it is extremely complicated to generate, and the JVMS is quite unclear with the details. It totally destroys the simplicity of generating Java byte code - Janino is now widely polluted with code that generates and updates it. If I had one wish free in life, one hot candidate would be that Oracle made that attribute optional again, or even better, never introduced it.

The attribute was introduced with Java 6, and is mandatory since Java 7. Thus Janino generated Java 6 .class files for a very long time, but eventually I was forced to implement StackMapTable, because static interface methods and interface default methods are only possible with Java 8+ .class files. And I hated it!!

You can enforce the generation of Java 6 .class files with

-Dorg.codehaus.janino.UnitCompiler.defaultTargetVersion=6

(if your code doesn't use the JVM features described above). Also I have read that the -noverify command line option of java makes the JVM ignore the StackMapTable attribute.

Is it possible for you to run the JVM with -noverify?

aunkrig commented 1 year ago

The headline "Byte code is 350 times larger due to StackMapTable" is misleading: Your two code snippets differ in size by a factor of 100 (10 local variables vs. 1000 local variables), which results in 100x more stack maps; but also each of these stack maps is 100x larger, so the size of the stack maps is effectively 10.000x. So that's why the .class file is so large.

Please compliment Oracle on the design of the SMT attribute!

So why is the .class file generated by JAVAC so much smaller that the one generated by Janino? Because JAVAC doesn't store stack map for all instruction offsets, but only for some. But the JVMS does not clearly specify where in the bytecode a stack map is mandatory and where it is optional. It seems like a stack map is only mandatory at "branch targets" (which you don't have in your code), but I may be wrong.

Thus, Janino creates stack maps on every instruction, and in your special case (lots local variables) these stack maps are numerous and huge.

Hint: The stack maps are probably much smaller iff you don't declare the 1000 local variables in one scope, but, if possible, declare them in separate scopes, like this:

{
    String current1 = "HELLO";
    // ...
}
{
    String current2 = "FOO";
    // ...
}

Does that help?

dgloeckner commented 1 year ago

Howdy,

Thanks for looking into this!

Wow, I've never seen that JVM error message before.

Fortunately we caught it in tests before it went in production but yes, it's quite horrible ;)

We could set -noverify but I have a bit of mixed feelings about it. For now I created a custom version of Apache Calcite which is the library we are using that internally uses Janino. See https://issues.apache.org/jira/browse/CALCITE-5227

dgloeckner commented 1 year ago

The headline "Byte code is 350 times larger due to StackMapTable" is misleading

I think I compared class file size on disk for both versions. At the end of the day the difference is big enough to make us hit a JVM-internal limit in the metaspace ;)

Please compliment Oracle on the design of the SMT attribute!

Seems to be a fantastic feature and design.

Do you see any chance to make Janino behave more like javac and produce fewer stack maps?

Besides this potential improvement in Janino I see 2 ways forward and both are related to the Calcite project mentioned in my previous post and outside of Janino:

aunkrig commented 1 year ago

Now I mark all basic blocks in the CodeContext, and generate stack frames only for these. And Bob's your uncle.

I moved your test case into org.codehaus.commons.compiler.tests.ReportedBugsTest.testIssue174__BigMethod() for future regression testing.

Please test!

dgloeckner commented 1 year ago

@aunkrig thanks a lot for more reverse engineering of this beloved JVM feature. I’ll build a Calcite version with Janino built from your branch and re-test. I’m on vacation atm so it might take a few days.

aunkrig commented 1 year ago

Please let me know if/when you need a Janino release.

dgloeckner commented 1 year ago

@aunkrig I ran extensive tests for our Calcite-based code which relies heavily on Janino as part of Calcite's internal implementation and the results look good! It would be great if you could release a new Janino version.

aunkrig commented 1 year ago

I will do it ASAP. Currently I am still working on another issue.

julianhyde commented 1 year ago

@aunkrig I really appreciate you prioritizing this issue.

But much more important, I wanted to say thank you from the Apache Calcite community for your work on Janino. Janino is a crucial part of Calcite's run time, and it does its job so well that most users are not aware that it exists. We have used Janino for a long time, and it has always been performant and reliable.

aunkrig commented 1 year ago

That other issue is now fixed. Janino now produces .class-file-format-8-compliant .class files, yippie! Will publish a Janino release soon.

aunkrig commented 1 year ago

Janino 3.1.8 is out the door!

dgloeckner commented 1 year ago

@aunkrig awesome! Thanks heaps for dealing with this issue and for the nice conversations also.

aunkrig commented 1 year ago

My pleasure. And thanks for your patience.

julianhyde commented 1 year ago

Of the two test cases that @dgloeckner wrote to demonstrate this issue in Calcite, one was fixed by upgrading to 3.1.8 and one was not. The error stack produced by the JVM crash looks very similar in the two cases. Calcite will upgrade to Janino 3.1.8 in release 1.32.0 (appearing in a few days) - we figure that it will, at worst, do no harm - but this issue perhaps needs more investigation. See my recent comments in https://issues.apache.org/jira/browse/CALCITE-5227 for more details.

aunkrig commented 1 year ago

Hey Julian,

I see only one test case in this issue (the BigMethodTest, at the far top). Did I miss the second onewhich you refer to?

CU Arno

julianhyde commented 1 year ago

The other test case (which @dgloeckner references in the description of CALCITE-5227) is JaninoCompilerTest.

My branch https://github.com/julianhyde/calcite/tree/5227-janino-wide has three commits, two of which are those test cases. With that branch, under JDK 18, the following command is probably sufficient to reproduce the JVM crash:

./gradlew clean :core:test --tests JaninoCompilerTest
aunkrig commented 1 year ago

Thanks for providing the link... I tried to eliminate the dependencies to CALCITE and came up with this:

    @Test
    public void
    testIssue174__JaninoCompilerTest() throws Exception {

//        // public void compile_long_method(@TempDir Path tempDir) throws Exception {
        String code = ReportedBugsTest.getClassCode(ReportedBugsTest.getClassBody2(1000));
//        JaninoCompiler compiler = new JaninoCompiler();
//        compiler.getArgs().setDestdir(tempDir.toFile().getAbsolutePath());
//        compiler.getArgs().setSource(code, "Nuke.java");
//        compiler.getArgs().setFullClassName("Nuke");
//        compiler.compile();

//        URL[] urls = new URL[]{tempDir.toUri().toURL()};
//        ClassLoader cl = new URLClassLoader(urls);
//        ClassLoader cl = sc.getClassLoader();
        AbstractJavaSourceClassLoader cl = this.compilerFactory.newJavaSourceClassLoader();
        cl.setSourceFinder(new MapResourceFinder(Collections.singletonMap("Nuke.java", code.getBytes("UTF-8"))));

        Class<?> clazz = cl.loadClass("Nuke");
        Object o = clazz.getDeclaredConstructor().newInstance();
        System.out.println(o);
    }

    private static String getClassCode(String body) {
        return "import java.util.Arrays;\n\n"
            + "public class Nuke {"
            + body
            + "}";
    }

    private static String getClassBody2(int numAssignments) {
        String template = "public Object[] nuke() {\n"
            + "\tString bloat = \"some_bloat\";\n"
            + "\t//Lots of variables\n"
            + "%s"
            + "\t//Big array initialization\n"
            + "\treturn new Object[]\n"
            + "\t{\n"
            + "%s"
            + "\t};\n"
            + "}\n"
            + "\n"
            + ""
            + "public static void main(String[] args) throws Exception {"
            + "\tSystem.out.println(Arrays.toString(new Nuke().nuke()));"
            + "}";
        StringBuilder assignments = new StringBuilder();
        StringBuilder appends = new StringBuilder();
        for (int i = 0; i < numAssignments; i++) {
            assignments.append(String.format("\tfinal String current%s = bloat;\n", i));
            appends.append(String.format("\t\tcurrent%s", i));
            if (i < numAssignments - 1) {
                appends.append(",");
            }
            appends.append("\n");
        }
        return String.format(template, assignments, appends);
    }

However this executes successfully with JRE 18. (Didn't know anybody is seriously using Java 18!? ;-)).

Can you please provide a (failing) test case without the dependencies to CALCITE? Or at least a stack trace that shows how Janino fails? Otherwise we won't get any further soon...

julianhyde commented 1 year ago

I'm only running JDK 18 so that I can say that Calcite runs on any JDK. I'm not typical. :)

I think it occurred on some other recent JDK versions, too.

@dgloeckner Could you please help Arno out with a standalone program? It might be sufficient to capture the Java string that Calcite generates and make that a String literal in a standalone program.

aunkrig commented 1 year ago

Ping.

aunkrig commented 1 year ago

Ping.

aunkrig commented 10 months ago

Ping.