sirthias / parboiled

Elegant parsing in Java and Scala - lightweight, easy-to-use, powerful.
http://parboiled.org
Apache License 2.0
1.28k stars 156 forks source link

Parboiled stopped working in Java 16 due to enforcement of encapsulation #175

Closed lukehutch closed 2 years ago

lukehutch commented 3 years ago

Code that depends upon Parboiled2 no longer works with Java 16, because encapsulation is now enforced. The following exception is thrown:

Exception in thread "main" java.lang.RuntimeException: Error creating extended parser class: Could not determine whether class 'mypackage.MyParser$$parboiled' has already been loaded
    at org.parboiled.Parboiled.createParser(Parboiled.java:58)
    at javaparse.JavaParsers.benchmarkParboiled1_java(JavaParsers.java:25)
    at squirrel.TestUtils.findMinTime(TestUtils.java:46)
    at squirrel.TestJavaParsing.main(TestJavaParsing.java:17)
Caused by: java.lang.RuntimeException: Could not determine whether class 'mypackage.MyParser$$parboiled' has already been loaded
    at org.parboiled.transform.AsmUtils.findLoadedClass(AsmUtils.java:217)
    at org.parboiled.transform.ParserTransformer.transformParser(ParserTransformer.java:35)
    at org.parboiled.Parboiled.createParser(Parboiled.java:54)
    ... 3 more
Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make protected final java.lang.Class java.lang.ClassLoader.findLoadedClass(java.lang.String) accessible: module java.base does not "opens java.lang" to unnamed module @3c5a99da
    at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:357)
    at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297)
    at java.base/java.lang.reflect.Method.checkCanSetAccessible(Method.java:199)
    at java.base/java.lang.reflect.Method.setAccessible(Method.java:193)
    at org.parboiled.transform.AsmUtils.findLoadedClass(AsmUtils.java:210)
    ... 5 more

The problematic code in org.parboiled.transform.AsmUtils.findLoadedClass is the call to findLoadedClassMethod.setAccessible(true) here:

    public static Class<?> findLoadedClass(String className, ClassLoader classLoader) {
        checkArgNotNull(className, "className");
        checkArgNotNull(classLoader, "classLoader");
        try {
            Class<?> classLoaderBaseClass = Class.forName("java.lang.ClassLoader");
            Method findLoadedClassMethod = classLoaderBaseClass.getDeclaredMethod("findLoadedClass", String.class);

            // protected method invocation
            findLoadedClassMethod.setAccessible(true);
            try {
                return (Class<?>) findLoadedClassMethod.invoke(classLoader, className);
            } finally {
                findLoadedClassMethod.setAccessible(false);
            }
        } catch (Exception e) {
            throw new RuntimeException("Could not determine whether class '" + className +
                    "' has already been loaded", e);
        }
    }

Note that to even get this far you have to override the version of ASM depended upon by Parboiled to one that works with JDK 16:

    <dependency>
      <groupId>org.parboiled</groupId>
      <artifactId>parboiled-java</artifactId>
      <version>1.3.1</version>
      <scope>test</scope>
      <exclusions>
        <exclusion>
          <artifactId>asm</artifactId>
          <groupId>org.ow2.asm</groupId>
        </exclusion>
      </exclusions>
    </dependency>
    <dependency>
      <groupId>org.ow2.asm</groupId>
      <artifactId>asm</artifactId>
      <version>9.1</version>
      <scope>test</scope>
    </dependency>

Java 15 is dramatically faster than previous JVMs, and Java 16 has record types, so Java 16 will fast become a new baseline for many Java projects. Parboiled is still a dependency for a lot of old code. It would be nice if Parboiled could be updated to not use introspection in this way so that it works with Java 16+.

lukasraska commented 3 years ago

FWIW this can be worked around by setting --illegal-access=permit to the running JVM, although the access restrictions exist for a reason

lukasraska commented 3 years ago

I've had some time do look into this a bit deeper:

  1. --illegal-access=permit will work as workaround for Java 16, however it will be removed in Java 17 (which is the upcoming LTS version - https://openjdk.java.net/jeps/403)
  2. The troublemaker in this case is ClassLoader and its methods findLoadedClass / defineClass - which does class lookup / new class definition from bytecode produced by ASM

When Parboiled gets rid of the usage of those two methods, it starts working without any issues. How to achieve this is however a little problematic:

So far I could think of three possible scenarios:

  1. Replace AsmUtils.findLoadedClass and AsmUtils.loadClass implementation directly with MethodHandles.lookup().* methods a. Pros: change in only few lines, easy fix b. Cons: all parsers need to be defined in org.parboiled.transform package, otherwise it won't work

  2. Replace AsmUtils.findLoadedClass and AsmUtils.loadClass implementation directly with MethodHandles.lookup().* and update ASM to generate new updated classes inside org.parboiled.transform, not in the original parser package a. Pros: works without any necessary modification to existing parsers b. Cons: all classes that are defined dynamically need to be generated in different package and all related calls need to be updated as well (class can be "renamed" with ClassVisitor, but all references to the original class need to be updated / copied, otherwise similar IllegalAccessException will be thrown) -> I suspect this would require major rewrite of the ASM handling

  3. Replace AsmUtils.findLoadedClass and AsmUtils.loadClass implementation with calls to similar static methods defined in parser class (see below) a. Pros: change in only few lines, easy fix b. Cons: all parsers need to define their own static findLoadedClass and loadClass static methods (creating custom method in BaseParser doesn't help as it gets detected as different package)

All options mentioned above require at least Java 9, because that's when Lookup.defineClass was added (https://docs.oracle.com/javase/9/docs/api/java/lang/invoke/MethodHandles.Lookup.html#defineClass-byte:A- )

The third option can be achieved with following patch:

diff --git a/build.sbt b/build.sbt
index c1bc1a6d..0397bc96 100644
--- a/build.sbt
+++ b/build.sbt
@@ -14,8 +14,8 @@ val basicSettings = Seq(

   javacOptions          ++= Seq(
     "-deprecation",
-    "-target", "1.7",
-    "-source", "1.7",
+    "-target", "11",
+    "-source", "11",
     "-encoding", "utf8",
     "-Xlint:unchecked"
   ),
diff --git a/parboiled-java/src/main/java/org/parboiled/transform/AsmUtils.java b/parboiled-java/src/main/java/org/parboiled/transform/AsmUtils.java
index b1f9e9e7..621c84d0 100644
--- a/parboiled-java/src/main/java/org/parboiled/transform/AsmUtils.java
+++ b/parboiled-java/src/main/java/org/parboiled/transform/AsmUtils.java
@@ -199,20 +199,11 @@ class AsmUtils {
      * @param classLoader the class loader to use
      * @return the class instance or null
      */
-    public static Class<?> findLoadedClass(String className, ClassLoader classLoader) {
+    public static Class<?> findLoadedClass(String className, ClassLoader classLoader, Class<?> origClass) {
         checkArgNotNull(className, "className");
-        checkArgNotNull(classLoader, "classLoader");
+        checkArgNotNull(origClass, "origClass");
         try {
-            Class<?> classLoaderBaseClass = Class.forName("java.lang.ClassLoader");
-            Method findLoadedClassMethod = classLoaderBaseClass.getDeclaredMethod("findLoadedClass", String.class);
-
-            // protected method invocation
-            findLoadedClassMethod.setAccessible(true);
-            try {
-                return (Class<?>) findLoadedClassMethod.invoke(classLoader, className);
-            } finally {
-                findLoadedClassMethod.setAccessible(false);
-            }
+            return (Class<?>) origClass.getMethod("findLoadedClass", String.class).invoke(null, className);
         } catch (Exception e) {
             throw new RuntimeException("Could not determine whether class '" + className +
                     "' has already been loaded", e);
@@ -231,22 +222,11 @@ class AsmUtils {
      * @param classLoader the class loader to use
      * @return the class instance
      */
-    public static Class<?> loadClass(String className, byte[] code, ClassLoader classLoader) {
-        checkArgNotNull(className, "className");
+    public static Class<?> loadClass(String className, byte[] code, ClassLoader classLoader, Class<?> origClass) {
         checkArgNotNull(code, "code");
-        checkArgNotNull(classLoader, "classLoader");
+        checkArgNotNull(origClass, "origClass");
         try {
-            Class<?> classLoaderBaseClass = Class.forName("java.lang.ClassLoader");
-            Method defineClassMethod = classLoaderBaseClass.getDeclaredMethod("defineClass",
-                    String.class, byte[].class, int.class, int.class);
-
-            // protected method invocation
-            defineClassMethod.setAccessible(true);
-            try {
-                return (Class<?>) defineClassMethod.invoke(classLoader, className, code, 0, code.length);
-            } finally {
-                defineClassMethod.setAccessible(false);
-            }
+            return (Class<?>) origClass.getMethod("loadClass", byte[].class).invoke(null, code);
         } catch (Exception e) {
             throw new RuntimeException("Could not load class '" + className + '\'', e);
         }
diff --git a/parboiled-java/src/main/java/org/parboiled/transform/GroupClassGenerator.java b/parboiled-java/src/main/java/org/parboiled/transform/GroupClassGenerator.java
index eeeb4d11..2e534681 100644
--- a/parboiled-java/src/main/java/org/parboiled/transform/GroupClassGenerator.java
+++ b/parboiled-java/src/main/java/org/parboiled/transform/GroupClassGenerator.java
@@ -58,12 +58,12 @@ abstract class GroupClassGenerator implements RuleMethodProcessor {

         Class<?> groupClass;
         synchronized (lock) {
-            groupClass = findLoadedClass(className, classLoader);
+            groupClass = findLoadedClass(className, classLoader, classNode.getParentClass());
             if (groupClass == null || forceCodeBuilding) {
                 byte[] groupClassCode = generateGroupClassCode(group);
                 group.setGroupClassCode(groupClassCode);
                 if (groupClass == null) {
-                    loadClass(className, groupClassCode, classLoader);
+                    loadClass(className, groupClassCode, classLoader, classNode.getParentClass());
                 }
             }
         }
diff --git a/parboiled-java/src/main/java/org/parboiled/transform/ParserTransformer.java b/parboiled-java/src/main/java/org/parboiled/transform/ParserTransformer.java
index b31e4f2f..6ce19861 100644
--- a/parboiled-java/src/main/java/org/parboiled/transform/ParserTransformer.java
+++ b/parboiled-java/src/main/java/org/parboiled/transform/ParserTransformer.java
@@ -33,7 +33,7 @@ public class ParserTransformer {
         checkArgNotNull(parserClass, "parserClass");
         // first check whether we did not already create and load the extension of the given parser class
         Class<?> extendedClass = findLoadedClass(
-                getExtendedParserClassName(parserClass.getName()), parserClass.getClassLoader()
+                getExtendedParserClassName(parserClass.getName()), parserClass.getClassLoader(), parserClass
         );
         return (Class<? extends T>)
                 (extendedClass != null ? extendedClass : extendParserClass(parserClass).getExtendedClass());
@@ -44,7 +44,7 @@ public class ParserTransformer {
         new ClassNodeInitializer().process(classNode);
         runMethodTransformers(classNode);
         new ConstructorGenerator().process(classNode);
-        defineExtendedParserClass(classNode);
+        defineExtendedParserClass(classNode, parserClass);
         return classNode;
     }

@@ -93,7 +93,7 @@ public class ParserTransformer {
         );
     }

-    private static void defineExtendedParserClass(final ParserClassNode classNode) {
+    private static void defineExtendedParserClass(final ParserClassNode classNode, Class<?> origClass) {
         ClassWriter classWriter = new ClassWriter(ASMSettings.FRAMES) {
             @Override
             protected ClassLoader getClassLoader() {
@@ -105,7 +105,8 @@ public class ParserTransformer {
         classNode.setExtendedClass(loadClass(
                 classNode.name.replace('/', '.'),
                 classNode.getClassCode(),
-                classNode.getParentClass().getClassLoader()
+                classNode.getParentClass().getClassLoader(),
+                origClass
         ));
     }

And following methods need to be defined in all parsers:

@BuildParseTree
class TestParser extends BaseParser<Integer> {

    public static Class<?> findLoadedClass(String className) throws IllegalAccessException {
        try {
            return MethodHandles.lookup().findClass(className);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static Class<?> loadClass(byte[] code) throws IllegalAccessException {
        return MethodHandles.lookup().defineClass(code);
    }

}

While this is far from optimal, it works :man_shrugging:

SethTisue commented 3 years ago

--illegal-access=permit will work as workaround for Java 16, however it will be removed in Java 17 (which is the upcoming LTS version

But if I understand correctly, --add-opens can be used instead?

ben5311 commented 3 years ago

@lukasraska Can you open a pull request for your proposal and send it to @sirthias ? Looks like a reasonable quick fix.

kenwenzel commented 3 years ago

@lukasraska @benjo053

Maybe it is also enough to create the MethodHandles.Lookup within the parser class

@BuildParseTree
class TestParser extends BaseParser<Integer> {

    public static MethodHandles.Lookup lookup() {
        return MethodHandles.lookup();
    }
}

and then use this lookup instance for the findClass and defineClass calls.

zhong-j-yu commented 3 years ago

(shameless plug) If anybody is going to use Java 17 (soon), take a look at Rekex (which doesn't work in Java 16 or lower:)

david-shao commented 3 years ago

Bumping this. @lukasraska Would love it if you could create a PR and send it to @sirthias ?

kenwenzel commented 3 years ago

I've just stumbled across the class definers in Guice that rely on sun.misc.Unsafe which is also available in Java 16+. Maybe such an approach could help us until the parser classes expose their own Lookup instances.

https://github.com/google/guice/blob/cf759d44c78e8490e3d54df6a27918e0811bbdf9/core/src/com/google/inject/internal/aop/HiddenClassDefiner.java

lukehutch commented 3 years ago

Since this problem also affected ClassGraph, a widely-used library I maintain, I had to find a solution to this.

I released the following JVM library for circumventing encapsulation, access controls, and security manager limitations on JDK 7-18ea+, using JNI (i.e. it includes a native code library, built for the major x86/x64 platforms):

https://github.com/toolfactory/narcissus

A collaborator, @burningwave (Roberto Gentili) build the following incredibly clever library for achieving the same thing, using Java-native mechanisms for circumventing all security (no native code, at least for the DefaultDriver, which works on JDK 7-15, whereas JDK 16-18ea require native code too):

https://github.com/toolfactory/jvm-driver

If you want to be able to choose between the two, you could use the following code from ClassGraph to allow loading either of these libraries as an optional runtime dependency:

https://github.com/classgraph/classgraph/tree/latest/src/main/java/nonapi/io/github/classgraph/reflection

I'm not sure if anyone is motivated enough to port Parboiled to this (I don't have the bandwidth), but this might be a solution.

lukasraska commented 3 years ago

@david-shao I can create PR that would force using the MethodHandles.lookup() instead.

As I'm switching to Java 17 on my projects as well, I've tested whether --add-opens still works for Java internal classes and I can confirm using --add-opens java.base/java.lang=ALL-UNNAMED still works for Parboiled, as @SethTisue suggested. So when you specify it for the runtime, everything still works as it should.

pcantrell commented 3 years ago

Adding runtime flags does create deployment & project sharing headaches, so a long-term solution hopefully won’t require them.

bgalek commented 2 years ago

@sirthias any progress on this issue? :)

bgalek commented 2 years ago

@lukasraska need any help with PR? :)

sirthias commented 2 years ago

@bgalek I'd be happy to look at and merge a PR that helps with overcoming the long-standing issues around dynamic class loading. But I'm afraid I'm too far away from the technical details (> 10 years!) to be of any help myself, unfortunately...

bgalek commented 2 years ago

@sirthias I'll try to make a PR then, I'll get back to you if I'll manage to do it!

bgalek commented 2 years ago

@sirthias @lukasraska https://github.com/sirthias/parboiled/pull/184 please check it out ;)

binkley commented 2 years ago

I have the same issue with parboiled "1.3.2-jdk17" after updating from JDK11 to JDK17

java.lang.RuntimeException: Error creating extended parser class: Could not determine whether class 'hm.binkley.dice.DiceParser$$parboiled' has already been loaded
sirthias commented 2 years ago

Just released parboiled 1.4.0 which should work fine on Java 17.

bgalek commented 2 years ago

@sirthias unfortunately this fix isn't working see latest comments https://github.com/sirthias/parboiled/pull/192

sirthias commented 2 years ago

Yes, I saw that. Bummer. But AFAICS at least adding the --add-opens java.base/java.lang=ALL-UNNAMED runtime param solves the problem, right?

bgalek commented 2 years ago

@sirthias yes, but this means any client of parboiled do need to set it in it's runtime - in my case it's like my own parser, and each client that uses it ;(

pcantrell commented 2 years ago

In my case, this causes headaches for a small armada of students each semester: I have a homework assignment which uses parboiled as a library dependency, and I have to help them make the config work…

Those different Java environment in the first list all have different ideas about where JVM options get specified, and whether they should propagate to (for example) an individual test run for the first time. This greatly complicates the project structure, requiring me to publish and maintain multiple IDE-specific configs.

If, however, parboiled did not require any JVM options, then both IntelliJ and VS Code would correctly pick up the project structure from the build.gradle — and that is thus all I would have to publish. The parboiled homework is uniquely complicated among all the many Java-based homeworks I assign.

If there’s some way to remove the need for JVM options entirely, that sure would be helpful in my case!

pcantrell commented 2 years ago

P.S. Here’s the homework in question, if anyone is curious: https://github.com/mac-comp381/wordy Parboiled does work quite nicely! Here it is in action: https://github.com/mac-comp381/wordy/blob/main/src/wordy/parser/WordyParser.java

sirthias commented 2 years ago

Thank you for the pointer, @pcantrell, it's great to see that the project is still adding value somewhere! :) Of course, it's very unfortunate, that things have been breaking so badly with the recent java releases. When parboiled was started (12 years ago!) I wasn't seeing any risks in the approach.

I'm afraid it isn't possible to fix things for newer Java versions in a way that no runtime options are required. Unless major work in invested to completely overhaul the whole thing.

To me it looks like parboiled is coming to the end of its natural lifecycle as a library. It probably makes sense for you to look for a replacement. Unfortunately.

pcantrell commented 2 years ago

Too bad! It’s been a good companion on this assignment through several iterations of the class, going back to 2018.

Are there alternatives to recommend? Parboiled still looks pretty good compared to other PEG options I’ve poked at….

kenwenzel commented 2 years ago

@sirthias Why don't you officially switch to the approach in this PR: https://github.com/sirthias/parboiled/pull/184

I think the MethodHandles approach is the way to go and it should also work in future Java versions.

sirthias commented 2 years ago

Two reasons:

  1. I don't have the capacity to invest any time into parboiled.
  2. IMHO the approach in that PR - while functional - really is an ugly hack. I couldn't really promote that as an official solution. But, of course, it might be viable for a legacy project that really just needs some solution.
kenwenzel commented 2 years ago

OK, I understand this.

Do you think the already mentioned solution used by Guice is less ugly: https://github.com/google/guice/blob/cf759d44c78e8490e3d54df6a27918e0811bbdf9/core/src/com/google/inject/internal/aop/HiddenClassDefiner.java

The downside is that it relies on sun.misc.Unsafe...

pcantrell commented 2 years ago

Remember the old engineering saying: “All it’s gotta do is work!”

I would be willing to poke at these options a bit over the next few months, see if there’s a way to make them more maintainable / less burdensome for client parsers / more resilient to future JVM changes, generate more helpful error messages for library clients, and at the very least publish the result as a fork.

kenwenzel commented 2 years ago

That sounds good :-)

I think the cleanest option would be to require the parser class to implement a static method as follows:

@BuildParseTree
class TestParser extends BaseParser<Integer> {

    public static MethodHandles.Lookup lookup() {
        return MethodHandles.lookup();
    }
}

Then Parboiled can just implement the methods from this PR https://github.com/sirthias/parboiled/pull/184 on top of the returned instance.

If Parboiled detects that the lookup() method is missing from the parser class it just throws an exception with some explanation.

bgalek commented 2 years ago

@kenwenzel this solution works partially (there are some edge cases now), but I think your idea is still an option (despite breaking the API a bit)

lukehutch commented 2 years ago

OK, I understand this.

Do you think the already mentioned solution used by Guice is less ugly: https://github.com/google/guice/blob/cf759d44c78e8490e3d54df6a27918e0811bbdf9/core/src/com/google/inject/internal/aop/HiddenClassDefiner.java

The downside is that it relies on sun.misc.Unsafe...

sun.misc.Unsafe continues to be further neutered with every new JDK release... eventually this will stop working, potentially in the next JDK release.

JNI is still an option for the foreseeable future at least. https://github.com/sirthias/parboiled/issues/175#issuecomment-938166492 But it's not ideal to have a native code library required in a project. (It's precompiled for you though, for the 4 major platforms...)

bgalek commented 2 years ago

Also, JDK 17 is LTS one, if it still supports unsafe, we could use it and at least that could buy us some time

sirthias commented 2 years ago

Ok, @kenwenzel contributed a patch moving to class loading to MethodHandles.Lookup for definition of new classes (https://github.com/sirthias/parboiled/pull/195), which I just released as version 1.4.1 to sonatype. If anyone wants to give it a spin, you are welcome to report how it works for you...

bgalek commented 2 years ago

@sirthias @kenwenzel simple tests look fine! I'll check it thoroughly next week!