eclipse-jdt / eclipse.jdt.core

Eclipse Public License 2.0
158 stars 126 forks source link

Class leak in org.eclipse.jdt.internal.compiler.util.CharDeduplication$CacheReference #2643

Open danshome opened 3 months ago

danshome commented 3 months ago

On redeploy of our applications, we are seeing the following class leak in org.eclipse.jdt.internal.compiler.util.CharDeduplication$CacheReference

Screenshot 2024-06-27 at 12 23 23 PM
danshome commented 3 months ago

This is the dependency information from our build...

[INFO] | | +- org.drools:drools-ecj:jar:9.44.0.Final:provided [INFO] | | | - org.eclipse.jdt:ecj:jar:3.33.0:provided

iloveeclipse commented 3 months ago

@danshome : is this a regression, and if yes, in which version it was working last?

@jukzi : could you please check, the original code seem to be coming from https://github.com/eclipse-jdt/eclipse.jdt.core/commit/9525f3eec57b403f918771025f4280bba56b1221

danshome commented 3 months ago

@iloveeclipse I honestly don't think I can say if this is a new issue, but we've been noticing this issue for quite some time. The only reason I found it was because we have been reviewing our own code to fix class leaks. I am unable to fix this one because the thread local is private...unless I were to attempt to fix it using reflection, which I usually only do as a last resort. And I'm not even sure if it's possible now that we are on JDK 21 because I don't think you can access privates anymore.

jukzi commented 3 months ago

That class does not leak. it holds references to last used tokens for later reuse on purpose. jdt always worked like that. its a trade off between memory and perfromance.

iloveeclipse commented 3 months ago

After redeployment (means: classloader change) the class should have been unloaded from memory, which is not the case because classloader can't be unloaded because of that perticular class. So it is a leak. Reopening.

danshome commented 3 months ago

@jukzi Are you saying we can disregard the increase in LoadedClassCount upon redeployment? Do you anticipate that the JVM will automatically clean up once the memory is full? Despite undeploying all applications and forcing a Full GC using gcRun and gcRunFinalization through the mbeans, MAT still indicates that classes are being leaked.

danshome commented 3 months ago

I'm not sure if this will work, but I'm testing this change locally to see if a Cleaner can be used to clean up the ThreadLocal reference automatically when the CharDeduplication instance is no longer in use.

Index: org.eclipse.jdt.core.compiler.batch/src/org/eclipse/jdt/internal/compiler/util/CharDeduplication.java
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/org.eclipse.jdt.core.compiler.batch/src/org/eclipse/jdt/internal/compiler/util/CharDeduplication.java b/org.eclipse.jdt.core.compiler.batch/src/org/eclipse/jdt/internal/compiler/util/CharDeduplication.java
--- a/org.eclipse.jdt.core.compiler.batch/src/org/eclipse/jdt/internal/compiler/util/CharDeduplication.java (revision 975e7c92120f1098992ebc5dc32a4ed41e522083)
+++ b/org.eclipse.jdt.core.compiler.batch/src/org/eclipse/jdt/internal/compiler/util/CharDeduplication.java (date 1719517115216)
@@ -15,11 +15,14 @@
  *******************************************************************************/
 package org.eclipse.jdt.internal.compiler.util;

+import java.lang.ref.Cleaner;
 import java.lang.ref.SoftReference;
 import java.util.Arrays;

 public class CharDeduplication {

+   private static final Cleaner cleaner = Cleaner.create();
+
    // ----- immutable static part (thread safe): ----

    private final static char[] CHAR_ARRAY0 = new char[0];
@@ -35,13 +38,18 @@
    public static final int SEARCH_SIZE = 8; // a power of 2, has to be smaller then TABLE_SIZE

    private final static ThreadLocal<SoftReference<CharDeduplication>> mutableCache = ThreadLocal
-           .withInitial(() -> new SoftReference<>(new CharDeduplication()));
+           .withInitial(() -> {
+               CharDeduplication instance = new CharDeduplication();
+               cleaner.register(instance, instance.new CleanupAction());
+               return new SoftReference<>(instance);
+           });

    /** @return an instance that is *not* thread safe. To be used in a single thread only. **/
    public static CharDeduplication getThreadLocalInstance() {
        CharDeduplication local = mutableCache.get().get();
        if (local == null) {
            local = new CharDeduplication();
+           cleaner.register(local, local.new CleanupAction());
            mutableCache.set(new SoftReference<>(local));
        }
        return local;
@@ -137,4 +145,11 @@
        }
        return true;
    }
+
+   private class CleanupAction implements Runnable {
+       @Override
+       public void run() {
+           mutableCache.remove();
+       }
+   }
 }
jukzi commented 3 months ago

the cache intentional survives references to CharDeduplication . it's a threadlocal: get rid of that thread to get rid of it.

iloveeclipse commented 3 months ago

get rid of that thread to get rid of it

Often the applications have no control on which threads particular code will run. Think about shared / common thread pools etc. On application redeployment the class reference leaks and holds entire classloader with all remaining classes / data loaded by classloader. So one should make sure the class reference is removed from the threadlocal / thread if not more needed.

danshome commented 3 months ago

I couldn't find any way to remove the threadlocal from our application because it's private final static ThreadLocal. I believe this must be handled within the JDT code itself.

jukzi commented 3 months ago

The current implementation should not be a problem. see https://stackoverflow.com/a/24862045/9549750 If you still think you have a real problem please provide a minimal viable reproducer.

danshome commented 3 months ago

@jukzi I will verify, but I'm quite certain that even if I undeploy all our applications, perform a full garbage collection, and then take a heap dump, it will still show a class leak without any apps deployed and the stack trace doesn't include any of our code.

danshome commented 3 months ago

@jukzi @iloveeclipse

After reading https://stackoverflow.com/a/24862045/9549750, I found that this is exactly what is happening, but it's in the CharDeduplication$CacheReference, which has a static threadlocal. We don't have any control over this since its in the JDT code. When the web application is shutdown, this is not getting released.

private final static ThreadLocal<SoftReference> mutableCache = ThreadLocal .withInitial(() -> new SoftReference<>(new CharDeduplication()));

jukzi commented 3 months ago

there is no CacheReference anymore. update your jdt

iloveeclipse commented 3 months ago

there is no CacheReference anymore. update your jdt

@danshome : what Jörg means: please check you use latest released version of ecj, which should be 3.38, you are running on 3.33. Still, the code uses static ThreadLocal so probably it will leak too. But it is better you could verify to avoid everyone wasting time here. Note, there will be no maintenance releases of 3.33 ecj anyway, so if bug is still there, it would be fixed only in the next regular ecj release.