Closed snazy closed 1 week ago
So, good news first: there are
But there are ThreadLocal
s leaking, which are loaded by the Quarkus test class-loaders, which then cause "all the classes" and resources to leak. Some classes that add ThreadLocal
s are the following. There are more - but the ones that are initialized with a Supplier
seem to be the problematic ones.
io.netty.util.internal.InternalThreadLocalMap
(cannot modify)org.projectnessie.cel.parser.Unescape
org.apache.avro.Schema
(cannot modify)Discussion on Quarkus Zulip
The leak's caused by ThreadLocal
s attached by 3rd party libraries (Netty, Avro and some more) to the shared JUnit Test worker
thread.
In JUnit, tests are run from the single Test worker
thread (or a poll of threads, if JUnit is configured to do so). The QuarkusTestExtensions
changes the thread's context-class-loader to use the correct Quarkus class loader (QuarkusClassLoader:Quarkus Base Runtime ClassLoader: TEST for ...
).
If a test somehow "touches" a dependency that creates a thread local, that thread local is attached to the Test worker
thread. Types in such a thread local reference the Quarkus test class loader.
Even though everything else is properly cleaned up before the next test w/ a different Quarkus test class loader starts, the thread locals are still referenced by the Test worker
thread, which means that the objects/suppliers in those thread locals are still referenced, in turn the class loader and with that the whole set of classes. The result is eventually an OOM.
Here's an output that (should) illustrate the issue. It's generated by some hacky code that inspects the Test worker
thread's threadlocals
map.
THREAD LOCALS OF Thread[#1,Test worker,5,main]
jdk.internal.loader.ClassLoaders$PlatformClassLoader@42f5e9a6
jdk.nio.zipfs.ZipFileSystem$IndexNode /org/assertj/core/error/UnsatisfiedRequirement.class , index: -1
QuarkusClassLoader:Quarkus Base Runtime ClassLoader: TEST for TestMetricsDisabled [V1] (QuarkusTest)@69b07727
org.apache.avro.NameValidator$2 org.apache.avro.NameValidator$2@c8f14da
org.apache.avro.Schema$$Lambda/0x00007feccca70000 org.apache.avro.Schema$$Lambda/0x00007feccca70000@5082ae8
org.projectnessie.cel.parser.Unescape$$Lambda/0x00007feccce6af98 org.projectnessie.cel.parser.Unescape$$Lambda/0x00007feccce6af98@2f992272
QuarkusClassLoader:Quarkus Base Runtime ClassLoader: TEST for TestMetricsTags [V1] (QuarkusTest)@5cf93ce8
io.netty.util.internal.InternalThreadLocalMap io.netty.util.internal.InternalThreadLocalMap@45d1828
org.apache.avro.NameValidator$2 org.apache.avro.NameValidator$2@eecf044
org.apache.avro.Schema$$Lambda/0x00007feccd87fd10 org.apache.avro.Schema$$Lambda/0x00007feccd87fd10@69e3d304
org.projectnessie.cel.parser.Unescape$$Lambda/0x00007feccdc9a5a0 org.projectnessie.cel.parser.Unescape$$Lambda/0x00007feccdc9a5a0@3dd96fb3
jdk.internal.loader.ClassLoaders$AppClassLoader@76ed5528
org.jboss.logmanager.ThreadLocalNDC$Stack
QuarkusClassLoader:Quarkus Base Runtime ClassLoader: TEST for TestQuarkusRestInMemoryPersist [V1] (QuarkusTest)@681e913c
org.apache.avro.NameValidator$2 org.apache.avro.NameValidator$2@69d2fc03
org.apache.avro.Schema$$Lambda/0x00007feccb96ec18 org.apache.avro.Schema$$Lambda/0x00007feccb96ec18@423a5c9c
org.projectnessie.cel.parser.Unescape$$Lambda/0x00007feccbe30fa8 org.projectnessie.cel.parser.Unescape$$Lambda/0x00007feccbe30fa8@2ef77fcc
Just documenting the extremely hacky code used to identify the issue...
public class TempBase {
@BeforeAll
public static void foo() throws Exception {
System.out.println();
System.out.println();
System.out.println("*** START UP");
System.out.println();
System.out.println();
dump();
}
@AfterAll
public static void cleanup() throws Exception {
System.out.println();
System.out.println();
System.out.println("*** CLEAN UP");
System.out.println();
System.out.println();
dump();
}
private static void dump() throws Exception {
Map<ClassLoader, List<Thread>> threadsByClassLoader = new IdentityHashMap<>();
Thread t1 = null;
for (Thread thread : Thread.getAllStackTraces().keySet()) {
if (thread.threadId() == 1) {
t1 = thread;
}
threadsByClassLoader
.computeIfAbsent(thread.getContextClassLoader(), cl -> new ArrayList<>())
.add(thread);
}
threadsByClassLoader.forEach(
(cl, t) -> {
if (cl == null) {
return;
}
System.out.println(cl);
t.sort(Comparator.comparing(Thread::getName).thenComparing(Thread::getId));
for (Thread thread : t) {
System.out.println(" " + thread.threadId() + " " + thread.getName());
}
});
if (t1 != null) {
dumpThreadLocals(t1);
}
Thread t = Thread.currentThread();
if (t != t1) {
dumpThreadLocals(t);
}
}
private static void dumpThreadLocals(Thread t)
throws NoSuchFieldException, IllegalAccessException {
Map<ClassLoader, List<Object>> threadLocalsByClassLoader = new IdentityHashMap<>();
Field threadLocalsField = Thread.class.getDeclaredField("threadLocals");
threadLocalsField.setAccessible(true);
Object threadLocals = threadLocalsField.get(t);
Field tableField = threadLocals.getClass().getDeclaredField("table");
tableField.setAccessible(true);
Object[] entries = (Object[]) tableField.get(threadLocals);
for (Object entry : entries) {
if (entry != null) {
Field valueField = entry.getClass().getDeclaredField("value");
valueField.setAccessible(true);
Object value = valueField.get(entry);
if (value != null) {
threadLocalsByClassLoader
.computeIfAbsent(value.getClass().getClassLoader(), cl -> new ArrayList<>())
.add(value);
Field referentField = Reference.class.getDeclaredField("referent");
referentField.setAccessible(true);
Object referent = referentField.get(entry);
if (referent != null
&& referent
.getClass()
.getName()
.equals("java.lang.ThreadLocal$SuppliedThreadLocal")) {
Field supplierField = referent.getClass().getDeclaredField("supplier");
supplierField.setAccessible(true);
Object supplier = supplierField.get(referent);
if (supplier != null) {
threadLocalsByClassLoader
.computeIfAbsent(supplier.getClass().getClassLoader(), cl -> new ArrayList<>())
.add(supplier);
}
}
}
}
}
System.out.println("THREAD LOCALS OF " + t);
threadLocalsByClassLoader.forEach(
(cl, objs) -> {
if (cl == null) {
return;
}
System.out.println(cl);
objs.sort(
Comparator.comparing(o -> o.getClass().getName())
.thenComparing(System::identityHashCode));
for (Object obj : objs) {
System.out.println(" " + obj.getClass().getName() + " " + obj);
}
});
}
}
Related JUnit issue
Added a commit to #9433 to increase the max test-worker heap size to 3g. This makes the
:nessie-quarkus:test
task finish.There is a resource leak, but that's already present with earlier Quarkus versions, so I suspect something that's present in all our tests is causing the leak, #9440 is one cause, but not the only one.
This issue is to investigate, find the root cause, fix it and bring the max test-worker heap size back to 2g.