google / guice

Guice (pronounced 'juice') is a lightweight dependency injection framework for Java 11 and above, brought to you by Google.
https://github.com/google/guice
Apache License 2.0
12.48k stars, 1.67k forks

Possible leak using Guice #1755

Closed by lucianoRM 1 year ago

lucianoRM commented 1 year ago

Hello! We are using Guice as a DI Framework for a backend microservice. Recently I've been tracking the status of our service and it looks like we have some kind of classloading leak (Metaspace keeps increasing).

HOW WE ARE USING IT

Our service is in charge of running a process from within a provided jar. That process leverages Guice so that required dependencies are injected. To do so, we create a new ClassLoader (ChildFirstClassLoader) for every new process to be executed. The classloader hierarchy ends up being:

Basically, every class in the loaded jar is loaded by the ChildFirstClassLoader (once for every new jar), while all of Guice's classes are loaded by the ApplicationClassLoader, only once.
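
For readers unfamiliar with the pattern, a child-first loader can be sketched roughly as follows. This is a minimal illustration, not the poster's actual ChildFirstClassLoader (which is not shown in the thread): it assumes the loader extends URLClassLoader, looks in the jar first, and falls back to the parent for anything the jar does not contain (for example Guice and the JDK classes).

```java
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical child-first loader; the real implementation is not part of this thread. */
final class ChildFirstClassLoader extends URLClassLoader {

    ChildFirstClassLoader(URL[] jarUrls, ClassLoader parent) {
        super(jarUrls, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Reuse a class this loader has already defined.
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                try {
                    // Child first: look in the jar URLs before asking the parent.
                    loaded = findClass(name);
                } catch (ClassNotFoundException notInJar) {
                    // Not in the jar: use the normal parent delegation.
                    loaded = super.loadClass(name, false);
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }
}
```

With this shape, every class defined from the jar keeps a reference to its ChildFirstClassLoader, so a single strongly reachable class or instance from the jar is enough to keep the whole loader (and its Metaspace) alive.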

WHAT I AM SEEING

I took a heap dump of the JVM and, indeed, I am seeing a lot of instances of our ChildFirstClassLoader with strong references, even after those processes have ended. If I check the paths to GC roots, I see a lot of the following: [image: paths to GC roots]

So, if I am reading the information properly, every time BytecodeGen.fastClass() is called, it ends up storing a reference to an anonymous class, com.google.inject.internal.aop.AbstractGlueGenerator$$Lambda$2202+0x00000008021e6278, which internally holds a reference to my class (or a proxy created by Guice) that was loaded by the ChildFirstClassLoader.

Looks like the anonymous class from com.google.inject.internal.aop.AbstractGlueGenerator is a Function<> returned from the method: bindSignaturesToInvokers(). I think the call stack is something like:

Can you help me understand if I am reading this correctly? What could be a possible solution to this?

Let me know if you need more information or if you find any errors in my analysis.

Just in case it's relevant, we are using Guice version 5.1.0

Thanks!

mcculls commented 1 year ago

The generated glue held inside that ClassValue should be unloaded when the host class (ie. the application class) is unloaded, which is when its class-loader is able to be unloaded. I can double-check with a simple test app, but it sounds like something else is keeping your child class-loaders alive, which then keeps those generated fast-classes alive.
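
As an illustration of the mechanism mentioned above, here is a minimal sketch of a per-class cache built on java.lang.ClassValue. The GlueCache class and the string it stores are hypothetical stand-ins, not Guice's internals; the point is only that the cached value is associated with the host class rather than held in a global map keyed by that class.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-class cache; a stand-in for generated glue, not Guice's actual code. */
final class GlueCache {
    // Counts how often a value had to be computed (for demonstration only).
    static final AtomicInteger COMPUTATIONS = new AtomicInteger();

    // The value is computed lazily on first get() for a given class
    // and then associated with that class.
    static final ClassValue<String> GLUE = new ClassValue<String>() {
        @Override
        protected String computeValue(Class<?> hostClass) {
            COMPUTATIONS.incrementAndGet();
            return "glue-for-" + hostClass.getName();
        }
    };

    static String glueFor(Class<?> hostClass) {
        return GLUE.get(hostClass);
    }
}
```

Calling GlueCache.glueFor(SomeClass.class) twice on one thread computes the value once and returns the cached value the second time. Because the entry hangs off the host class, it is expected to go away together with that class, which in turn requires the class's loader to be unreachable.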

Is your code publicly available?

Note if you don't need bytecode generation then you can turn it off with:

-Dguice_bytecode_gen_option=DISABLED

but that would also mean you couldn't do method interception.

lucianoRM commented 1 year ago

Sorry, the code is not public.

The only strong references I am seeing are from Guice. But, maybe once the weak/soft references are cleaned, since Guice's references will be the only ones, they will be cleaned as well? I could check that.
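
One way such a check could look, as a rough sketch: wrap the class loader in a WeakReference, drop every strong reference to it, and see whether the reference is cleared after a few GC requests. The UnloadCheck helper below is hypothetical, and System.gc() is only a hint to the JVM, so a false result is suggestive rather than conclusive. Objects that are only softly reachable may also survive until the JVM is under memory pressure.

```java
import java.lang.ref.WeakReference;

/** Hypothetical helper for probing whether an object (e.g. a class loader) can be collected. */
final class UnloadCheck {

    /**
     * Requests GC up to 'attempts' times and reports whether the referent was cleared.
     * The caller must not hold any strong reference to the referent while this runs.
     */
    static boolean becomesCollectable(WeakReference<?> ref, int attempts)
            throws InterruptedException {
        for (int i = 0; i < attempts && ref.get() != null; i++) {
            System.gc();      // a request only; the JVM may ignore it
            Thread.sleep(50); // give the collector a moment to run
        }
        return ref.get() == null;
    }
}
```

A typical use would be to create `new WeakReference<>(childLoader)` right after building the loader, run the process, close the loader, null out the local variables, and then call becomesCollectable(ref, 10). If it keeps returning false, a heap dump restricted to strong references is the next step.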

In which scenario would we be using method interception?

We only have a bunch of Modules either binding Interfaces to Implementations (in the configure() method) or defining a @Provides method.

As for injection, we do field and constructor injection with @Inject.

mcculls commented 1 year ago

Part of the problem with unloading is that a lot is held back until the class-loader itself becomes unloadable, which does make it hard to debug - we do have tests in the codebase to verify proxy unloading, which uses a similar mechanism, and those are currently passing.

Regarding method interception - that's if you are using the bindInterceptor method: https://github.com/google/guice/wiki/AOP

If you're not using that then try adding -Dguice_bytecode_gen_option=DISABLED as a JVM option to see if it helps.
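
If changing the JVM command line is inconvenient, the same property could in principle be set programmatically. The sketch below assumes that Guice reads guice_bytecode_gen_option once, when its internal classes are first initialized, so the call would have to run before any Guice class is touched (for example as the first statement of main). The GuiceFlags helper is hypothetical and uses only the property name and value quoted in this thread.

```java
/** Hypothetical launcher helper; not part of Guice. */
final class GuiceFlags {
    // Property name as quoted in this thread.
    static final String BYTECODE_GEN_PROPERTY = "guice_bytecode_gen_option";

    /**
     * Sets the property to DISABLED unless it was already supplied (e.g. via -D),
     * and returns the value that is in effect afterwards.
     * Assumption: this must run before Guice classes are loaded to have any effect.
     */
    static String disableBytecodeGenUnlessSet() {
        if (System.getProperty(BYTECODE_GEN_PROPERTY) == null) {
            System.setProperty(BYTECODE_GEN_PROPERTY, "DISABLED");
        }
        return System.getProperty(BYTECODE_GEN_PROPERTY);
    }
}
```

Passing -Dguice_bytecode_gen_option=DISABLED on the command line remains the simpler and safer route, since it does not depend on class initialization order.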

lucianoRM commented 1 year ago

Awesome! I will try to figure out if there is anything else preventing the class loader from being unloaded.

In the meantime, I will try using that flag, since I don't think I require method interception. Maybe it will help me isolate the root problem.

Thanks for the quick response!

lucianoRM commented 1 year ago

Following up on this.

Using -Dguice_bytecode_gen_option=DISABLED definitely worked. I no longer see those paths to GC roots. What's more, the service seems to be working just fine, so I guess we can execute without method interception.

However, based on the information you provided, those paths should be cleared even when bytecode generation is enabled. So, I will keep searching for another cause for this issue. I already have a lead with a lib we are using.

Closing this ticket for now.

Thanks a lot for the help!