jakartaee / expression-language

Jakarta Expression Language
https://eclipse.org/ee4j/el
Other
70 stars 50 forks source link

BeanELResolver cache causes Class Loader leaks in app servers #214

Closed lprimak closed 2 months ago

lprimak commented 5 months ago

Currently, BeanELResolver caches Class elements in its SoftReference maps. This causes it's ClassLoader to be referenced in the cache. BeanELResolver internal SoftReference map only gets cleaned up via cleanup() method, which doesn't necessarily get called often enough to prevent Out-Of-Memory errors. The other issue is that, even when cleanup() is called enough to prevent OOMs, the 'danglng' ClassLoaders cause VM to run at it's max memory capacity, causing unnecessary GC and memory allocation thrashing.

clearProperties() method is needed to clear the cache from the ClassLoaders and thus prevent the leak

lprimak commented 3 months ago

There is no way that I can think of to provide "automatic" scoping / cache invalidation. Either SoftReference will use maximum amount of VM memory causing memory and GC thrashing, or WeakReference will clear the cache with every GC.

My conclusion is to keep SoftReference and add clearProperties() method as in #215

lprimak commented 3 months ago

@arjantijms My assumption is that this is an issue in GlassFish as well? @starksm64 @bstansberry I see that IBM / RH use a fork of EL because of this issue?

I am sure that you guys would like to see a fix so you can start using "stock" version of EL at some point instead of needing to fork?

Thank you

bstansberry commented 3 months ago

@lprimak WildFly and JBoss EAP have long used our own variant of the EL API. We could likely stop doing that if our bean properties and factory finder caching use cases discussed in #218 were both handled in this project.

That said, it sounds like the need for the bean properties pieces is because Faces is holding a static ref to a BeanELResolver.

The WildFly project and the downstream Red Hat JBoss EAP product are produced independently of other EE implementations from IBM. So my comment here should not be interpreted as saying anything about IBM or any of its EE impls like Websphere, Liberty etc.

lprimak commented 3 months ago

Thanks, Brian,

It sounds like BeanPropertiesCache is a non-starter because of java.beans dependency. However, once clearProperties() is added (#215 is merged) that should solve the BeanPropertiesCache case.

Also sounds like FactoryFinderCache can be merged into EL as is.

Am I understanding things correctly?

OndroMih commented 2 months ago

I don't think that adding method to clear the cache in the API is a good idea. Actually, I don't even think that having a caching mechanism in the API is a good idea. It would be better to introduce an SPI to plug in a caching mechanism. Then users could provide a mechanism which could have have a clearProperties() method or any other method, and wouldn't spoil the API.

@lprimak, @bstansberry, @markt-asf, @arjantijms, What do you think about providing an SPI to plug in a caching mechanism and remove the current caching mechanism from BeanELResolver? It should also address https://github.com/jakartaee/expression-language/issues/218 and allow vendors to provide their own caching for BeanPropertiesCache + FactoryFinderCache.

lprimak commented 2 months ago

215 is a pragmatic way of fixing the problem as it exists today. Since we already have a cache in the API today.

When the caching is removed, the clearProperties() method will be removed with it.

Mojarra may not be the only module where this issue exists. My proposal fixes all instances of this problem, without having to wait and integrate for new version of Mojarra, or any other module that has this issue. Nor does this fix rely on finding all the leaks that potentially exist because of this cache.

It's 5 lines of code, IMHO really not worth dissecting this much.

What do you think about providing an SPI to plug in a caching mechanism and remove the current caching mechanism

Fantastic idea, very well inline with Jakarta EE way of doing things

OndroMih commented 2 months ago

It's 5 lines of code, IMHO really not worth dissecting this much.

I think these 5 lines of code would require a minor release of the EL spec. That might take some time, even longer time than fixing Mojarra and wait for the next release of Mojarra. And, in order to pass the signature tests for Jakarta EE 11, I doubt that application servers would adopt it before EE 12, which will be released even much later. In my opinion, "Pragmatic" fixes like this don't belong to Jakarta EE specs. Then it's better to for the API and implement the fix there.

I think that, if you want a quick pragmatic fix, fixing Mojarra is the way. Here is a fix from me: https://github.com/eclipse-ee4j/mojarra/pull/5462

lprimak commented 2 months ago

Ok, Ondro, you've convinced me. I didn't fully realize I was working on a public API since it looks so much like implementation. You are right that change in API would trigger a whole bunch of issues that are too broad for this "pragmatic" fix.

How can we put separation of implementation details (including caching) from EL API into Implementation / SPI on a critical path for EE 12?

markt-asf commented 2 months ago

Replaced by #248