JetBrains / JetBrainsRuntime

Runtime environment based on OpenJDK for running IntelliJ Platform-based products on Windows, macOS, and Linux
GNU General Public License v2.0
1.35k stars 198 forks source link

JBR-7579 Fix SurfaceManager.cacheMap retaining strong references. #453

Closed YaaZ closed 1 month ago

YaaZ commented 2 months ago

I'm not sure that having value as a WeakReference is the right solution. However in our case, leaked MTLGraphicsContext is reachable from both key and value, like this: key -> MTLGraphicsConfig value -> MTLSurfaceDataProxy -> MTLGraphicsConfig But I don't think keeping MTLGraphicsConfig as a weak ref inside MTLSurfaceDataProxy is more correct either.

When I come with this patch to OpenJDK, they will ask me: what is the point of this cache, when its values must be duplicated elsewhere in order to not be collected?

So the ideal solution I would want is something like this:

But I cannot currently figure out a way to do this.

bourgesl commented 2 months ago

Looks like a great puzzle on java references... Could you elaborate who is responsible to keep alive this references ? Why not both weak: weakhashmap + weak value if other objects retain strong refs... I can maybe help here...

YaaZ commented 2 months ago

So originally GraphicsConfigs were getting stuck in this cache, preventing them from being destroyed.

GraphicsConfigs are usually used as keys in this map, so the first thing we need to do is to make the map a WeakHashMap. But this alone is not enough, because as values we store SurfaceDataProxies, which in turn have a reference to the same GraphicsConfig, thus retaining a strong reference to it. There's even a note about this in the documentation of WeakHashMap:

Implementation note: The value objects in a WeakHashMap are held by ordinary strong references. Thus care should be taken to ensure that value objects do not strongly refer to their own keys, either directly or indirectly, since that will prevent the keys from being discarded. Note that a value object may refer indirectly to its key via the WeakHashMap itself; that is, a value object may strongly refer to some other key object whose associated value object, in turn, strongly refers to the key of the first value object. If the values in the map do not rely on the map holding strong references to them, one way to deal with this is to wrap values themselves within WeakReferences before inserting, as in: m.put(key, new WeakReference(value)), and then unwrapping upon each get.

So I wrapped values into WeakRefs. However now this means that when device is stll alive, but there are no more strong refs to the SurfaceDataProxy, it will be GCed, which may be undesirable (?).

So the main question is: is there a code, which relies on this cache keeping SurfaceDataProxies alive, or are they already strongly referenced as long as they needed and there is nothing to worry? Does this generally apply to any kind of data that can be stored in that cache? The only use case I could find is that GraphicsConfig + SurfaceDataProxy one, but I may have missed something.

I will investigate it a bit more and post an update.

YaaZ commented 2 months ago

Even if we try to keep GraphicsConfig as a weak ref inside all implementations of SurfaceDataProxy, it will still be retained via SurfaceDataProxy.cachedSD. We cannot make it weak as well, because it basically "owns" the SurfaceData and should not be collected as long as SurfaceDataProxy is reachable. Descending deeper into SurfaceData and getting rid of strong refs to GraphicsConfiguration there is just too much of refactoring, and too many places where something can go wrong.

I would say, we have stumbled upon limitations of WeakHashMap implementation and it doesn't seem currently possible to implement it "the right way", that is, retain value as long as key is reachable, but do not let it prevent the key from being collected. This would require some GC magic to collect the whole node at once, like making value a weak ref, but adding a virtual "key->value" edge into reachability graph.

But as a workaround, key can itself serve as a node - "physically" contain corresponding values. This way strong refs from the value would not prevent the key from being collected. I will investigate this approach.

bourgesl commented 2 months ago

Reference can have their own ReferenceCleaner... so you could handle objects when the reference becomes weakly reachable. I used special Cleaner in marlin to free reliabilly off-heap resources, see: https://github.com/bourgesl/marlin-renderer/blob/jdk/src/main/java/sun/java2d/marlin/OffHeapArray.java

YaaZ commented 2 months ago

But Cleaner wouldn't help, because GraphicsConfiguration would never become weakly reachable due to the way this cache is organized.

bourgesl commented 2 months ago

Maybe you could introduce a special object, like GraphicsConfigRef ou SurfaceRef that could be handled by special cleaner... the big question here is: who (which object) is supposed to make (keep) surface & graphicsConfig alive...

YaaZ commented 2 months ago

Now I'm pretty sure this is the best solution.

  1. I moved proxy cache into a separate object. Now caching level is determined by placement of this cache object instead of a "proxy key". This cache "owns" proxies and therefore strong refs via them do not prevent the cache (and hence GraphicsConfig) from being detected as weakly reachable.
  2. I removed a global JNI ref to the GraphicsConfig from native MTLSurfaceData, it was not needed there. This code was blindly copied from the OGL implementation and therefore this bug can affect OGL too, I will investigate it later.