Team-EnderIO / EnderIO

EnderIO Rewritten for Modern Minecraft.
The Unlicense
321 stars 92 forks source link

[1.20.1] Do not add duplicate invalidation listeners while caching machine neighbor capabilities #682

Closed rezonant closed 3 weeks ago

rezonant commented 3 weeks ago

Description

Fixes #680. Targets 1.20.1 specifically.

Machine block entities add invalidation listeners to discovered neighbor capabilities as they enter the block's capability cache for the purpose of ensuring the machine does not keep a capability around after it is removed from the neighbor block. However machines do not track whether they have already seen a given capability, and add a new invalidation listener each time the neighbor capability cache is rebuilt. Unfortunately this produces a memory leak that can, over time, cause millions of callbacks to be added to the LazyOptional<T> instance for a given capability. Over time the leak will cause the GC to thrash, causing sporadic TPS drops and/or client/server lockups.

Interestingly, Forge does not provide a removeListener() on LazyOptional<T>, meaning that technically speaking every single usage of addListener() across all mods is an unavoidable memory leak assuming the LazyOptional<T> is long lived, which is normal. Additionally, there may be issues with invalidation working properly for some types of capabilities based on my discussions with community members at NeoForge, but a full accounting of those limitations will require an additional investigation.

This fixes the issue by maintaining a WeakHashMap of the LazyOptional<T> instances obtained via the capabilities system, and only attaching an invalidation listener if the capability was not already present in the hash map. The benefits of using WeakHashMap here (even though the value of the hash map is irrelevant) are that there is no "culling" of stale references to manage, as that is handled automatically by the JVM.

Before this change, I was able to cause an -Xmx2G JVM to crash in an hour or so with a capacitor bank setup. Capacitor banks cause lots of cache invalidations as the energy they contain changes due to normal I/O, so the issue is particularly pronounced for them, even though this issue does affect all EnderIO machines. After this change, the same setup remained well under the 2GB limit for 24 hours, suggesting that the memory leak is solved.

It is certainly not ideal that energy updates cause the cache to be invalidated at all and improving that should certainly be investigated, but given that the capabilities system is overhauled in NeoForge 1.20.3 and up and EIO has 1.20.1 as a long term support branch, it seems prudent to fix this with the simplest possible fix and target further improvements (both to caching and to capacitor banks) in the current development branch (1.20.6).

Breaking Changes

None.

Checklist