utopia-rise / godot-kotlin-jvm

Godot Kotlin JVM Module
MIT License
560 stars 38 forks source link

Proposal: Change the way the MemoryManager runs. #618

Open CedNaru opened 2 months ago

CedNaru commented 2 months ago

Reminder: The MemoryManager is our central database that holds a weak reference to all Godot related object created on the JVM side, being core types, wrappers for native types or even scripts. When created, all those objects are registered to a ReferenceQueue that allows us to know when they have been collected by the GC. This callback is necessary so we can decrement the counter of RefCounted objects on the Godot side. That operation is currently done in a separate thread.

Whenever that thread is woken up by the OS's scheduler, it's checking the queue to see if there is any RefCounted instance to decrement. This design has 3 consequences:

My proposal holds in 2 points:

All in all, I think the proposal will simplify the memory management a bit and make it way more resilient.

CedNaru commented 1 month ago

Follow up after discovering new issues with our current design and profiling JNI overhead more extensively. The first sad truth is that I couldn't find a design that could totally got rid of the pingpong between strong and weak references. I can remove it from the base godot wrapper types, but not from JVM scripts. Its frequency can be reduced as well, but it will have to remain in some shape. The reason for that is why we have the pingpong in the first place. The C++ code needs a reference to the JVM Script so it can make calls to it (thought the common lifecycle callbacks like _process()). It means that as long as native C++ instance is alive, the script must remain alive too hence why we need a strong reference to it. The issue starts with the JVM side when it comes to managing refcounted instances. A JVM instance of a RefCounted acts exactly like a Ref in Godot C++ and increment its counter. It results in a cyclical reference, Godot keeps the JVM instance alive and the JVM instance keeps the C++ Refcounted alive.

I don't have this issues with regular scriptless instances, or even with scripts Object. This issue only exists for RefCounted instances with a JVM script. So far this model applied equally to wrappers and scripts, I can change it to be only for scripts. Now how to deal with that cycle ? We break by turning a strong reference into a weak one, which can only be the one used by the C++ binding. Of course, we can't just allow this reference to be weak all the time. If it was the case, then whenever a script is only used by Godot but not store directly by the Kotlin/Java code, it would be garbage collected. It means that as long as the C++ side is using the script, we can't turn it into a weak reference. So how do we know when the C++ is no longer using it ? With its counter value. When it reaches 1, we know that the only remaining reference has to be on the JVM side, it's then safe to switch to a weak reference, and let the script be naturally GCed once it's no longer used on the JVM side.

So far so good, it's a relatively simple task to handle, just use the refcounted callbacks Godot already provide and do that switch whenever the counter reaches that value of 1. Except we recently discovered a nasty side effect of that design, some cases create a high frequency pingpong. The one example I have in hand is when you use a Godot resource only referenced by the JVM side. In that situation, the C++ side only hold a weak reference to it. Now a Godot resource needs to be used somehow at some point, mostly by passing it as a parameter to a call to the C++ API and that when we kill our performances.

Let's take that bit of code from our most efficient bunnymark benchmark:

    @RegisterFunction
    override fun _draw() {
        for (bunny in bunnies) {
            drawTexture(bunnyTexture, bunny.position)
        }
    }

Here we use a basic CanvasItem::drawTexture call to draw all our bunnies on screen using a texture. This texture is a Refcounted and only referenced in our script. Now each iteration of that loop is going to send the texture to C++. Because C++ gets a reference to the texture, the counter will be incremented and a switch to a strong reference happens. Once the call is done, Godot has no more use for the texture and so doesn't store it anywhere, the counter is decremented, back to 1 and a switch to a weak reference happens. "Switching" the reference implies creating a new one of the correct type and deleting the previous one. So for each call to drawTexture(), here what happens:

Every single call to the Godot API is causing 4 calls to the JNI API, which as you probably know are expensive. How much expensive ? Well it may be a bit unfair to use a benchmark made only for the sake of stressing this particular usecase, but the numbers are quite telling nonetheless. If I choose to run the benchmark with this pingpong behavior disabled (and so creating a potential memory leak), the final score on my computer goes from barely 35k to 80k+. It's more than twice the performances.

As I explain earlier, I couldn't think of a way to get rid of the pingpong. At least not one that wouldn't require silly things like asking Kotlin users to do like in C++ and wrap all their RefCounted instances inside a Ref<> wrapper, which is a huge no for me when we are on a JVM with a GC, it's too big of an antipattern for me to accept this kind of solution.

Now I think I can still create a design that would also to make the pingpong frequency so slow that it wouldn't matter any more. The current design is to promote/demote the JNI reference at the same time as the counter increments/decrements. We need to break this model. My suggestion is to "delay" those switches as long as possible.

First, we need to identify when and how those switches happen. Fortunately, we have very little of them:

Summary: The pingpong is here to stay but its effects lessen a lot because it's only necessary for Refcounted with a JVM script. And out of 3 situations that requires a switch, 2 can be delayed and even cancelled.

CedNaru commented 1 month ago

Part 2 of the follow-up focus on the implementation details. We could implement the suggestions I mentionned above inside the current system. But this would be adding extra complexity on top of something already quite hard to understand. I have been the one designing most of the memory management for that project and still get confused at time too. We already went through 3 versions of the MemoryManager, but I think a 4th might be necessary? It'd be better to come with a new model that can naturally deal with those new requirements, instead of having to force them in. Now that the C++ code is much cleaner and JvmWrapper easier to use, we can afford to do something we couldn't before: To move some of the MemoryManager logic to the C++ side.

If our main issue is crossing the boundary too often, then we have to adapt the logic so it happens the least possible. Like stated previously, our main current issue is the expensive cost of the pingpong between weak and strong references. The reason it's expensive is that it requires JNI calls. If we can't rid of it, then why not trying to move the logic to the JVM side ? The only reason those JNI references exist is to keep JVM instances alive. As long as we have a way to keep instances alive, it actually doesn't matter if on which side it's done. It means we need some way on the JVM side to keep strong references to wrappers and scripts. The thing is that we already got something similar in the MemoryManager. Before continuing, let's enumerate the current responsibilities of the MemoryManager regarding Godot native objects (we omit core types):

Instead of relying on a strong JNI reference to keep a JVM instance, what if we were just using this giant database as a way to keep a strong reference ? We wouldn't even need to directly switch them with a weak reference when we want to break the cycle; we can simply remove them for the DB itself. You may wonder: "If we remove a JVM instance from the DB, how do we ever get it back using its ID?". The answer is simple, we don't need it. Think about it. We switch to a weak reference whenever the C++ doesn't use this instance anymore. So how would C++ even query a JVM instance from an ID it can't have in the first place ? With that in mind, we can transform the ping-pong into a simple add to/remove from the DB.

The new behaviour would be the following:

This new design also makes a serial Memory Manager easier to implement. We no longer need the Memory Manager to send its JVM instance to C++ to create a binding. Instead, the C++ side accumulates over a whole frame objects that needs to have their status changed. Then a call from C++ to the JVM is made sending the list of instances that have to be removed from the DB and returns the list of RefCounted instances whose counted must be decremented. The consequence is that, at the cost of delaying memory operations by one frame, we can properly sync Godot and JVM in a single JNI call per frame, the rest being handle by simple containers operating only in their own language.

I think all our needs are covered by that design. I hope I don't forget some sneaky details that would invalid the idea.