Memory leaks in the TransferContext class

Right now we have the following code in TransferContext

const int MAX_STACK_SIZE = MAX_FUNCTION_ARG_COUNT * 8;

thread_local static Variant variant_args[MAX_STACK_SIZE];// NOLINT(cert-err58-cpp)
thread_local static const Variant* variant_args_ptr[MAX_STACK_SIZE];
thread_local static int stack_offset = -1;

The reason for that is that we need to store the variables somewhere once they got parsed from the shared buffer. Godot calls take an array of Variant pointers, so we use the second array to store the pointers as an argument. Now, the reason for the stack size is that a lot of JVM <=> C++ exchanges can happen in a single call. (Jvm calling cpp calling Jvm calling cpp caling etc...). Because the buffer is overwritten each time, we can't keep the data from the first layer of call in the shared buffer, hence the long stack to store them. We have this dynamic for every thread. This design has leaks because data are kept in those Variant arrays until overwritten. Let's say you make an icall with a lot of parameters (let's say 10), when most icalls are not going to go above 4. It means that the 6 Variants after that will remain in memory. That's not an issue when storing a value based Variant (like Vector2) but it is when storing RefCounted or even pointer based Variant like Dictionary (with their own atomic counters). They will never be freed as long as they are in the stack.

Even worse, they can be the cause of leaks when terminating the program, the TransferContext destructor looks like this:

TransferContext::~TransferContext() {
    for (auto& variant_arg : variant_args) {
        variant_arg = Variant();
    }
}

We free everything in the array but it can only happen for the Array specific to the main thread. All the arrays related to other threads will not be freed and cause memory leaks that the ObjectDB will report.

We have 2 ways to fix that:

Manually clear the whole array after every icall (performance cost because we have to erase previous values by replacing them with newly constructed Nil Variant, to be benchmarked).
Entirely rework the way we do icalls.

The first option is self-explanatory. The second is quite more complex. So far we have been using MethodBind::call() which require aVariant**. Godot has a second method MethodBind::ptrcall() which require a void**. This method is said to be more efficient according to the comments next to it. MethodBind being typed with templates, it can directly reinterpret the pointers to their correct type. Doing so would avoid using an intermediary Variant. So far the data conversions have been the following: SharedBuffer -> Core Type -> Variant -> MethodBind::call() -> Core type. By using MethodBind::ptrcall(); it could become: SharedBuffer -> Core Type -> MethodBind::call() -> No Conversion needed in the function.

We would still keep a stack like today, except it would not be Variant[] anymore but a generic typeless stack. The main difference would be that we will only store the value based core type in it. For the other pointer based types, we allocated a native memory slot when they were created on the JVM side already (which is something we also have to optimize with PoolAllocators) so we can use the same pointer for the MethodBind. The consequence would be to not keep those refcounted types in the stack, instead they would be managed by the Kotlin Memory Manager (like they already are). I won't lie, this rework would be complicated to implement and test.

But our 2 choices are either :

Simple but a bit less efficient than today
Complex but more efficient than today.

The gain and loss of each are of course hard to know before actually trying.

utopia-rise / godot-kotlin-jvm

Memory leaks in the TransferContext class #615