utopia-rise / godot-kotlin-jvm

Godot Kotlin JVM Module
MIT License
585 stars 38 forks source link

StringName caching and "String only" api methods #499

Open CedNaru opened 11 months ago

CedNaru commented 11 months ago

Godot 4 relies a lot on StringName, a simple wrapper around a pointer to the actual String uniquely stored inside Godot itself. The idea behind this is extremely similar to the JVM String pool, unfortunately the 2 are not compatible with each other.

We now have a lot of methods in the API using StringNames instead of Strings. This change has 2 main consequences:

My suggestion is the automatic caching of Strings into StringNames whenever we call asStringName(). A new Map<String, StringName> will be added in the MemoryManager. Each time a String conversion is queried, we first check if the String is already present in the Map. If yes, we return it, if no, we call back to the C++ to create/get the StringName from its own pool.

To avoid filling the memory too much and holding now unused StringNames. I suggest for the implementation to be done using a LinkedHashMap which preserve insertion order and limit it to a specific size. The size can be configured by a JVM argument, the same way the ByteBuffer is (also related to Strings by the way). Each time, a StringName is fetched from the Map, it is removed and added back to the map to make it the "most recently accessed". When the map is full and a new StringName need to be added, we just remove the oldest one (first) in the map. This would allow an efficient way to keep StringNames around without relying on frequent JNI calls. On top of that, Strings are immutable, and their hash is only computed once, making the HashMap access fast.

The cost of that Quality of Life and optimization would be:

Even if we cache StringNames, we still require manual conversion from the users. GDScript allows the use of String and StringNames seamlessly. We can reproduce a part of that behaviour by adding a new feature to the api gen. The idea is that when any function from the api got one or several StringName argument, we also generate a version of it replacing all StringNames by Strings and then doing the conversion automatically.

In the case of function such as foo(StringName, StringName) exists, we would only generate a "String only" version of it. The result would be that only foo(String, String) would be generated, no foo(String, StringName), foo(StringName, String)

PS: This idea can also be extended to NodePath as well.

apkelly commented 10 months ago

@CedNaru I'd like to help with this issue, do you think it'd be a good entry-level one to tackle? I've done some codegen stuff IRL so adding additional outputs using KotlinPoet for the foo(String, String) seems within my skillset, as does adding a cache to the StringNames.

CedNaru commented 10 months ago

We were planning this enhancement in 2 parts. First the caching, second the api generation that makes use of the caching. All of that is relatively straightforward to implement but still requires understanding how the api generation currently works.

@piiertho is the one in charge of the API gen, you can see that with him on Discord. The subtlety here is that we were wondering about adding some "multiple passes" system in the generator, so the models used for the generation would be modified/merged/forked with each pass, simplifying the code a lot. The "String only" methods would basically become one of those passes.

apkelly commented 10 months ago

Having a quick look at this one, the cache should be pretty simple (I think).

class LRUCache<K, V>(private val capacity: Int) : LinkedHashMap<K, V>(capacity, 0.75f, true) {
    override fun removeEldestEntry(eldest: MutableMap.MutableEntry<K, V>?): Boolean {
        return size > capacity
    }
}

private val stringNameCache = LRUCache<String, StringName>(50)

fun String.asStringName(): StringName {
    return stringNameCache.getOrPut(this) {
       StringName(this)
    }
}

The code above should work, I just need to play some more with the project compilation to test it out, it then becomes a question of what size do we think is sensible for a default cache capacity.

apkelly commented 10 months ago

Re-reading the initial comment, I've moved the cache into the MemoryManager and exposed this to the String.asStringName() function, I've also cleared the cache during the MemoryManager.cleanUp() function.