utopia-rise / godot-kotlin-jvm

Godot Kotlin JVM Module
MIT License

Fatal Error in native method: bad global or local ref passed to JNI #617

Open MartinHaeusler opened 2 months ago

MartinHaeusler commented 2 months ago

The following happened randomly to me today while test playing my game:

FATAL ERROR in native method: Bad global or local ref passed to JNI
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x9b28bf]  ReportJNIFatalError+0x2f
V  [libjvm.so+0x9bbc6a]
V  [libjvm.so+0x9be5e6]  checked_jni_DeleteGlobalRef+0x96
C  [godot.linuxbsd.editor.x86_64+0xca05da]

Unfortunately that's all I have, there were no more logs available from this particular crash. I don't know which JNI method it was, nor which reference was problematic. I thought I should report this anyway, perhaps we can at least improve the logging to include the called method name or other diagnostic information.

piiertho commented 2 months ago

We will add debug symbols for debug builds in an upcoming release.
Which version are you using? We can still disassemble the binary and find where it crashes using the address.

MartinHaeusler commented 2 months ago

@piiertho I'm currently on 0.8.1-4.2.0. Hoping for a new version that targets Godot 4.2.2 soon :)

CedNaru commented 2 months ago

It at least tells us it's related to deleting a global ref from the C++ side, and since it happens rarely and randomly, it points to a race condition. Most likely, it's the issue I talked about here: https://github.com/utopia-rise/godot-kotlin-jvm/issues/616

MartinHaeusler commented 2 months ago

Got another one...

FATAL ERROR in native method: Bad global or local ref passed to JNI
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x9b28bf]  ReportJNIFatalError+0x2f
V  [libjvm.so+0x9bbb70]  checked_jni_NewWeakGlobalRef+0x140

This is quite concerning to me because it seems totally non-deterministic and I can't do anything about it. During play testing it's a minor annoyance, but for a release candidate that's really bad.

It could also be that I made some coding error, but at the moment I can't really tell because I have no diagnostic information.

CedNaru commented 2 months ago

It's very unlikely to be your mistake; user code is not supposed to have any influence on how the module handles JNI refs. It's random because it's probably a race condition between the Godot main thread and the MemoryManager thread we use. I have a few ideas about its origin.
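
To make the suspected failure mode concrete, here is a minimal C++ sketch of one way to keep two threads from releasing the same reference twice. It is not the module's code: `FakeRefTable` and `GuardedRef` are hypothetical names, the table only imitates the bookkeeping a JVM does for global refs, and no real JNI call is made. The two `release()` calls run sequentially so the result is deterministic; the atomic exchange is what would give the same outcome if they came from the main thread and the MemoryManager thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>

// Hypothetical stand-in for a JVM global-ref table: it only tracks live handles.
class FakeRefTable {
public:
    int new_ref() {
        std::lock_guard<std::mutex> lock(mutex_);
        int handle = next_++;
        live_.insert(handle);
        return handle;
    }

    // Returns false when the handle is not live (already deleted or never created).
    // A stale handle like this is what the checked JNI layer rejects in the traces.
    bool delete_ref(int handle) {
        std::lock_guard<std::mutex> lock(mutex_);
        return live_.erase(handle) == 1;
    }

    std::size_t live_count() {
        std::lock_guard<std::mutex> lock(mutex_);
        return live_.size();
    }

private:
    std::mutex mutex_;
    std::set<int> live_;
    int next_ = 1;  // 0 is reserved to mean "no handle"
};

// Hypothetical wrapper: the handle is claimed atomically, so at most one caller deletes it.
class GuardedRef {
public:
    explicit GuardedRef(FakeRefTable& table)
        : table_(table), handle_(table.new_ref()) {}

    // Returns true if this call released the handle, false if it was already released.
    bool release() {
        int claimed = handle_.exchange(0);  // only one caller can observe a non-zero value
        if (claimed == 0) {
            return false;
        }
        return table_.delete_ref(claimed);
    }

private:
    FakeRefTable& table_;
    std::atomic<int> handle_;
};

// Without a guard, the second delete of the same handle is rejected by the table.
bool unguarded_double_delete_is_rejected() {
    FakeRefTable table;
    int handle = table.new_ref();
    bool first = table.delete_ref(handle);
    bool second = table.delete_ref(handle);
    return first && !second;
}

// With the guard, two release attempts result in exactly one real deletion.
int count_successful_releases() {
    FakeRefTable table;
    GuardedRef ref(table);
    int successes = 0;
    if (ref.release()) ++successes;
    if (ref.release()) ++successes;
    assert(table.live_count() == 0);
    return successes;
}
```

The design choice here is to make "who owns the delete" a single atomic decision instead of relying on both threads agreeing about the ref's state. Whether this matches the actual cause of the crash is an open question.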

MartinHaeusler commented 2 months ago

@CedNaru I saw your proposal in #618 and for what my opinion is worth, the proposal makes sense to me. Maybe it could also be worth a try to see how it's done for the Godot C# module? After all, in the grand scheme of things, the CLR and the JVM are similar beasts and the CLR has its own GC as well.

Any attempt at fixing this issue would be highly appreciated, as it ultimately is a showstopper. Individual issues I can work around; random crashes are a different story.

CedNaru commented 2 months ago

I already checked how C# does it and actually, the second point of my proposal is to move away from the way C# is managing its memory (because we are currently doing the same things as them).

The way they handle bindings is not right in my opinion. As I explain in the proposal, this model makes the VM (CLR or JVM) the one with the last word on memory. If we were to run an experiment and use C# and Kotlin in the same Godot project, you would end up with a lot of memory leaks because the two VMs would prevent each other from deleting RefCounted instances (if they are bound to both VMs at the same time). We used their code a lot as a reference when starting this project, but we tend to ignore it now that we have more experience with Godot internals. Their module is quite bloated in an inefficient way.

Both are VMs, but the way they do interop is totally different. There are things you can afford to do in C# but can't with the JVM because of JNI performance. Our main performance bottleneck is the number of JNI calls: the more of them we can avoid, the better the performance will be.
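
As a rough illustration of why the number of boundary crossings matters, the following C++ sketch counts crossings for the same work done one item at a time versus in a single batch. Everything in it is an assumption made for illustration: `FakeBridge`, `send_one`, and `send_batch` are hypothetical names, no JNI is involved, and batching is only one generic way to reduce call counts, not necessarily what the module does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the native/JVM boundary; it only counts crossings.
struct FakeBridge {
    std::size_t crossings = 0;
    std::int64_t received_sum = 0;

    // One crossing per item.
    void send_one(std::int64_t id) {
        ++crossings;
        received_sum += id;
    }

    // One crossing for the whole buffer, regardless of its size.
    void send_batch(const std::vector<std::int64_t>& ids) {
        ++crossings;
        for (std::int64_t id : ids) {
            received_sum += id;
        }
    }
};

// Sends every id separately and returns how many crossings that took.
std::size_t crossings_unbatched(const std::vector<std::int64_t>& ids) {
    FakeBridge bridge;
    for (std::int64_t id : ids) {
        bridge.send_one(id);
    }
    return bridge.crossings;
}

// Sends all ids in one call and returns how many crossings that took.
std::size_t crossings_batched(const std::vector<std::int64_t>& ids) {
    FakeBridge bridge;
    bridge.send_batch(ids);
    return bridge.crossings;
}
```

For four ids, the unbatched path crosses four times and the batched path crosses once, while both deliver the same data. The real cost per JNI call is not modeled here.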

CedNaru commented 1 month ago

#626 is an attempt at fixing this issue. Given that it's random and rare, it's hard to figure out the exact cause.

I went over all the code that manages memory between Godot and the JVM and found several parts that will rarely but eventually trigger bugs. So no guarantee, sadly. I'll keep this issue open until the next release so you can confirm the issue is gone.

kostaskougios commented 1 week ago

Here is my stacktrace, hope it helps:

FATAL ERROR in native method: Bad global or local ref passed to JNI
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.dylib+0x444c28]  jniCheck::validate_handle(JavaThread*, _jobject*)+0x68
V  [libjvm.dylib+0x45f0d8]  checked_jni_NewWeakGlobalRef+0xc0
C  [Godot+0x88bf3c]  JavaInstanceWrapper::swap_to_weak_unsafe()+0x34
C  [Godot+0x898f2c]  KotlinBinding::refcount_decremented_unsafe()+0x40
C  [Godot+0x8991ac]  KotlinBindingManager::_instance_binding_reference_callback(void*, void*, unsigned char)+0x24
C  [Godot+0x38377e4]  RefCounted::unreference()+0xc4
C  [Godot+0x1cdf6cc]  Viewport::_process_picking()+0x1b0c
C  [Godot+0x3835d28]  ObjectDB::cleanup()+0xa348
C  [Godot+0x381ce70]  Object::callp(StringName const&, Variant const**, int, Callable::CallError&)+0x19c
C  [Godot+0x1cbade4]  SceneTree::call_group_flagsp(unsigned int, StringName const&, StringName const&, Variant const**, int)+0x604
C  [Godot+0x1cbc050]  SceneTree::physics_process(double)+0xc8
C  [Godot+0x3e2460]  Main::iteration()+0x230
C  [Godot+0x39179c]  OS_MacOS::run()+0x74
C  [Godot+0x3bfc0c]  main+0x130
C  [dyld+0x60e0]  start+0x938

kostaskougios commented 1 week ago

Just had one more crash; hope this stack trace comes in handy:

FATAL ERROR in native method: Bad global or local ref passed to JNI
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.dylib+0x444c28]  jniCheck::validate_handle(JavaThread*, _jobject*)+0x68
V  [libjvm.dylib+0x45f0d8]  checked_jni_NewWeakGlobalRef+0xc0
C  [Godot+0x88bf3c]  JavaInstanceWrapper::swap_to_weak_unsafe()+0x34
C  [Godot+0x898f2c]  KotlinBinding::refcount_decremented_unsafe()+0x40
C  [Godot+0x8991ac]  KotlinBindingManager::_instance_binding_reference_callback(void*, void*, unsigned char)+0x24
C  [Godot+0x38377e4]  RefCounted::unreference()+0xc4
C  [Godot+0x1cec258]  Viewport::push_input(Ref<InputEvent> const&, bool)+0x34c
C  [Godot+0x1d21dcc]  Window::_window_input(Ref<InputEvent> const&)+0x2d8
C  [Godot+0x34d54a8]  Input::~Input()+0x661c
C  [Godot+0x398678]  DisplayServerMacOS::_dispatch_input_event(Ref<InputEvent> const&)+0x508
C  [Godot+0x3983c8]  DisplayServerMacOS::_dispatch_input_event(Ref<InputEvent> const&)+0x258
C  [Godot+0x34ca098]  Input::_parse_input_event_impl(Ref<InputEvent> const&, bool)+0x10cc
C  [Godot+0x34c78bc]  Input::flush_buffered_events()+0x88
C  [Godot+0x3b0328]  DisplayServerMacOS::process_events()+0x28c
C  [Godot+0x391790]  OS_MacOS::run()+0x68
C  [Godot+0x3bfc0c]  main+0x130
C  [dyld+0x60e0]  start+0x938

kostaskougios commented 1 week ago

one more:

FATAL ERROR in native method: Bad global or local ref passed to JNI
    at godot.core.memory.MemoryManager$MemoryBridge.bindInstance(Native Method)
    at godot.core.memory.MemoryManager.bindNewObjects(MemoryManager.kt:235)
    at godot.core.memory.MemoryManager.manageMemory(MemoryManager.kt:213)
    at godot.core.memory.MemoryManager.run(MemoryManager.kt:205)
    at godot.core.memory.MemoryManager$$Lambda/0x00000f0000095688.run(Unknown Source)
    at java.util.concurrent.Executors$RunnableAdapter.call(java.base@22.0.1/Executors.java:572)
    at java.util.concurrent.FutureTask.run(java.base@22.0.1/FutureTask.java:317)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(java.base@22.0.1/ScheduledThreadPoolExecutor.java:304)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(java.base@22.0.1/ThreadPoolExecutor.java:1144)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(java.base@22.0.1/ThreadPoolExecutor.java:642)
    at java.lang.Thread.runWith(java.base@22.0.1/Thread.java:1583)
    at java.lang.Thread.run(java.base@22.0.1/Thread.java:1570)