LWJGL / lwjgl3

LWJGL is a Java library that enables cross-platform access to popular native APIs useful in the development of graphics (OpenGL, Vulkan, bgfx), audio (OpenAL, Opus), parallel computing (OpenCL, CUDA) and XR (OpenVR, LibOVR, OpenXR) applications.
https://www.lwjgl.org
BSD 3-Clause "New" or "Revised" License

Crash running a minecraft modpack on Linux #822

Open anagram3k opened 1 year ago

anagram3k commented 1 year ago

Version

3.2.2

Platform

Linux x64

JDK

Oracle JDK 17.0.4.1+1

Module

LWJGL Core

Bug description

It does not happen every time and usually if you try it again a couple of times or after rebooting, the error goes away. The core dump is not generated, but I was able to generate a backtrace using gdb.

Error log: hs_err_pid116949.log

Stacktrace or crash log output

#0  __GI___libc_read (nbytes=16, buf=0x7f1be5ec79f0, fd=0) at ../sysdeps/unix/sysv/linux/read.c:26
        sc_ret = -512
        sc_cancel_oldtype = 0
        __arg3 = <optimized out>
        _a2 = <optimized out>
        sc_ret = <optimized out>
        __value = <optimized out>
        sc_ret = <optimized out>
        __arg1 = <optimized out>
        _a3 = <optimized out>
        resultvar = <optimized out>
        __arg2 = <optimized out>
        _a1 = <optimized out>
#1  __GI___libc_read (fd=0, buf=0x7f1be5ec79f0, nbytes=16) at ../sysdeps/unix/sysv/linux/read.c:24
No locals.
#2  0x00007f1be6ca51b1 in ?? () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
No symbol table info available.
#3  0x00007f1be6f2848c in ?? () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
No symbol table info available.
#4  0x00007f1be6f8f500 in ?? () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
No symbol table info available.
#5  0x00007f1be6e09b2e in JVM_handle_linux_signal () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
No symbol table info available.
#6  <signal handler called>
No locals.
--Type <RET> for more, q to quit, c to continue without paging--c
#7  0x00007f1b0c61674b in ?? () from /home/luis/.ftba/bin/versions/fabric-loader-1.18.2-0.14.9/fabric-loader-1.18.2-0.14.9-natives-37811923637716/libglfw.so
No symbol table info available.
#8  0x00007f1b0c60eaa3 in glfwCreateWindow () from /home/luis/.ftba/bin/versions/fabric-loader-1.18.2-0.14.9/fabric-loader-1.18.2-0.14.9-natives-37811923637716/libglfw.so
No symbol table info available.
#9  0x00007f1bc93de97a in ?? ()
No symbol table info available.
#10 0x0000000000000000 in ?? ()
No symbol table info available.
(gdb) backtrace
#0  __GI___libc_read (nbytes=16, buf=0x7f1be5ec79f0, fd=0) at ../sysdeps/unix/sysv/linux/read.c:26
#1  __GI___libc_read (fd=0, buf=0x7f1be5ec79f0, nbytes=16) at ../sysdeps/unix/sysv/linux/read.c:24
#2  0x00007f1be6ca51b1 in ?? () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#3  0x00007f1be6f2848c in ?? () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#4  0x00007f1be6f8f500 in ?? () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#5  0x00007f1be6e09b2e in JVM_handle_linux_signal () from /usr/lib/jvm/java-17-openjdk-amd64/lib/server/libjvm.so
#6  <signal handler called>
#7  0x00007f1b0c61674b in ?? ()
   from /home/luis/.ftba/bin/versions/fabric-loader-1.18.2-0.14.9/fabric-loader-1.18.2-0.14.9-natives-37811923637716/libglfw.so
#8  0x00007f1b0c60eaa3 in glfwCreateWindow ()
   from /home/luis/.ftba/bin/versions/fabric-loader-1.18.2-0.14.9/fabric-loader-1.18.2-0.14.9-natives-37811923637716/libglfw.so
#9  0x00007f1bc93de97a in ?? ()
#10 0x0000000000000000 in ?? ()

NekoCaffeine commented 1 year ago

Maybe it has something to do with https://github.com/glfw/glfw/pull/2024

anagram3k commented 1 year ago

Running with LWJGL version 3.3.1 SNAPSHOT, looks like the problem is still present:

[18:15:22] [Render thread/INFO]: Backend library: LWJGL version 3.3.1 SNAPSHOT
[18:15:22] [Render thread/INFO]: Loaded client.properties
*** stack smashing detected ***: terminated
Process crashed with exitcode 6.

NekoCaffeine commented 1 year ago

Did you replace the jar file of lwjgl-glfw-native?

NekoCaffeine commented 1 year ago

Or can you attach a debugger to see the result of glfwGetVersion?

anagram3k commented 1 year ago

the jar file of lwjgl-glfw-native?

I use the MultiMC option to change the library; it is using these files:

https://libraries.minecraft.net/org/lwjgl/lwjgl-glfw/3.3.1/lwjgl-glfw-3.3.1-natives-linux.jar
https://libraries.minecraft.net/org/lwjgl/lwjgl-glfw/3.3.1/lwjgl-glfw-3.3.1.jar
https://libraries.minecraft.net/org/lwjgl/lwjgl-jemalloc/3.3.1/lwjgl-jemalloc-3.3.1-natives-linux.jar
https://libraries.minecraft.net/org/lwjgl/lwjgl-jemalloc/3.3.1/lwjgl-jemalloc-3.3.1.jar
https://libraries.minecraft.net/org/lwjgl/lwjgl/3.3.1/lwjgl-3.3.1-natives-linux.jar
https://libraries.minecraft.net/org/lwjgl/lwjgl-openal/3.3.1/lwjgl-openal-3.3.1-natives-linux.jar
https://libraries.minecraft.net/org/lwjgl/lwjgl-openal/3.3.1/lwjgl-openal-3.3.1.jar
https://libraries.minecraft.net/org/lwjgl/lwjgl-opengl/3.3.1/lwjgl-opengl-3.3.1-natives-linux.jar
https://libraries.minecraft.net/org/lwjgl/lwjgl-opengl/3.3.1/lwjgl-opengl-3.3.1.jar
https://libraries.minecraft.net/org/lwjgl/lwjgl-stb/3.3.1/lwjgl-stb-3.3.1-natives-linux.jar
https://libraries.minecraft.net/org/lwjgl/lwjgl-stb/3.3.1/lwjgl-stb-3.3.1.jar
https://libraries.minecraft.net/org/lwjgl/lwjgl-tinyfd/3.3.1/lwjgl-tinyfd-3.3.1-natives-linux.jar
https://libraries.minecraft.net/org/lwjgl/lwjgl-tinyfd/3.3.1/lwjgl-tinyfd-3.3.1.jar
https://libraries.minecraft.net/org/lwjgl/lwjgl/3.3.1/lwjgl-3.3.1.jar

Do you know where I can find the native libraries compiled with debugging information (-g option on GCC)? It will help a lot.

SWinxy commented 1 year ago

LWJGL-CI/glfw has the build stuff. Could you use a debugger to see what is being passed to glfwCreateWindow()?

Spasi commented 1 year ago

Hey @anagram3k,

Issues like this are impossible to diagnose from our side. Also, it's almost certainly not a bug in LWJGL, but rather one or more of the following:

  1. A bug in fabric or whatever Minecraft modding is going on
  2. A bug in Minecraft
  3. A random driver/software/hardware issue on user's machine
  4. a bug in GLFW

In any case, since it happens on your machine (and not even consistently), you're the best (and maybe only) person able to diagnose it. The best approach would be creating an MCVE that somehow reproduces it. E.g. does it happen when creating a simple GLFW window without all the Minecraft stuff around it?

anagram3k commented 1 year ago

Hey @Spasi, sorry about the delay in responding. I haven't had much time to look into this.

So far I have discovered that the bug is related to the Iris mod: it doesn't happen without this mod, but I was unable to reproduce the problem with only this mod installed.

To trigger the bug, I have to run the full All of Fabric 5 modpack with the "-XX:+ShowMessageBoxOnError" JVM argument so that I have time to attach gdb.

I found a way to run the modpack from the command line, using LWJGL compiled with debug information, but it provided no extra information.
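
For anyone who wants to repeat that setup, here is a minimal sketch of launching a JVM with the flag and printing the PID to attach gdb to. The class name DebugLaunch, the buildCommand helper, and the classpath and main class arguments are hypothetical placeholders; the real values depend on how your launcher starts the game, and the sketch assumes a "java" binary on the PATH.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of LWJGL or any launcher.
final class DebugLaunch {

    /** Builds a java command line with the HotSpot flag placed before the classpath and main class. */
    static List<String> buildCommand(String javaBinary, String classpath, String mainClass) {
        List<String> command = new ArrayList<>();
        command.add(javaBinary);
        // Makes the JVM pause on a fatal error instead of exiting right away,
        // which leaves time to attach a native debugger.
        command.add("-XX:+ShowMessageBoxOnError");
        command.add("-cp");
        command.add(classpath);
        command.add(mainClass);
        return command;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java DebugLaunch.java <classpath> <main-class>");
            return;
        }
        // Assumption: "java" resolves to the JDK you want to test with.
        List<String> command = buildCommand("java", args[0], args[1]);
        // inheritIO lets the child JVM use this terminal for its prompt.
        Process process = new ProcessBuilder(command).inheritIO().start();
        System.out.println("JVM started with PID " + process.pid()
                + "; attach with: gdb -p " + process.pid());
        System.exit(process.waitFor());
    }
}
```

Once the crash happens and the JVM is waiting, running "gdb -p" with the printed PID and then "backtrace" should give output similar to the trace in the first post.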

You said there are possible causes of this problem:

  1. A bug in fabric or whatever Minecraft modding is going on

Yes, there is probably a bug in the Iris mod, but this shouldn't make Java segfault.

  1. A bug in Minecraft

Also possible, but this also shouldn't produce a segfault.

  1. A random driver/software/hardware issue on user's machine

This is possible, but not probable. I play a lot of games and the system is stable and has no problems.

  1. a bug in GLFW

I think a bug in a Java program cannot cause a segfault if no native code is used. Since the only native code involved resides in GLFW, it is the most probable cause. Am I wrong in this assumption?

Anyway, if I can't produce an MCVE or better information on this issue, I understand there is nothing you can do.

EDIT: I edited this post to make it clearer.

Spasi commented 1 year ago

Hey @anagram3k,

Yes, the assumption that JVM applications that use native libraries cannot cause crashes is wrong.

There are multiple different ways to crash the JVM using LWJGL APIs. This is not a problem with the JVM or LWJGL; it's inherent to the API design of most native libraries. They make crashes possible, so wrong API usage causes crashes. LWJGL is not designed to protect against such misuse. It's a low-level library providing access to multiple different APIs, some of them massive. Even if exposing a safe API weren't an impossibly complex effort, the result would not be a binding library, but something much higher level.

Astralchroma commented 1 year ago

Maybe it has something to do with glfw/glfw#2024

This seems to be the problem.

The best solution is to use the "Use system installation of GLFW" option in MultiMC (or one of its forks). This will not work for everyone, as some distributions ship versions of GLFW that are too old. To fix this, create a jar file containing an updated libglfw.so file and then add it as a jar mod in MultiMC (or a fork) by using the "Add to Minecraft.jar" option in the "Version" tab.
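
If you don't have a jar tool at hand, the jar can be created with a few lines of Java. This is only a sketch: the class name GlfwJarBuilder and the writeJar helper are made up for illustration, and the entry name inside the jar is an assumption. Check the layout of the natives jar your instance already uses and pass the matching path.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

// Hypothetical helper, not part of MultiMC or LWJGL.
final class GlfwJarBuilder {

    /** Writes a jar at the given path containing the library under entryName. */
    static void writeJar(Path library, String entryName, Path jar) throws IOException {
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jarOut = new JarOutputStream(out)) {
            jarOut.putNextEntry(new JarEntry(entryName));
            // Streams the shared library's bytes into the current jar entry.
            Files.copy(library, jarOut);
            jarOut.closeEntry();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: java GlfwJarBuilder.java <libglfw.so> <entry-name> <output.jar>");
            return;
        }
        // Assumption: entry-name matches where your LWJGL version looks for
        // the library, for example "libglfw.so" at the jar root.
        writeJar(Path.of(args[0]), args[1], Path.of(args[2]));
    }
}
```

The resulting file is the one to pick in the "Add to Minecraft.jar" dialog.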

anagram3k commented 1 year ago

@Peter-Crawley I have already tried other versions of GLFW and it didn't fix the problem. The most likely source of the problem is the Iris mod. You can test without this mod on your machine and see if the problem is fixed.

Astralchroma commented 1 year ago

I did test with Iris disabled and the issue persisted. Is it possible you are experiencing a similar but different issue?

Regardless, I am not really interested in debugging this further. I have a workaround that works for me and for a friend who was experiencing the same issue, and that's good enough for me. I just listed our solution in the hope of helping others.

Frontear commented 1 year ago

This issue is GLFW-specific. Try building the latest GLFW library and linking it to your MC via -Dorg.lwjgl.glfw.libname=/usr/lib/libglfw.so (or wherever you save it after building). So far that has fixed the issue for everyone I've seen.
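
To confirm that the option actually reaches the JVM, a small check like the one below can help. It is a sketch only: the class name GlfwOverrideCheck and the describe helper are invented for illustration, and it just reads the system property and checks that the file exists. It does not load the library or call into LWJGL.

```java
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical diagnostic, not part of LWJGL.
final class GlfwOverrideCheck {

    static final String PROPERTY = "org.lwjgl.glfw.libname";

    /** Returns a human-readable summary of the override value (null means not set). */
    static String describe(String value) {
        if (value == null || value.isBlank()) {
            return PROPERTY + " is not set, so no custom GLFW library was requested";
        }
        Path path = Path.of(value);
        // Only absolute paths can be verified here; a bare library name is
        // resolved later by the loader.
        if (path.isAbsolute() && !Files.isRegularFile(path)) {
            return PROPERTY + " points to a missing file: " + value;
        }
        return PROPERTY + " = " + value;
    }

    public static void main(String[] args) {
        System.out.println(describe(System.getProperty(PROPERTY)));
    }
}
```

Run it with the same -D argument you pass to the game, for example "java -Dorg.lwjgl.glfw.libname=/usr/lib/libglfw.so GlfwOverrideCheck.java".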

Kichura commented 1 year ago

Update to LWJGL 3.3.1/3.3.2 and see if 1.18.2 - 1.19.4 is able to run properly.