MovingBlocks / Terasology

Terasology - open source voxel world
http://terasology.org
Apache License 2.0

Crash on launch with AMDGPU #2615

Open ghost opened 7 years ago

ghost commented 7 years ago

Log at http://pastebin.com/8CzSHUsh

I run Arch Linux x64. The crash occurs on the latest stable and development builds, whether installed via the AUR or the official launcher.

My GPU is an RX480, running the AMDGPU driver, with the additional proprietary blob (AMDGPU-PRO). If it matters, I use i3wm as my window manager. I appear to be using OpenJDK8 as the Java VM, with OpenJFX installed. Everything looks up to date.

If it affects anything, the initialisation screen (with the Terasology logo) appears on startup, and drops straight into the log message. There's just one log available to view.

skaldarnar commented 7 years ago

Hey @DJspy109, thanks for the report.

Unfortunately, I cannot view the log file on pastebin (it has been removed). Could you re-upload it, please, or post it as a code snippet in this issue? In the meantime I'll check it out on my Arch Linux 😉

ghost commented 7 years ago

http://pastebin.com/yR41F9Jp

Here you go, think I might have gotten flagged for spam by pastebin and forgot to do the captcha.

EDIT: Strange, I tried it on another Arch box, and it worked perfectly. This one appears to use radeon as the GPU driver. Which is a shame, because the system it does work on can barely run the game.

skaldarnar commented 7 years ago

As the log states, this is an issue with GPU - although that used to appear on integrated graphics on laptops... Your system should be able to run the game without a hassle, so we should investigate it more.

I have to ask if your GPU drivers are up to date, just to make sure. Java 64-bit is newest, and the game starts nicely on my own Arch machine with OpenJDK.

Pinging @emanuele3d @immortius @msteiger

emanuele3d commented 7 years ago

@DJspy109: the log ending as it does on that line really hints at some kind of problem with the GPU or its drivers. I can confirm that making sure the drivers are up to date has helped in some circumstances, as skaldarnar suggested, but not always.

Are you sure your system is using the RX480 instead of an integrated GPU? If an integrated GPU exists on your motherboard Terasology might be trying to use that one instead of the discrete GPU. The Radeon software should give you the option to override this behavior and always use the discrete GPU when running Terasology. I had to do the same on my (Windows-based) Alienware laptop.

ghost commented 7 years ago

I'm certain I'm running from the discrete GPU; the onboard graphics are disabled from the BIOS, as well as the display being plugged into the RX480. Uninstalling AMDGPU-PRO appears to change nothing, the game is still not launching.

The drivers are definitely up to date now, I just reinstalled to check as well.

The relevant output of lspci -v is at this link: http://pastebin.com/xYKT8JJk

This is the only device showing as VGA compatible controller.

If it helps, I could try doing a quick reinstall to a very basic setup and running it from there, to possibly eliminate any other software issues.

emanuele3d commented 7 years ago

Sure, if you have time and will, please do.

From my perspective the problem is that we have no error message to go with. The line where the log stops is the line where the OpenGL context is created. When the context is created we do some GPU/drivers checks to identify potential issues before we start the game proper, but without the context it's like trying to find something in the dark. So, right now I wouldn't know how to debug it.
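When a process dies without reporting an exception, one generic way to narrow down the failing step is to write a flushed marker immediately before and after each risky call. The sketch below is only an illustration of that idea, not code from Terasology: `Breadcrumbs`, `runStep` and the step name `create-context` are hypothetical names, and the empty lambda stands in for whatever call actually creates the OpenGL context.

```java
import java.io.PrintStream;

public class Breadcrumbs {

    /**
     * Writes a BEGIN marker, flushes it, runs the step, then writes an END marker.
     * If the process dies inside the step, the output ends with the BEGIN line.
     */
    public static void runStep(String name, Runnable step, PrintStream out) {
        out.println("BEGIN " + name);
        out.flush(); // make sure the marker is written before the risky call
        step.run();
        out.println("END " + name);
        out.flush();
    }

    public static void main(String[] args) {
        // Hypothetical usage: the lambda is a placeholder for the context-creation call.
        runStep("create-context", () -> { }, System.err);
    }
}
```

If the output stops at a `BEGIN` line with no matching `END`, the failure happened inside that step, even when no stack trace was produced.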

ghost commented 7 years ago

Reinstalled Arch in a separate partition, and the issue occurs. Still using AMDGPU, and it fails in both i3 and KDE, so I don't think there's an issue there. This is the only log in the logs directory under ~/.local/share/terasology/logs. Running from the terminal produces the same logs, but inside the terminal itself. lspci -v produces the same output as before.

It also appears that AMDGPU is the only driver with support for Polaris cards, other than (of course) AMDGPU-PRO. The machine that Terasology does run on actually has TeraScale 1 on an integrated GPU, and uses the ATI driver.

I wonder if it's possible for anyone else with a Polaris card under Linux to confirm whether it's a driver issue or if I just have a faulty card?

I might try loading Kubuntu in a bit, to see if Arch adds to the problem.

Taose commented 7 years ago

@emanuele3d would it be worth going through the code and adding in more relevant checks and outputs for the errors, submitting it to the dev version and then rapid testing that? It's not as though we need the game up and running, just getting through the basics.

Speaking of which...does the dev version have the same error?

emanuele3d commented 7 years ago

@Feralmouse: the problem lies in what "relevant checks and output for errors" means. If you look at LwjglGraphics.initDisplay() there are plenty of opportunities/checks for the code to generate an exception and report it in the output. The problem is: it fails silently and I don't know why.

I will double-check what would be the next line of output to see if there is any other part of the code that can fail silently before that line.

Regarding the develop branch, the version we use for development, I'm pretty sure the problem is there too: the area of the code where the problem is likely to be is fairly stable and we don't touch it much. So, I doubt there is much difference between the develop branch and the master branch used for distribution.

Sorry I can't be more positive at this stage.

ghost commented 7 years ago

Yep, can confirm. Both stable and development builds crash on load, through the official launcher or via the AUR. Looking at the code, it appears (to my very basic Java skills) that the failure occurs at line 221 in LwjglGraphics.java, as XTitle doesn't report the window title having changed due to line 223. So it's occurring before all those try/catches.

Changing debugEnabled in the config (~/.terasology/config.cfg) to true doesn't appear to do anything at all. I'm not sure if you could really add anything else to that to alter output, other than possibly a try/catch around that line to check if that definitely is the problem. The config file that this is calling for states "displayModeSetting": "${engine:menu#video-windowed)",; I don't know if this is a proper config or not. Editing this doesn't appear to change the outcome of launching.

emanuele3d commented 7 years ago

Thank you for suggesting an area for further debugging @DJspy109. That area however is within a try/catch block, so I don't understand why something in there can fail without triggering an output-producing exception.

Pinging @immortius, @flo and @msteiger for ideas.

Meanwhile (but this would be going the extra light-year rather than the extra mile) one thing you could do is to get our developer setup up and running on your machine and use an IDE (e.g. IntelliJ or Eclipse) to find the exact line where everything goes pear-shaped.

skaldarnar commented 7 years ago

@emanuele3d indeed, that's interesting. setDisplayModeSettings(…) again contains a try-catch block which should, at least, show the occurring error. It seems like there is some hard crash escaping Java, so the catch blocks are never reached.

@DJspy109 If you could check out the issue when run from source as suggested that is highly appreciated. Don't hesitate to ask questions here (or on IRC, Forum) if you run into problems with the setup ;-)
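To illustrate how a failure can slip past catch blocks, here is a minimal Java sketch. It is not taken from LwjglGraphics, and `SilentFailureDemo` and `runGuarded` are hypothetical names. It shows that `catch (Exception e)` does not intercept a `java.lang.Error` (for example a `LinkageError` raised while loading a native library), whereas `catch (Throwable t)` does. A crash inside native driver code is a different case again: it terminates the JVM process itself, so no Java catch block runs at all. HotSpot-based JVMs usually write an `hs_err_pid<pid>.log` file to the working directory in that situation, which may be worth looking for.

```java
public class SilentFailureDemo {

    /** Runs the step and reports which handler, if any, saw the failure. */
    public static String runGuarded(Runnable step) {
        try {
            try {
                step.run();
                return "completed";
            } catch (Exception e) {
                // Typical handler: sees checked and runtime exceptions only.
                return "caught Exception: " + e.getClass().getSimpleName();
            }
        } catch (Throwable t) {
            // Outer safety net: also sees Errors such as LinkageError.
            return "caught Throwable: " + t.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // An ordinary runtime exception is handled by the inner catch block.
        System.out.println(runGuarded(() -> { throw new IllegalStateException("bad display mode"); }));
        // An Error bypasses catch (Exception) and is only seen by catch (Throwable).
        System.out.println(runGuarded(() -> { throw new LinkageError("native library failed to load"); }));
    }
}
```

If temporarily widening a handler to `Throwable` around the suspect call still produces no output, that would be consistent with the hard-crash theory rather than an uncaught Java error.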

Cervator commented 7 years ago

Thanks for the report and keeping at it @DJspy109 :-)

FYI the debugEnabled you found in config.cfg is a debug overlay rendered in-game that provides a bunch of extra info about the running game. Doesn't affect logging and doesn't help you much since you can't get to that point.

Log level is changed elsewhere but indeed would rely on there being something more to log in the first place.

ghost commented 7 years ago

Sorry for taking so long to respond, I've had a really busy week.

Assuming I'm running it correctly from source (via ./gradlew game), the issue is still present. No other information is given in the terminal than what was already given from the Error Log.

Running from a Kubuntu LiveCD reveals yet again the same problem, so I think there's definitely some kind of error between LWJGL and AMDGPU.

benneti commented 7 years ago

Can confirm that it's not your card, I have the same problem on amdgpu + RX480. Would it help if I get my logs, too?

Cervator commented 7 years ago

Heya @benneti - it wouldn't hurt to have more logs, would appreciate it, even if it might end up being the same thing. More knowledge and confirmation is a good thing :-)

benneti commented 7 years ago

pastebin: this is what happens in the terminal... Is there a log file, too? I also just noticed that it might be a problem related to the (Steam) controller being enabled. I use SC Controller (github link). When I disable the controller (that is, do not start the userspace driver), the game launches without an issue. With the controller enabled, it shows the start menu, and after I click on start the music stops and nothing happens. After this I killed the driver server, and the game exited, too. Maybe this is a different issue. Thanks for your effort, I would love to try the game with the controller.

ghost commented 7 years ago

Looks to be a different issue to me. I don't have a steam controller, and it appears that yours crashes much later in the process.

By the way, my issue still occurs, with the latest mesa and everything.

benneti commented 7 years ago

OK, I should probably open a new issue. The strange thing is that, for the graphics part, I should have a very similar setup to yours, as I am on Arch, too. The only thing which is different from the repos is my kernel.

Cervator commented 7 years ago

Very interesting interaction problem. Might be tricky to replicate elsewhere (know anybody else with Arch and a Steam Controller?), but it would be good to document separately. Appreciate the info - if you're curious to dig deeper we'd be happy to have more yet :-)

benneti commented 7 years ago

I can try it on my laptop, too. But I am kinda busy right now, I think in two weeks it should not be a problem to try a few different things and create a new issue with more details. :)