Open LemonadeFlashbang opened 11 months ago
Can you try compiling a debug export template and using it when exporting the game in debug mode? It may provide you with some more useful information.
What is in your custom build (i.e. what changes are there versus vanilla Godot?)?
How was it compiled, what SCons args did you use? Are you compiling in any SSE / special instructions that might not be present on these CPUs? Any shared libraries that might not be present?
Usually different CPUs do not cause crashes. GPUs on the other hand often do, especially if the game is trying to use integrated graphics etc. Are you using custom shaders?
Agree with @rsubtil that compiling debug export template and running on the offending hardware will probably be quickest to indicate the bug, as bisecting by detective work can be quite tricky.
UPDATE: I also noticed on Page 3 of the thread there is a quote with a shader error:
SHADER ERROR: Unknown identifier in expression: scale_texture at: (null) (:153)
If the shader is not compiling this could cause a crash, and this could be silent if some GPUs do not present the error log correctly. In fact we recently merged a change in 3.6 to correct for drivers which output the error log incorrectly:
https://github.com/godotengine/godot/pull/84741
I don't know whether you were able to reject this as the cause of the crash. I would also try running without async shader compilation if shaders are possibly the problem.
@lawnjelly : There are two modifications. One is a set of updates to labels in order to correct improper wrapping in CJK languages and with BBCodes. You can see these changes here. It is also compiled with Spine Runtimes present. Those changes are viewable in the Spine Godot github repo.
I don't believe the core issue is shaders but it's not impossible. There's a method at startup that loads all shaders in a dummy scene then frees them to avoid the shader instancing lag that's otherwise present. The crash is present in builds that still have this method enabled. There are a couple of whiteout shaders that aren't loaded at the start, but the fact that some players are able to progress past character select and only crash when the map is loaded implies to me that if it's a shader issue, it's not consistent.
Edit: The scons args are viewable in the build.sh script in Godot-Spine. It's the default platform=... with custom_modules=yes. There are no other changes going on in my build except for the ones just discussed.
I found an end user complaining of Godot related crashing with an AMD card on Reddit. In case that's helpful- it may not be the same issue since they have a different GPU.
@rsubtil : I'll try compiling a version with debug templates turned on and I'll comment again once I've heard back. Given the nature of the crash, I'm not expecting any new information.
@LemonadeFlashbang Would be nice to see your shader stuff PR'ed to the core repository!
It's not something extensible to other projects or an engine modification. It's a very clumsy "load all the combat particles, plus all the dialogue particles, plus all the ...." function that frees everything after an idle frame. It "solves" shader instancing lag by just moving it to startup.
It's a very hacky, brute-force solution but it works.
Some updates- I believe the root cause is related to VRAM usage, but why it only causes problems on AMD machines remains a mystery to me. Using a debug build did not resolve the problem.
Here's a screenshot of the game's RAM usage on my local computer
The game needs ~350 MB of RAM to function, however we can see that the committed memory, which includes VRAM, is substantially higher (~4 GB).
Now here's a screenshot of the game's RAM usage on another computer, with an AMD Ryzen 5, at startup:
1-2 GB of RAM during the game's launch, and then a crash.
I decided to test the VRAM/Committed RAM hypothesis by running every single texture in the game through VRAM compression. This is not recommended for 2D games in the docs due to the artifacting, and has resulted in some pretty large impacts to the game's visual fidelity. The game's file size increased dramatically, however the game now only uses 2 GB of committed memory.
After that I had a user test. They were able to boot the game- but their task manager recorded the game taking up 2 GB of RAM.
Here's the machine's specs:
And here's the equivalent screenshot on my computer.
Similar memleak issues were reported with ANGLE backend - which backend is the problematic device using?
I'm not sure I understand the question, forgive my ignorance. Google shows me ANGLE is a backend for web apps? This is a desktop application.
Godot has ANGLE support on Windows/macOS since 4.2, although it's only used on old AMD GPUs by default since 4.2.1 (older than GCN 4.0). This was done because OpenGL support in old AMD GPUs is pretty bad on Windows, so running a Direct3D 11 translation layer gives better results in terms of reliability.
Sorry, I'm still a bit confused.
The game was developed in Godot 3.5, which seems like it's before ANGLE's support. What should I ask the users to best answer your question? Where would they find the information?
The GPUs vary. I have one reported instance with an NVIDIA GPU, and another with an AMD GPU. It's just the CPU that all the devices seem to share.
Can you check the Video RAM panel for more details on what objects are filling up the most memory? There might be a clue as to why the usage is so high.
I'm not familiar with your game, but assuming it's essentially a 2D visual novel game, 4 GB of VRAM usage looks unreasonably high. That would also explain the high RAM usage on other setups, since AFAIK regular RAM is used when the VRAM becomes full (especially relevant on systems with integrated graphics where there is no dedicated VRAM).
Ah, I see. It doesn't explain the one device with an NVIDIA 3080 GPU - but that might be it for the others. The game loads everything into relevant DBs (item database, skill database, enemy database, etc.) for access. I'm guessing that's probably loading all the images into VRAM, and these machines might be more budget systems.
There are some 20MB files which are models for the spine animations. Here's one example of such a mesh:
And how it appears in game
Can't really cut that down without deforming the images.
There's a large number of 10MB files, which are files for the ending. Each ending has its own unique art- and there's over 100 of them. An example of one of those files:
All together those probably account for ~1.5 GB of VRAM. 1 GB is probably in the ending art.
The remaining half is just in asset volume: backgrounds, characters, items, skills, VFX textures, etc.
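As a rough cross-check on the estimate above, here is a minimal sketch of how texture memory is usually approximated. It assumes that on-disk file size is not what matters once an image is uploaded: an uncompressed RGBA8 texture takes 4 bytes per pixel, while S3TC block formats take 8 bytes (BC1/DXT1) or 16 bytes (BC3/DXT5) per 4x4 block. The 2048x2048 dimensions are hypothetical and not taken from the game, and driver padding and mipmaps are ignored.

```python
import math

# Bytes per 4x4 block for two common S3TC formats.
BLOCK_BYTES = {"bc1": 8, "bc3": 16}


def texture_bytes(width, height, fmt="rgba8"):
    """Approximate GPU memory for one texture, ignoring padding and mipmaps."""
    if fmt == "rgba8":
        return width * height * 4  # 4 bytes per pixel, uncompressed
    # Block-compressed formats store whole 4x4 blocks, so round up.
    blocks = math.ceil(width / 4) * math.ceil(height / 4)
    return blocks * BLOCK_BYTES[fmt]


def total_mib(sizes, fmt="rgba8"):
    """Sum the estimate over a list of (width, height) pairs, in MiB."""
    return sum(texture_bytes(w, h, fmt) for w, h in sizes) / (1024 * 1024)


# Hypothetical: 100 ending illustrations at 2048x2048 each.
endings = [(2048, 2048)] * 100
print(total_mib(endings, "rgba8"))  # 1600.0
print(total_mib(endings, "bc3"))    # 400.0
```

Plugging in the real image dimensions (rather than file sizes) would show whether the ~1 GB figure for the ending art holds once the images are decoded.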
I'm going to leave the ticket open until I get confirmation from the user with the 3080 that the issue is fixed, after which I'll close this. Until then, I'll start refactoring the database systems to see if there's a way to load some helper characteristics (things like IDs and if the item is in the random pool) without loading the attached image data.
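The refactor described above could look roughly like the following. This is a language-neutral sketch in Python, not the game's actual database code: `ItemRecord`, `fake_loader`, and the `res://` paths are all hypothetical names for illustration. The idea is that lightweight metadata (ID, random-pool flag) stays resident, while the image is only loaded the first time something asks for it.

```python
class ItemRecord:
    """Metadata is always in memory; the image is loaded on first use."""

    def __init__(self, item_id, in_random_pool, icon_path, loader):
        self.item_id = item_id
        self.in_random_pool = in_random_pool
        self.icon_path = icon_path
        self._loader = loader  # callable that turns a path into an image
        self._icon = None

    @property
    def icon(self):
        # Load lazily and cache, so repeated access costs one load.
        if self._icon is None:
            self._icon = self._loader(self.icon_path)
        return self._icon

    def release_icon(self):
        # Drop the cached image so its memory can be reclaimed.
        self._icon = None


loads = []  # records every path the stand-in loader was asked for


def fake_loader(path):
    """Stand-in for a real image loader (hypothetical)."""
    loads.append(path)
    return f"<texture {path}>"


db = {
    "sword": ItemRecord("sword", True, "res://items/sword.png", fake_loader),
    "crown": ItemRecord("crown", False, "res://items/crown.png", fake_loader),
}

# Querying metadata does not touch any image data.
pool = [record.item_id for record in db.values() if record.in_random_pool]
```

With this shape, building the random pool touches no textures, and `release_icon()` gives a hook for freeing images when a scene no longer needs them.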
Compressing everything has allowed the game to work for one user's machine, but not another.
The user who cannot get the game to run has an NVIDIA 3080 GPU, 26 GB Memory, and 10 GB of VRAM. What DID work for this user was running a debug version of the game.
So it's possible there's two separate issues with similar impacts. Leaving this ticket open since we're seeing crashes that aren't purely memory related.
Tested versions
Godot 3.5.3 custom
System information
AMD Ryzen 7, AMD Ryzen 5
Issue description
Summary My game crashes when run on computers with AMD Ryzen processors.
Reducing the filesize, reverting to single threaded loading, and compatibility modes have failed to fix the issue. I believe the root cause is an engine incompatibility with AMD hardware.
Where the game crashes is variable for different computers. Most machines crash when the game is loaded and the title screen appears, but I've heard of a single case where a user crashes when the main map is instanced (after character select).
Reducing the number of objects in the game's initial load will allow it to run on these machines, but it will then crash when trying to instance the next scenes.
Full Context I'm the developer behind Doomsday Paradise. In November, I launched my game- and a user reported the game was crashing on startup. The user sent a list of specs over, which look like the following:
The user was on an older laptop, and eventually switched devices. The new device had no issues. At the time, I thought the issue might be RAM related.
Since then, 3 other users have approached me about game crashes, mostly on game start. Every single one of them is using an AMD Ryzen processor. Specs attached. Here is a discussion thread on the Steam Discussion Boards about the issue.
While most devices have only 8 GB of RAM, one device has 32 GB. The game shouldn't need more than about 300 MB of RAM. The VRAM requirement is higher- a little over 3 GB of VRAM.
In the Discussion Board I've tried my best to isolate the actual source by having some users test modified game builds. Things we've tried:
Compressing every single image in the game, reducing the total size to 600 MB- smaller than the demo version. Notably, they can run the demo but they could not run the compressed version of the game.
Changing the objects loaded at startup. Different machines crash at different points. One machine is able to load through the title screen but crashes when the main map is instanced. One machine can get to the title screen, but only if I don't instance character select. Other machines crash if I try to instance the title screen.
Deleting large swaths of content, adding additional logging to every single startup function, disabling background loading, etc.
Disabling multithreading and using single threaded loading.
Enabling compatibility mode for rendering.
The issue seems to be related to object instancing, but because machines are crashing at different points I can't replicate it. In addition, I don't have a machine with an AMD Ryzen processor- so I'm not easily able to create a repro project.
For purposes of debugging this, I can supply a game key. Alternatively, if anyone can help me figure out what logging / testing steps to take next, I'm happy to perform the relevant diagnostics myself and have a couple users who have been extremely patient and willing to help.
Currently, no issues appear in the logs. Users just experience a sudden termination.
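When a process dies without writing anything useful, one common diagnostic is a checkpoint log that is flushed to disk after every line, so the last entry survives a hard termination. The sketch below shows the idea in Python; `CheckpointLog`, the step names, and the log path are hypothetical, and the same pattern would need translating to the engine's own file API.

```python
import os
import tempfile


class CheckpointLog:
    """Append-only log that is forced to disk after every entry."""

    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8")

    def mark(self, step):
        self._file.write(step + "\n")
        self._file.flush()             # push Python's buffer to the OS
        os.fsync(self._file.fileno())  # ask the OS to write it to disk

    def close(self):
        self._file.close()


def last_checkpoint(path):
    """Return the final non-empty line of the log, or None if it is empty."""
    with open(path, encoding="utf-8") as f:
        lines = [line.strip() for line in f if line.strip()]
    return lines[-1] if lines else None


log_path = os.path.join(tempfile.mkdtemp(), "startup_checkpoints.log")
log = CheckpointLog(log_path)
log.mark("load_item_db:begin")
log.mark("load_item_db:end")
log.mark("instance_title_screen:begin")
# If the process were killed here, the line above would be the last entry.
```

Because crashes happen at different points on different machines, comparing the last checkpoint from each affected user could show whether they share a common step.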
Steps to reproduce
Minimal reproduction project (MRP)
Not available since I can't locally test it. I can try to reproduce a project with the current size profile if necessary.
During the title screen, the following are the game's stats: