godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
88.67k stars 20.11k forks source link

Dev assert on editor exit: 1 shaders of type ParticlesShaderRD were never freed #93344

Open akien-mga opened 2 months ago

akien-mga commented 2 months ago

Tested versions

System information

Fedora Linux 40 (KDE Plasma) - Wayland - Vulkan (Forward+) - dedicated AMD Radeon RX 7600M XT (RADV NAVI33) - AMD Ryzen 7 7840HS w/ Radeon 780M Graphics (16 Threads)

Issue description

Upon closing the editor with a medium sized project using GPUParticles (W4's Legend of the Nuku Warriors demo, not public yet), I get this dev assert (in a dev_build=yes build):

ERROR: 1 shaders of type ParticlesShaderRD were never freed
   at: ~ShaderRD (./servers/rendering/renderer_rd/shader_rd.cpp:845)
ERROR: FATAL: DEV_ASSERT failed  "_first == nullptr" is false.
   at: ~List (./core/templates/self_list.h:168)

Thread 1 "godot-git" received signal SIGILL, Illegal instruction.
0x000000000a083309 in SelfList<RendererRD::MaterialStorage::Material>::List::~List (this=0xbffd6e8, __in_chrg=<optimized out>) at ./core/templates/self_list.h:168
168                             DEV_ASSERT(_first == nullptr);
Missing separate debuginfos, use: dnf debuginfo-install alsa-lib-1.2.12-1.fc40.x86_64 bzip2-libs-1.0.8-18.fc40.x86_64 dbus-libs-1.14.10-3.fc40.x86_64 elfutils-libelf-0.191-4.fc40.x86_64 expat-2.6.2-1.fc40.x86_64 flac-libs-1.4.3-4.fc40.x86_64 fontconfig-2.15.0-6.fc40.x86_64 freetype-2.13.2-5.fc40.x86_64 glib2-2.80.2-1.fc40.x86_64 glibc-2.39-15.fc40.x86_64 graphite2-1.3.14-15.fc40.x86_64 gsm-1.0.22-6.fc40.x86_64 harfbuzz-8.4.0-1.fc40.x86_64 lame-libs-3.100-17.fc40.x86_64 libX11-1.8.9-1.fc40.x86_64 libX11-xcb-1.8.9-1.fc40.x86_64 libXau-1.0.11-6.fc40.x86_64 libXcursor-1.2.1-7.fc40.x86_64 libXext-1.3.6-1.fc40.x86_64 libXfixes-6.0.1-3.fc40.x86_64 libXi-1.8.1-5.fc40.x86_64 libXinerama-1.1.5-6.fc40.x86_64 libXrender-0.9.11-6.fc40.x86_64 libbrotli-1.1.0-3.fc40.x86_64 libcap-2.69-8.fc40.x86_64 libdrm-2.4.121-1.fc40.x86_64 libffi-3.4.4-7.fc40.x86_64 libpng-1.6.40-3.fc40.x86_64 libsndfile-1.2.2-2.fc40.x86_64 libstdc++-14.1.1-5.fc40.x86_64 libvorbis-1.3.7-10.fc40.x86_64 libwayland-client-1.22.0-3.fc40.x86_64 libxcb-1.17.0-1.fc40.x86_64 libxkbcommon-1.6.0-2.fc40.x86_64 libxml2-2.12.8-1.fc40.x86_64 libzstd-1.5.6-1.fc40.x86_64 mesa-vulkan-drivers-24.1.1-5.fc40.x86_64 mpg123-libs-1.31.3-4.fc40.x86_64 opus-1.5.1-1.fc40.x86_64 pcre2-10.42-2.fc40.2.x86_64 systemd-libs-255.7-1.fc40.x86_64 vulkan-loader-1.3.283.0-2.fc40.x86_64 xz-libs-5.4.6-3.fc40.x86_64 zlib-ng-compat-2.1.6-5.fc40.x86_64
(gdb) bt
#0  0x000000000a083309 in SelfList<RendererRD::MaterialStorage::Material>::List::~List (this=0xbffd6e8, __in_chrg=<optimized out>) at ./core/templates/self_list.h:168
#1  0x000000000a044757 in RendererRD::MaterialStorage::~MaterialStorage (this=0xbffd470, __in_chrg=<optimized out>) at ./servers/rendering/renderer_rd/storage_rd/material_storage.cpp:1187
#2  0x0000000009f497f2 in memdelete<RendererRD::MaterialStorage> (p_class=0xbffd470) at ./core/os/memory.h:116
#3  0x0000000009f2feee in RendererCompositorRD::finalize (this=0xbce0060) at ./servers/rendering/renderer_rd/renderer_compositor_rd.cpp:157
#4  0x0000000009d9029d in RenderingServerDefault::_finish (this=0xbfb4020) at ./servers/rendering/rendering_server_default.cpp:241
#5  0x0000000009d90589 in RenderingServerDefault::finish (this=0xbfb4020) at ./servers/rendering/rendering_server_default.cpp:273
#6  0x0000000005bce261 in finalize_display () at main/main.cpp:347
#7  0x0000000005beb2b6 in Main::cleanup (p_force=false) at main/main.cpp:4319
#8  0x0000000005b2b011 in main (argc=2, argv=0x7fffffffd6f8) at platform/linuxbsd/godot_linuxbsd.cpp:89

Steps to reproduce

TBD, load project that creates a ParticlesShaderRD in the editor, and close the editor.

Minimal reproduction project (MRP)

Not tried to extract one yet, but @clayjohn @RandomShaper @DarioSamo have access to the project where I can reproduce this.

clayjohn commented 2 months ago

This is likely another multi threaded loading issue.

Setting threading/worker_pool/max_threads to 1 should be enough to avoid the assert, but it will also increase loading time significantly and may increase frame time as well.

akien-mga commented 2 months ago

I think I tried that and it didn't resolved the assert. But I'll test again tomorrow to confirm. Note that this happens in the editor, not at runtime

Edit: Tested, I confirmed that even with threading/worker_pool/max_threads set to 1, the editor crashes on exit.

RandomShaper commented 2 months ago

The list that contains an element when it shouldn't (at engine shutdown) is MaterialStorage::material_update_list. In normal circumstances, at that point all the materials would have been unregistered so it would be empty. That's true even if the last frame before shutdown adds some materials to that list because of broken dependency chains (e.g., a texture is released so a uniform set is freed so a callback is run to update the material). If that happens (which is only natural), the updates are added to the list, but, even if the list is not processed anymore, as the materials are eventually destroyed, they are removed from the list.

However, in this project, by the time MaterialStorage is being destroyed (quite later in engine shutdown), there are still some cached resources around, including a scene (res://scenes/combat/ui/damage_number.tscn) that happens to contain a ParticlesProcessMaterial subresource. That resource happens to be the only one in the material_update_list. It seems that the project (maybe some of its plugins) is creating a circle of references preventing certain resources from being released.

It's interesting that, in a case like this, a vicious circle of dependencies is not only causing memory leaks but an ill situation where the material exists at a point where it shoudn't be possible and even scripts are kept in memory once the GDScript engine has been terminated.

The error message itself could be silenced by explicitly clearing the list before its destructor runs. However, being a dev-only check, I believe it's better not to fight the warning, as that'd be fighting the symptom without curing the illness. That said, it depends on what's the rule here: if the engine is supposed to allow projects to leak resources gracefully, then the explicit clear should be added.