godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License

LightmapGI baking crashes on higher quality settings when using lights with a size larger than 0 #91837

Open BernhardMt opened 5 months ago

BernhardMt commented 5 months ago

Tested versions

4.2.stable, 4.3-dev6

System information

Godot v4.2.stable - Windows 10.0.19045 - Vulkan (Forward+) - dedicated NVIDIA GeForce GTX 1080 (NVIDIA; 31.0.15.5222) - AMD Ryzen 7 2700X Eight-Core Processor (16 Threads)

Issue description

I discovered this bug in multiple scenes in a larger project where I could only bake the scene on medium quality settings, whereas Godot would crash when setting the quality of the LightmapGI to high (or increasing the ray count in the project settings above a certain level). Disabling some of the static objects in the scene allowed the scene to bake on higher settings, so that also seems to have some effect on it.

The baking process gets stuck at "Integrate indirect lighting" before the percentage number appears. The progress box in the center gets stretched for several seconds before Godot crashes completely or becomes unresponsive. During this, the console prints an endless stream of repeating errors:

[Screenshot: godot_lightmapper_crash3JPG]

The problem seems to be that if a scene has a certain number of lights with a light size > 0, the lightmapper will crash. I guess that this limit depends on the complexity of the scene and the hardware (on my PC, I was able to bake on High settings with up to around 10 lights in the MRP). It also doesn't seem to be an issue with GPU memory: in the MRP, the usage never goes above 1.6 GB.

Steps to reproduce

  1. Open the attached project
  2. Open test-scene.tscn
  3. Select the LightmapGI node
  4. Bake the lightmaps on High or Ultra Settings
  5. (Optional) If the baking process does not fail, duplicate some of the lights (as described above, the limit is probably different on different systems)

Minimal reproduction project (MRP)

LightmapGI_MRP.zip

Calinou commented 5 months ago

I can't reproduce this on 4.3.dev 4971b7189 (Linux). VRAM utilization never goes above 5 GB on a GPU with 24 GB of VRAM (yours has 8 GB), so the issue must be elsewhere.

For reference, I have OIDN 2.2.2 set up (as the MRP is configured to use OIDN).

https://github.com/godotengine/godot/assets/180032/f30ac7ad-d954-410a-907f-355f0c038913

PC specifications

- **CPU:** Intel Core i9-13900K
- **GPU:** NVIDIA GeForce RTX 4090
- **RAM:** 64 GB (2×32 GB DDR5-5800 C30)
- **SSD:** Solidigm P44 Pro 2 TB
- **OS:** Linux (Fedora 39)

Is your GPU overclocked or undervolted? Is the GPU properly seated in the PC (i.e. does it suffer from sagging issues that are prevalent on high-end GPUs due to their weight)? If you encounter GPU sag, consider installing a PCI-E support bracket, or rotate your PC case sideways so the GPU is no longer sagging.

DevLewa commented 5 months ago

I can reproduce this issue. For me, it crashes if 7 or more lights are active and set to a size > 0. VRAM utilization doesn't go above 2.5 GB, but the GPU load spikes to 100% and it crashes shortly after with the same error messages shown in the OP.

Specs:

Calinou commented 5 months ago

@BernhardMt @DevLewa Could you try increasing the TDR timeout duration as described in https://helpx.adobe.com/substance-3d-painter/technical-support/technical-issues/gpu-issues/gpu-drivers-crash-with-long-computations-tdr-crash.html and then bake lightmaps again?

BernhardMt commented 5 months ago

@Calinou Thanks for your help. Setting the TDR timeout registry keys seems to have solved the issue: the MRP no longer crashes, and baking completes successfully even when I double the number of lights.

clayjohn commented 2 months ago

We can fix this issue by splitting up the compute shader commands into multiple passes so that smaller chunks of work are sent to the GPU at a time (e.g. divide the atlas into 4 regions and send 4 commands instead of one big command).
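
To illustrate the idea, here is a minimal sketch of the region splitting. It is not the actual lightmapper code: `Region`, `split_atlas` and `dispatch_in_chunks` are hypothetical names made up for this example, and the `dispatch` callback stands in for whatever records and submits one compute dispatch restricted to a region.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical rectangle type for this sketch (not Godot's Rect2i).
struct Region {
    int x, y, width, height;
};

// Split a width x height atlas into a grid of at most tiles_x * tiles_y regions.
// Tile size is rounded up, and tiles on the right/bottom edge are clamped so the
// regions cover the atlas exactly once without overlapping.
std::vector<Region> split_atlas(int width, int height, int tiles_x, int tiles_y) {
    std::vector<Region> regions;
    const int tile_w = (width + tiles_x - 1) / tiles_x;   // ceiling division
    const int tile_h = (height + tiles_y - 1) / tiles_y;  // ceiling division
    for (int ty = 0; ty < tiles_y; ++ty) {
        for (int tx = 0; tx < tiles_x; ++tx) {
            const int x = tx * tile_w;
            const int y = ty * tile_h;
            if (x >= width || y >= height) {
                continue;  // nothing left to cover in this grid cell
            }
            regions.push_back({x, y,
                               std::min(tile_w, width - x),
                               std::min(tile_h, height - y)});
        }
    }
    return regions;
}

// Call `dispatch` once per region instead of once for the whole atlas.
// Assumption: the callback submits the work for a region as its own command,
// so each GPU submission is shorter. Returns the number of dispatches issued.
template <typename DispatchFn>
int dispatch_in_chunks(int width, int height, int tiles_x, int tiles_y, DispatchFn dispatch) {
    int submitted = 0;
    for (const Region &region : split_atlas(width, height, tiles_x, tiles_y)) {
        dispatch(region);
        ++submitted;
    }
    return submitted;
}
```

With `tiles_x = 2` and `tiles_y = 2`, this gives the 4 regions mentioned above. The grid size would likely need tuning (or to scale with atlas size and ray count) so that each individual command stays well below the driver timeout.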