godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License

Enough calls to lightmap_unwrap will eventually cause the game to hang. #92119

Open DataPlusProgram opened 5 months ago

DataPlusProgram commented 5 months ago

Tested versions

4.0, 4.2, 4.3dev

System information

Tested on a laptop with integrated graphics and desktop with dedicated.

Issue description

If lightmap_unwrap is called enough times, the application will hang.

The number of calls before the hang varies, even on the same machine: my PC has gotten close to 10,000 calls before a hang, while my laptop usually gets to around 2,000.

Steps to reproduce

This is the code that will lead to a hang: image

Minimal reproduction project (MRP)

unwrapTest.zip

AThousandShips commented 5 months ago

Does this happen if you do the calls in multiple frames? By, for example, doing await get_tree().process_frame every few calls?

DataPlusProgram commented 5 months ago

Trying this still results in the same hang: image

lyuma commented 5 months ago

It's a deadlock: the main thread is calling join() on a worker thread, while the worker thread is waiting on the condition variable.

Main thread: godot_xatlas_deadlock_pt1 (image)

All 4 worker threads: godot_xatlas_deadlock_pt2 (image)

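It seems that the main thread is invoking .notify_one() without holding the lock, which can lead to a deadlock if the notify call occurs while the worker thread is not at the wait() call: the notification is lost, and the worker then waits forever while the main thread blocks in join().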

The textbook usage of a condition variable would require the joining thread to own the lock before invoking notify_one().
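To illustrate the pattern being described, here is a minimal C++ sketch of a worker that is shut down and joined safely. It is not the xatlas or Godot code: the Worker class, its members and run_cycles are hypothetical names made up for this example. The sketch assumes the essential point is that the shutdown flag is changed while the mutex is held, so the change cannot land between the worker's predicate check and its decision to block. Whether notify_one() itself is called before or after releasing the lock is a separate choice; here it is called after.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a scheduler worker; not xatlas's actual code.
class Worker {
public:
    Worker() : thread_([this] { run(); }) {}

    ~Worker() {
        {
            // Update the shared flag while holding the mutex. The worker only
            // checks the flag while it holds the same mutex, so the update
            // cannot slip in between its check and its call to block.
            std::lock_guard<std::mutex> lock(mutex_);
            shutdown_ = true;
        }
        cv_.notify_one();
        thread_.join();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate overload checks shutdown_ before blocking and again
        // after every wake-up, so a notification sent before the worker
        // reaches this line is not lost.
        cv_.wait(lock, [this] { return shutdown_; });
    }

    std::mutex mutex_;
    std::condition_variable cv_;
    bool shutdown_ = false;
    std::thread thread_;  // declared last so the members above exist first
};

// Creates and tears down a worker repeatedly, loosely mimicking a scheduler
// that is rebuilt for every bake. Returns the number of completed cycles.
int run_cycles(int cycles) {
    int completed = 0;
    for (int i = 0; i < cycles; ++i) {
        { Worker worker; }  // destructor signals shutdown and joins
        ++completed;
    }
    return completed;
}
```

Calling run_cycles(1000) should return without hanging. If the flag were instead written without taking the mutex, the write and the notify could both happen after the worker evaluated the predicate but before it blocked, which is the lost wakeup suspected above.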

lyuma commented 5 months ago

The crazy thing is that, if true, you've just discovered a deadlock in an extremely widely used library: https://github.com/jpcy/xatlas/blob/master/source/xatlas/xatlas.cpp#L3157

One guess is that other applications don't tear down the whole TaskScheduler for each bake attempt, so they are much less likely to trigger this deadlock in practice.