godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
88.88k stars 20.15k forks source link

Calling wait_to_finish() on a thread causes lag #48108

Open Matti1903 opened 3 years ago

Matti1903 commented 3 years ago

Godot version: 3.3 stable

OS/device including version: Windows

Issue description: When i call wait_to_finish() on a thread it causes lag. this problem only occurs in version 3.3, what makes threading in this version unusable for me.

Steps to reproduce: You can reproduce this with a simple test project, just watch the frame drop. If you reproduce this problem, the FPS drop may not look like much, but the lag is still immense.

var thread

func _ready():
    thread = Thread.new()

func _process(delta):
    if Input.is_action_just_pressed("ui_accept"):
        thread.start(self, "thread_function")
    print(Engine.get_frames_per_second())

func thread_function(args):
    print("Thread running")
    thread.call_deferred("wait_to_finish")

Minimal reproduction project: Thread Test Project.zip

Xrayez commented 3 years ago

Tested the project, I don't experience the frame drop on pressing ui_accept (at most drops down to 59 FPS).

It may be that the issue you experience is related to thread modernizations done in 3.3: #45618.

Can you specify which Windows version do you use exactly which may be different? Also hardware specifications...

Matti1903 commented 3 years ago

Sure, I'm using Windows 10 build 19042.928

Xrayez commented 3 years ago

Oki, I'm on the same version. I've tested a few times more and yeah it does lag on starting a thread, but that happens like in 1/20 cases, and as far as I know, it was also a problem in previous versions of Godot prior to 3.3 (which is still a bug, though).

You say that:

If you reproduce this problem, the FPS drop may not look like much, but the lag is still immense.

Since there's no visual feedback, I've also modified the test project to add at least a single moving object:

Minimal reproduction project: Thread.zip

Matti1903 commented 3 years ago

I forgot to include the visual feedback in the test project. I just tested it with the modified project you uploaded.

This is the first test in version 3.2.3 (it almost never happens, only once at the start) https://user-images.githubusercontent.com/82829772/115781990-66922e00-a3bb-11eb-810d-5da3ab1d3489.mp4

This is the project in version 3.3, here it happens almost every time I hit enter. https://user-images.githubusercontent.com/82829772/115782265-c25cb700-a3bb-11eb-92d0-4416398b9190.mp4

RandomShaper commented 3 years ago

Before the threading rewrite, Windows threading was done in a way that the OS was doing thread pooling under the hood. That's the reason why creating a GDScript Thread object was faster most of the time (Windows would only have to create the thread once, which is the costly operation, and then reuse it to submit the thread function as a work item for it next times).

Whether it may be good or not that we were relying on OS auto-pooling is not I'll judge here. If we believe that having such a layer of magic in Godot is good, we should implement auto-pooling in the core of the engine so it does the same on every platform, or provide a thread pooling API that users can just use, keeping the option to manage threads by themselves.

All that said, creation of threads is in general not guaranteed to be fast; for that reason, they shouldn't be created at runtime. While it's true that the usage shown in these MRPs is harmed by the new implementation –which has to pay the high cost of actual thread creation every time–, the right approach from the scripting point of view is implementing some mechanism for pooling or threaded work queue to stop relying on the implementation details (again, unless we really add to Godot some higher level threading APIs that can provide speed guarantees).

In GDScript it'd be better to do something like this (this work queue can only hold a single "work item", but is enough to get the idea):

extends Node2D

var thread = Thread.new()
var sem = Semaphore.new()
var keep_working = true
var work_data

var i = 0

func _ready():
    thread.start(self, "thread_function")
    $Tween.interpolate_property($Godot, "position:x", null, 512, 5)
    $Tween.start()

func _exit_tree():
    keep_working = false
    sem.post()
    thread.wait_to_finish()

func _process(delta):
    if Input.is_action_just_pressed("ui_accept"):
        work_data = i
        i += 1
        sem.post()

func thread_function(args):
    sem.wait()
    while keep_working:
        print("Work running (%d)" % work_data)
        sem.wait()
Xrayez commented 3 years ago

Whether it may be good or not that we were relying on OS auto-pooling is not I'll judge here. If we believe that having such a layer of magic in Godot is good, we should implement auto-pooling in the core of the engine so it does the same on every platform, or provide a thread pooling API that users can just use, keeping the option to manage threads by themselves.

There are existing built-in classes such as NoiseTexture which do spawn threads on each instancing, as concerned by Zylann in https://github.com/godotengine/godot/issues/26520#issuecomment-469032819, so yeah I'd kind of expect this to be handled by Godot in a more robust way, especially since Godot is currently mainly targeted at beginners.

Of course, the NoiseTexture generation in thread could also be fixed I guess, but again I think it shouldn't be difficult for beginners to use the class without reverting to threaded loading etc.

So again, depends on what kind of userbase to target. If not, all of this should be thoroughly documented, or write those threading techniques at the Godot demos repository.

RandomShaper commented 3 years ago

I think it's good that Godot can target from beginners to advanced users. However, in order to be as beginner-friendly as possible, I agree that the documentation of Thread, etc. could be improved to make it clear that threading has a learning curve and is not needed for most projects.

Regarding NoiseTexture, I also agree that users shouldn't need to be aware of its threading internals; at most, knowing that it's generated asynchronously. In that case the engine should have its own under-the-hood wise threading use, much like it does in other areas.

ndee85 commented 2 years ago

Hey everyone. I want to tune in here. I am seeing a regression in HTTPRequests and I asume it is coming from the multi threading modernization. Running the HTTPRequest.request command with use_threads gives a lag of round about 100ms(on the machine I tested on) everytime the request is executed. It does not depend on server response times or anything

Here is what I have found so far. I see a drop in performance from godot version 3.2.4-rc2 to 3.2.4-rc3. I strongly asume this is caused by this refactor: https://github.com/godotengine/godot/pull/45618

The funny thing is, this is not happing on all machines. The machine I experience the bug is a windows 10 machine(Intel(R) Xeon(R) W-2133 CPU @ 3.60GHz 3.60 GHz). Running fine on windows 10 (Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz 3.20 GHz) and also running fine on windows 11(AMD Ryzen 5800H with Radeon Graphic 3.20GHZ)

Here are two pictures from my testproject. Godot_Network_v3 2 4-rc3 Godot_Network_v3 2 4-rc2

My testproject is setup like that: Timer that executes every second a http request to a random online rest api. Then running the profiler and seeing those graphs. As said, godot 3.2.4-rc2 is working fine. Having watched through the release logs for rc3 I have found the above pull(45618) request. Since @RandomShaper has answered in this thread already, I thought it would be good to add my findings here.

Here is the testproject if someone wants to take a look.

HTTPRequestTest.zip

EDIT If you guys think, that it makes sense to open a separate issue for that, let me know.

Calinou commented 2 years ago

The threading modernization made creating threads more expensive, especially on Windows (which always had trouble with threading performance compared to POSIX operating systems).

The best way to work around this is to create a thread pool that you reuse on application startup (or anytime when having a stutter doesn't cause issues).

ndee85 commented 2 years ago

Ok, the strange thing is with HTTPRequest. It handles the threading internally and is also affected by this problem. Just made another test. Disabling use_threads actually removes the drops seen in the above picture.

ButI noticed in the past that threading increased performance of HTTPRequests. So I think we have an issue here since the HTTPThreading is not exposed in any sort.

What would be the current workaround. Disable the HTTPRequest internal threading and handle threading all by myself? Or shouldn't I use threads for HTTPRequests at all?