Open Jimmio92 opened 1 year ago
Not sure whether it's a bug, or intended behaviour for threads. Maybe some of @godotengine/core devs may say better.
Pretty sure it's a bug as there's no way to use threads without a warning/error.
Also, with all the talk about Godot becoming AAA compatible (see https://godotengine.org/article/whats-missing-in-godot-for-aaa/ ).... the change to C++ standard threads is backwards from what they're wanting. Huge performance losses on Windows -- each mutex lock/unlock has to jump ring levels to the kernel and back. Critical Sections can be used as a drop in replacement and they often do not have to jump to the kernel.
Design by committee is always challenging, but we should never backpedal, imo.
The thing is that when you do await discover_done
the function will be resumed by the thread that emitted the signal. So the wait_to_finish()
call will happen on the thread, which triggers the can't wait to finish on itself
error (which means it never really waited). Such error will end the _ready()
call/resume, which then will clean up the stack, freeing the Thread
object that was created locally, thus triggering the warning (thread destroyed without call to wait_to_finish()
). It seems to me everything is working as intended.
You can solve this by emitting the signal in a deferred call, since that will be queued and then called from the main thread. The issue is that there is no emit_deferred()
(which I think would an welcome addition), so you need to do something like emit_signal.call_deferred(discover_done.get_name(), true)
(the get_name()
here just to avoid strings and typos).
For me the only "solution" to this would be adding an emit_deffered()
method to Signal
to be easier to use on this case.
Thank you for providing a way to work around my issue.. but I still feel this can be more elegantly solved by just allowing detach
to be called on the internal thread object and changing the logic on thread destruct internally as I've done in my personal fork of Godot.. but I'm totally unfamiliar with GitHub and the whole process. (Note I also renamed "wait_to_finish" to "join" to better match... everything else... but I did so without talking to anyone, so I closed the pull request -- I don't know if I even did that correctly or not)
I've been kind of upset about the whole issue -- I love Godot, but having dug into the source here and finding they threw out OS-specific, perfectly good, and more performant code in favor of black-box magic voodoo provided by the C++ standard library when Juan just started mentioning things needing fixed for AAA usage... There's some mixed signals here. Is the engine getting more performant or easier to write? You cannot have both.
While it's sad that std::mutex
is not backed by Windows' critical section objects, if you can prove actual performance issues as a result of that, please open a proposal so we re-evaluate the matter. In scripting you are not ayway expected to squeeze the processor at such a low level. Godot also has SpinLock internally, used where it matters.
Regarding the mechanics of threads (self-destruction), naming of methods, etc., you can as well open as proposal. Probably you're not alone in believing wait_to_finish() is a needlessly non-standard name.
Now, I believe @vnen's reply addresses this sufficiently as not a bug.
I'll write up a separate proposal for the Windows threading stuff; I shouldn't have mentioned it at all here just because it was related to threads... it seems to have distracted from the actual issue.
I personally feel the current Thread implementation can exhibit undefined behavior as it detaches a thread during the deconstruct. This seems fine for an unjoined thread, but a thread deconstructing from a join is already gone -- a detach at this point is likely in error. It also states you must call wait_to_finish, which is not true -- it detaches when deconstructing. These two things IMO are bugs and the reason for this report, a proposal for my changes, and (an attempt at a) pull request to at least partially fix it.
Though, I am now seeing that detached threads are technically supported as it currently stands -- if the destructor is called it detaches -- but I was under the impression it's best to call detach early, not right before/during destruct, though this may be fine and defined behavior (I'm not sure) and thus only half as big of a bug as I thought originally. Still, don't detach when not joinable even if detach doesn't get added to the script side (an if check in the destructor would solve it), though I really feel it should be up to the end-user/dev to call detach/join depending on their needs, not for detach to automagically get called behind the scenes.
Again, sorry for bringing the OS specific stuff up in this; that's not what this bug report is about -- warnings, errors, and lack of detach is all this is about.
Detaching happens to try to avoid a crash as much as possible. If the thread has not been joined, the C++ runtime is free to crash the program. That's why the user is warned they should join (wait_to_finish()
). Detaching is not done for supporting fire-and-forget thread dynamics in Godot (which, on the other hand, would require much more than that).
Godot version
4.0 master
System information
Win 10, AMD Radeon RX 590
Issue description
I have a thread to handle UPNP stuff which blocks when called.
When the thread is done, it signals. The main thread
await
s this signal, and then callswait_to_finish
-- this throws an error about not being able to join a thread from itself. This means it's being cleaned up already like expected -- yet there is also a warning saying we must callwait_to_finish
or it won't be properly cleaned up.I have solved this problem with my personal fork of Godot by changing
Thread
and everything in the engine that uses it to use normal naming conventions (wait_to_finish
becomesjoin
, like everything else is named), and providing detach. I also moved the "cleanup" code, of which there was none (why was that warning even there?), to the destructor, making the class no longer require the warning about callingwait_to_finish
, and also bringing it more in line with what most people expect out of a thread.My only concern is.... I don't know what I'm doing when it comes to GitHub -- do I make a pull request? The fact that I changed the name of a method that's in.... some 45 files if my text editor didn't lie... means I doubt this will make it in to 4.0, if ever. But there's definitely a problem here, even if my code isn't the solution.
Steps to reproduce
Create a thread; don't call
wait_to_finish
on it. Warning appears that seems needless.Create a thread and
await
a signal saying it's done. After the await, callwait_to_finish
. Note it throws an error.Minimal reproduction project
test_case_thread.zip Open the example.tscn scene and run it with F6. Note the warning about not having called
wait_to_finish
and also the error about trying to call join on itself.NOTE: I tried to keep it exactly accurate to what went wrong so it still does do a UPNP discover -- this is not part of the problem, merely there to be the same timed external event.