godotengine / godot

Godot Engine – Multi-platform 2D and 3D game engine
https://godotengine.org
MIT License
90.08k stars 21.14k forks source link

Nested coroutines crash the engine after freeing the executing node #47703

Open Paspartout opened 3 years ago

Paspartout commented 3 years ago

Godot version: 3.2.3(tag) and 3.3rc7(hash mentioned in news post), both built using asan=True

OS/device including version: Linux x64 for reproduction, but issue happened on Windows and WASM/Web build too

Issue description:

When running the minimal example project without the address sanitizer everything seems fine at first at first, but eventually the engine would crash. That's what happened in a bigger project of mine. After investigation I found out that it is caused by waiting on a coroutine to complete while also trying to free the corresponding node(see example below).

Steps to reproduce:

  1. Build one of the mentioned godot versions with asan=true.
  2. Run the minimal example project
  3. Crash with following log: https://gist.github.com/Paspartout/3411dd1f0a9c49161f49ca99f7df03cb

Minimal reproduction project:

The following script attached to any node will trigger the crash/memory corruption:

extends Node

var timer: Timer

func _ready():
    timer = Timer.new()
    add_child(timer)
    timer.start()
    coro() # switching this to coro_ok() fixes the crash
    queue_free()
    print("Incoming Crash/Memory corruption")

func wait(time: float):
    timer.wait_time = time
    timer.start()
    yield(timer, "timeout")

func coro():
    while true:
        yield(wait(1.0), "completed")
        print("Please don't free me")

func coro_ok():
    while true:
        timer.wait_time = 1.0
        timer.start()
        yield(timer, "timeout")
        print("You can free me!")

coro_crash.zip

HybridEidolon commented 2 years ago

I ported the given script to GDScript 2.0 and ran it in v4.0.dev.20211015.official [f113dc986]

extends Node

var timer: Timer

func _ready():
    timer = Timer.new()
    add_child(timer)
    timer.start()
    coro() # switching this to coro_ok() fixes the crash
    queue_free()
    print("Incoming Crash/Memory corruption")

func wait(time: float):
    timer.wait_time = time
    timer.start()
    await timer.timeout

func coro():
    while true:
        await wait(1.0)
        print("Please don't free me")

func coro_ok():
    while true:
        timer.wait_time = 1.0
        timer.start()
        await timer.timeout
        print("You can free me!")

Neither coro() nor coro_ok() will crash the engine when used.

On v3.4.rc2.official.23955fc28, it is also fixed, somehow.

Edit: oh, I should mention I ran them presumably without asan. So they may still be present, but it doesn't appear to affect the official builds here.

Capital-EX commented 2 months ago

I was talking about concurrency in Godot with a small discord group. And, the issue around freeing an objects which have pending coroutines came up. I can confirm this issue forced me to upgrade a project from Godot 3.0 to Godot 4.0 (where the issue has been fixed for await #24311). Additionally, it resulted in me abandoning a coroutine-based library and caused a major rewrite to a sizable project.

This bug is highly non-deterministic, but will always fail when ran with an address sanitizer (see #74695).

(possibly related to #71998 and #74695)