smol-rs / async-executor

Async executor
Apache License 2.0

1.9.0 may need to be yanked #103

Closed by james7132 6 months ago

james7132 commented 7 months ago

https://github.com/bevyengine/bevy/issues/12044 reports that the update to 1.9.0, which is SemVer compatible with the prior 1.7.x and 1.8.x releases, is currently causing all fresh fetches and builds of Bevy to lock up on start. This is likely due to a bug in #93. I'm still trying to root-cause the issue.

taiki-e commented 7 months ago

Thanks. I yanked 1.9.0 now.

james7132 commented 7 months ago

Thank you. I'll try to figure out why this was failing, and incorporate the findings and fixes into #102, as it does seem to fix one of the problems introduced.

zeenix commented 7 months ago

FWIW, this version also breaks zbus.

james7132 commented 7 months ago

The problem seems to be coming from the use of thread-local AtomicWakers. Removing them seems to let the executor work properly, although it then yields to the OS for far too long.

james7132 commented 7 months ago

I've narrowed the problem down to the fact that the executor was storing wakers in two separate locations: the Sleepers and the thread-local AtomicWakers. If a thread was awoken by the thread-local wakers, the sleeping state of the corresponding ticker was not updated.

These two storage mechanisms for wakers likely need to be consolidated and synchronized in some form.
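
To make the desynchronization concrete, here is a toy model of it, written with only the standard library. It is a sketch under assumptions, not the crate's code: `ToyExecutor`, its two sets, and every method name are hypothetical stand-ins for the real Sleepers registry and the thread-local waker slots.

```rust
use std::collections::HashSet;

/// Toy model (hypothetical, not async-executor's real types): the same
/// "this ticker is asleep" fact is recorded in two separate places.
#[derive(Default)]
struct ToyExecutor {
    /// Store 1: ticker ids the shared registry believes are asleep.
    sleepers: HashSet<usize>,
    /// Store 2: ticker ids that have a waker parked in a per-thread slot.
    local_wakers: HashSet<usize>,
}

impl ToyExecutor {
    /// Going to sleep registers the ticker in both stores.
    fn sleep(&mut self, id: usize) {
        self.sleepers.insert(id);
        self.local_wakers.insert(id);
    }

    /// Wake path that only touches the per-thread slot. The shared
    /// registry is left believing the ticker is still asleep.
    fn wake_via_local_unsynced(&mut self, id: usize) -> bool {
        self.local_wakers.remove(&id)
    }

    /// Wake path that updates both stores together.
    fn wake_via_local_synced(&mut self, id: usize) -> bool {
        let had_waker = self.local_wakers.remove(&id);
        if had_waker {
            self.sleepers.remove(&id);
        }
        had_waker
    }

    /// Number of tickers the registry thinks are asleep although no
    /// waker is parked for them any more.
    fn stale_sleepers(&self) -> usize {
        self.sleepers.difference(&self.local_wakers).count()
    }
}
```

In this model, calling `sleep(1)` followed by `wake_via_local_unsynced(1)` leaves one stale entry in `sleepers`, while the synced path leaves none. Keeping a single source of truth for the sleeping state would remove the need to synchronize at all.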

james7132 commented 7 months ago

@zeenix can you check if #102 at least fixes the deadlock? I took a quick glance at zbus and it isn't using Executor::run in any way, so this localizes the problem to just how ticking works.

notgull commented 7 months ago

Maybe my instinct here was correct: we need to keep track of the sleeper state in one central location.

james7132 commented 7 months ago

I got Bevy to work without deadlocking. We were removing the waker from Sleepers but not from the LocalQueue when dropping Ticker.
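
The shape of that fix can be sketched with a `Drop` implementation that clears every place a waker was registered. This is an illustration under assumptions, not the actual patch: `State`, its two sets, and this `Ticker` are hypothetical simplifications that only borrow the names used above.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Hypothetical shared state holding both waker registries.
#[derive(Default)]
struct State {
    sleepers: Mutex<HashSet<usize>>,
    local_queue_wakers: Mutex<HashSet<usize>>,
}

/// Hypothetical ticker that registers itself in both registries.
struct Ticker {
    id: usize,
    state: Arc<State>,
}

impl Ticker {
    fn new(id: usize, state: Arc<State>) -> Self {
        state.sleepers.lock().unwrap().insert(id);
        state.local_queue_wakers.lock().unwrap().insert(id);
        Ticker { id, state }
    }
}

impl Drop for Ticker {
    fn drop(&mut self) {
        // Deregister from the shared sleepers registry...
        self.state.sleepers.lock().unwrap().remove(&self.id);
        // ...and also from the local queue, the step described above
        // as missing.
        self.state.local_queue_wakers.lock().unwrap().remove(&self.id);
    }
}

/// Creates and drops a ticker, then reports how many entries are left
/// in (sleepers, local_queue_wakers).
fn leftover_after_drop() -> (usize, usize) {
    let state = Arc::new(State::default());
    {
        let _ticker = Ticker::new(7, Arc::clone(&state));
    } // `_ticker` is dropped here, which runs the cleanup above.
    let sleepers = state.sleepers.lock().unwrap().len();
    let local = state.local_queue_wakers.lock().unwrap().len();
    (sleepers, local)
}
```

With both removals in `drop`, `leftover_after_drop()` returns `(0, 0)`. Deleting the second removal would leave a dangling entry in the local registry, which is the kind of leftover described in the comment above.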

zeenix commented 7 months ago

> @zeenix can you check if #102 at least fixes the deadlock?

I'll check..

> I took a quick glance at zbus and it isn't using Executor::run in any way, so this localizes the problem to just how ticking works.

Actually, it is. :)

zeenix commented 7 months ago

> @zeenix can you check if #102 at least fixes the deadlock?

I'll check..

At least all the zbus tests pass on both Linux and Windows with your branch.

notgull commented 6 months ago

With release v1.9.1, we've reverted the thread-local queue changes. I intend to add these back; however, we have to be very careful about corner cases, such as those introduced by ticking the executor instead of running it.
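
For readers unfamiliar with the distinction, the following standard-library-only sketch contrasts driving an executor one tick at a time with driving it in a loop. It is a deliberately naive model, not async-executor's implementation: `ToyExecutor`, `try_tick`, and `run_until_empty` are hypothetical names, the waker does nothing, and pending tasks are simply re-queued.

```rust
use std::cell::Cell;
use std::collections::VecDeque;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// Waker that does nothing; enough for this busy-polling toy.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Hypothetical single-threaded executor with a plain FIFO queue.
struct ToyExecutor {
    queue: VecDeque<Pin<Box<dyn Future<Output = ()>>>>,
}

impl ToyExecutor {
    fn new() -> Self {
        ToyExecutor { queue: VecDeque::new() }
    }

    fn spawn(&mut self, fut: impl Future<Output = ()> + 'static) {
        self.queue.push_back(Box::pin(fut));
    }

    /// "Ticking": poll at most one queued task once.
    /// Returns false if there was nothing to poll.
    fn try_tick(&mut self) -> bool {
        let mut task = match self.queue.pop_front() {
            Some(task) => task,
            None => return false,
        };
        let waker = Waker::from(Arc::new(NoopWake));
        let mut cx = Context::from_waker(&waker);
        if task.as_mut().poll(&mut cx).is_pending() {
            // Not finished yet: put it back for a later tick.
            self.queue.push_back(task);
        }
        true
    }

    /// "Running": keep ticking until the queue is empty.
    /// Returns how many ticks were performed.
    fn run_until_empty(&mut self) -> usize {
        let mut ticks = 0;
        while self.try_tick() {
            ticks += 1;
        }
        ticks
    }
}

/// Spawns three tasks that each bump a counter, ticks once, then runs
/// the rest. Returns (count after one tick, final count, later ticks).
fn demo() -> (u32, u32, usize) {
    let counter = Rc::new(Cell::new(0u32));
    let mut ex = ToyExecutor::new();
    for _ in 0..3 {
        let c = Rc::clone(&counter);
        ex.spawn(async move { c.set(c.get() + 1) });
    }
    ex.try_tick();
    let after_one_tick = counter.get();
    let later_ticks = ex.run_until_empty();
    (after_one_tick, counter.get(), later_ticks)
}
```

Here `demo()` returns `(1, 3, 2)`: a single tick advances exactly one task, and the loop finishes the remaining two. A caller that only ticks controls when, and on which thread, each task is polled, which is why state cached per thread or per ticker needs extra care on that path.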

james7132 commented 6 months ago

Unfortunate, but thanks for addressing this.