Resolving bindables via dependency injection is unsafe

frenzibyte commented 1 year ago

Dependency injection happens on the load thread, and resolving bindables require creating a bound copy of the cached bindable. When a drawable is asynchronously loaded, there's a chance for the resolved bindable to be desynchronised from the remote bindable, if the remote bindable changes as the local bindable is being bound.

This is the direct reason for the ArgumentOutOfRangeException test failures that've been recurring in osu!. Here's brief logging of what's going on:

[runtime] 2023-01-22 21:51:14 [verbose]: [ThreadedTaskScheduler (LoadComponentsAsync (standard)), 39] Creating new playlist item bindable list with number 6720
[runtime] 2023-01-22 21:51:14 [verbose]: [Update (GameThread), 19] BindableList<PlaylistItem>(6693): adding 1 item(s) to list
[runtime] 2023-01-22 21:51:14 [verbose]: [ThreadedTaskScheduler (LoadComponentsAsync (standard)), 39] Binding bindable instance 6720 to 6693
[runtime] 2023-01-22 21:51:14 [verbose]: [ThreadedTaskScheduler (LoadComponentsAsync (standard)), 39] Parsing content of bindable instance 6693 to 6720
[runtime] 2023-01-22 21:51:14 [verbose]: [ThreadedTaskScheduler (LoadComponentsAsync (standard)), 39] Bindables 6720 and 6693 have equal content (0, 0), nothing changed
...
[runtime] 2023-01-22 21:51:14 [verbose]: [Update (GameThread), 19] BindableList<PlaylistItem>(6718): removing item with index 0, item: osu.Game.Online.Rooms.PlaylistItem, caller is 6693
[runtime] 2023-01-22 21:51:14 [verbose]: [Update (GameThread), 19] BindableList<PlaylistItem>(6719): removing item with index 0, item: osu.Game.Online.Rooms.PlaylistItem, caller is 6693
[runtime] 2023-01-22 21:51:14 [verbose]: [Update (GameThread), 19] BindableList<PlaylistItem>(6720): removing item with index 0, item: doesn't exist, caller is 6693, stack trace
[runtime] 2023-01-22 21:51:14 [error]: An unhandled error has occurred.
[runtime] 2023-01-22 21:51:14 [error]: ---> System.ArgumentOutOfRangeException: Index was out of range. Must be non-negative and less than the size of the collection. (Parameter 'index')
[runtime] 2023-01-22 21:51:14 [error]: at osu.Framework.Bindables.BindableList`1.removeAt(Int32 index, BindableList`1 caller)
[runtime] 2023-01-22 21:51:14 [error]: at osu.Framework.Bindables.BindableList`1.removeAt(Int32 index, BindableList`1 caller)
[runtime] 2023-01-22 21:51:14 [error]: at osu.Framework.Bindables.BindableList`1.removeAt(Int32 index, BindableList`1 caller)

As can be seen above, one of the bindables (6720) failed to copy from a remote bindable (6693), since the remote bindable was being changed in a separate thread, and the local bindable failed to see the change.

I'm seeing multiple ways through fixing this, and I need more opinions on the matter to come to an amicable solution here:

Delay binding to remote bindables until LoadComplete (this follows a rule of thumb we use on bindables in general, which is binding and listening to value changes in LoadComplete / update thread)
Add thread safety to binding bindables via mutex lock (skeptical of this way as it tends to increase the chances of deadlocks, but need voices)

cc/ @ppy/team-client

bdach commented 1 year ago

Both proposals are pretty scary and require extensive testing. Delaying to LoadComplete() seems much safer to me than mutexes though. Especially that I am worried of binding loops (we frequently have those happen, and a lot of the time the only reason they don't cause stack overflows is because of the logic that doesn't propagate value changes on set unless the value has actually meaningfully changed).

frenzibyte commented 1 year ago

I could make the first proposal less scary by performing an unbound copy in the regular DI path, but delay the actual binding process until LoadComplete.

bdach commented 1 year ago

Doesn't do much for me to be honest. The scariness of the change is in its sweeping scope and unknowable potential consequences, rather than the details of how it's done.

frenzibyte commented 1 year ago

Preliminary work on the former direction can be viewed here. Not in a runnable state yet.

smoogipoo commented 1 year ago

We used to subconsciously be very aware of bindables' shortcomings in BDL as this issue has bitten us many times in the past, so the former is acceptable. However, I don't know if it's something that should be special cased in BDL like the branch above, since it can have other more hidden side effects such as potentially loading a lot of data in the first frame.

Just to be clear, what you'd like in the second solution is for a lock to be placed around BindTo(), correct? On paper I don't see why that wouldn't work, since we already use LockedWeakList and I imagine it would just be a matter of moving that lock up one level.

frenzibyte commented 1 year ago

The former branch would populate an unbound copy of the bindables to the drawable at BDL, in case the drawable needs to fetch the bindables' states at BDL.

As for locking, I'd have to think of a way for the following to work correctly:

thread #1: BindableX changes
thread #2: BindableY.BindTo(BindableX)

Not sure how sane it is to enter lock using the target bindable (i.e. lock (other.bindings) in addition to the local's own bindings list)

EVAST9919 commented 1 year ago

ppy / osu-framework

Resolving bindables via dependency injection is unsafe #5635