Cysharp / R3

The new future of dotnet/reactive and UniRx.
MIT License
2.27k stars 99 forks source link

Platform-dependent tests and ReactiveProperty race condition fix #230

Closed nepolak closed 4 months ago

nepolak commented 4 months ago

Fixed some tests, results of which were dependent on the platform used.

Fixed ReactiveProperty race condition

Also (seems to be) mentioned in https://github.com/Cysharp/R3/issues/229

neuecc commented 4 months ago

I understand that using a lock is the most reliable approach. However, for ReactiveProperty (and Subject), I want to avoid using locks. I'd like to know the conditions under which race conditions occur, and based on that, choose a better solution.

nepolak commented 4 months ago

I understand that using a lock is the most reliable approach. However, for ReactiveProperty (and Subject), I want to avoid using locks. I'd like to know the conditions under which race conditions occur, and based on that, choose a better solution.

That's understandable.

It may occur like this:

  1. Observer calls Subscribe on a ReactiveProperty, receives current value, but is not subscribed yet. It stays here: https://github.com/Cysharp/R3/blob/4fc537d03444c102f323abd2242852be59f1c7b6/src/R3/ReactiveProperty.cs#L213
  2. In the mean time, from another thread, OnNext is being called and a new value is written and pushed without synchronization: https://github.com/Cysharp/R3/blob/4fc537d03444c102f323abd2242852be59f1c7b6/src/R3/ReactiveProperty.cs#L116
  3. Observer is subscribed: https://github.com/Cysharp/R3/blob/4fc537d03444c102f323abd2242852be59f1c7b6/src/R3/ReactiveProperty.cs#L225

And so, although the items were pushed like this:

  1. First (e.g. from the constructor)
  2. Second

What the observer sees is:

  1. First

The 'Second' is left missing, and so observer thinks the current value is First. This is depicted in the log in this comment (third run): https://github.com/Cysharp/R3/issues/229#issue-2372532001

neuecc commented 4 months ago

I think it is overkill to lock around the entire OnNext. In dotnet/reactive, BehaviorSubject is only locked when the value is set.

If we put SubscribeCore's observer.OnNext in lock and then minimize the lock, get this.

public void OnNext(T value)
{
    ObserverNode? node;
    ObserverNode? last;

    OnValueChanging(ref value);
    lock (this)
    {
        this.currentValue = value;
        node = Volatile.Read(ref root);
        last = node?.Previous;
    }
    OnValueChanged(value);

    OnNextCore(value, node, last);
}

void OnNextCore(T value, ObserverNode? node, ObserverNode? last)
{
    ThrowIfDisposed();
    if (IsCompleted) return;

    while (node != null)
    {
        node.Observer.OnNext(value);
        if (node == last) return;
        node = node.Next;
    }
}

However, as a worst-case scenario, it will break down if last is deleted by Dispose while OnNextCore is running.

If we cannot tolerate that, we can divide the code into two systems: BehaviorSubject, which is completely thread-safe, and ReactiveProperty, which is not thread-safe for Subscribe/OnNext. and the observer-store of BehaviorSubject should be modified to be a FreeListCore like Subject, not a linked list.

ptasev commented 4 months ago

A thread-safe version of [ReadOnly]ReactiveProperty would be useful for my team. We're particularly interested in the safety of Value and Subscribe.

Our particular use case doesn't care about missing a value so much. We care about not having data tearing in value, or breaking the subscriber list.

neuecc commented 4 months ago

As explained in the Concurrency Policy section, the OnNext of Subject is not thread-safe for subsequent operators. This is also true for dotnet/reactive, and ReactiveProperty's Value is essentially the same as OnNext. https://github.com/Cysharp/R3?tab=readme-ov-file#concurrency-policy

Therefore, when updating Value in a multi-threaded environment, it must first be enclosed in a lock. Furthermore, for ReactiveProperty, when considering thread safety, Subscribe also requires a lock. We want to proceed with this policy. As a result, we will not change the current implementation of ReactiveProperty.

neuecc commented 4 months ago

I've updated ReadMe about this policy. https://github.com/Cysharp/R3/commit/2968aa595d55a1cec657ab9d951530ada9ae158a

ninjaoxygen commented 4 months ago

@neucec I think this still leaves a bug in Switch then? Or is the underlying cause of that a race between the Subscribe and the first OnNext... so I would need to derive a class from ReactiveProperty and add the locks there?

Would it be possible to add the fully-locked version as BehaviourSubject and note in the README that it comes at a significant cost?

neuecc commented 4 months ago

Yes, I thought about it a lot, but it is hard and difficult to lock everything! The policy is unchanged, but we have added a SynchronizedReactiveProperty in v.1.1.15 This is the thread-safe ReactiveProperty you request, where everything is locked. We believe that if you change it to this, it will work fine. Thanks!

ptasev commented 4 months ago

I'm curious, does Value {get;} need a lock to prevent data tearing when T is a large struct for example? Or at least a Volatile.Read?

neuecc commented 4 months ago

Okay, I think it is better to implement this conservatively here, so I will also add lock in get.