Pauan / rust-signals

Zero-cost functional reactive Signals for Rust
MIT License
675 stars 37 forks source link

async_std::task::yield_now() not working as expected #63

Closed Qix- closed 1 year ago

Qix- commented 1 year ago

The following works as expected.

use async_std::task;
use futures_signals::signal::{SignalExt, Mutable};

macro_rules! yield_now {
    () => {
        task::sleep(std::time::Duration::from_millis(1)).await;
    }
}

#[async_std::main]
async fn main() -> std::io::Result<()> {
    let my_state = Mutable::new(5);

    task::spawn(my_state.signal().for_each(|value| {
        // This code is run for the current value of my_state,
        // and also every time my_state changes
        println!("new value: {}", value);
        async {}
    }));

    println!("updating");
    *my_state.lock_mut() = 10;
    yield_now!();
    *my_state.lock_mut() = 15;
    yield_now!();
    *my_state.lock_mut() = 20;
    yield_now!();

    Ok(())
}
updating
new value: 5
new value: 10
new value: 15
new value: 2

But changing the yield_now macro to the follow does not:

macro_rules! yield_now {
    () => {
        task::yield_now().await;
    }
}
updating
new value: 5
new value: 20

Shouldn't signals be updated at each event loop tick? Perhaps I'm misunderstanding yield_now()'s purpose... but I don't think I am. Is there really a timing element to rust-signals?

Just trying to figure out how the semantics work at the basic level. Any info is appreciated :)

Pauan commented 1 year ago

Timing and scheduling and yielding depends on the semantics of the executor, futures-signals has no control over that. Each executor (async-std, tokio, wasm-bindgen-futures, etc) has its own way of scheduling tasks.

The order that tasks are scheduled to run is arbitrary and random, you cannot rely on it. Tasks can even run on multiple threads, which means that multiple tasks can be running at the same time. If you need to guarantee a specific order, then you must use a synchronization system like semaphores, locks, channels, Stream, Signal, Future, etc.

Your code doesn't use any synchronization, it just implicitly assumes that the order of execution will be lock_mut -> for_each -> lock_mut -> for_each -> lock_mut -> for_each. But you can't rely upon that, tasks are asynchronous and can be spawned on a different thread.

Even if they're spawned on the same thread the order that tasks are run is an implementation detail of the executor, the executor can run tasks in whatever order it wishes. It's entirely possible for the order to be lock_mut -> lock_mut -> lock_mut -> for_each (which is what happens with yield_now).

Your task::sleep code happens to work because the for_each task happens to run faster than 1 millisecond. But that's just a happy accident, if the for_each task takes longer than 1 millisecond, then you'll see the same issue:

    task::spawn(my_state.signal().for_each(|value| {
        // This code is run for the current value of my_state,
        // and also every time my_state changes
        println!("new value: {}", value);
        async {
            task::sleep(std::time::Duration::from_millis(10)).await;
        }
    }));

This is just how async code works. Even when running on a single thread, multiple tasks are multiplexed so that they are running at the same time. So you cannot rely upon implicit timings, because unlike sync code there is no defined order for async code (unless you manually synchronize the order yourself).

Looking at the documentation for yield_now, it doesn't actually delay the task to the next event loop tick, it just schedules the current task to the end of the current event loop tick. It is intended for splitting up a CPU-intensive program into multiple time slices.

And this is confirmed by looking at the implementation, which does a synchronous wake_by_ref directly inside of poll, which means that the task will be run on the current event loop tick, not the next event loop tick. So the behavior of yield_now is correct.

As for the behavior of signals, that is also correct. As explained in the tutorial, a Signal only cares about the current value, it is similar to a mutable variable or a RefCell. It does not store any history of past values.

This means that if multiple changes happen within a small period of time, the intermediate changes will be lost. At the time that your for_each task runs, the current value is 20, so that's what it displays. The intermediate values of 10 and 15 were already discarded, because they are in the past.

It is guaranteed that you will always be given the current value of the Signal, but you cannot rely on being given intermediate values. If you need all of the values (including intermediate values) then you must use a Stream, not a Signal. An mpsc is the standard way to create a Stream (I personally use unbounded in my own code).

Unlike a Signal, a Stream will store all intermediate values, and it will store them in the correct order. So that means a Stream will always give you the same values regardless of the timing or scheduling. This is perfect for dealing with events, where you don't want to miss any events.

Qix- commented 1 year ago

Thanks for the information :) Really appreciated, answers my question!