slawlor / ractor

Rust actor framework
MIT License

bug in monte_carlo example #162

Closed alex-romenskyi closed 1 year ago

alex-romenskyi commented 1 year ago

Describe the bug: The monte_carlo example panics with an error like this:

    Finished dev [unoptimized + debuginfo] target(s) in 0.22s
     Running `target/debug/examples/monte_carlo`
thread 'tokio-runtime-worker' panicked at 'Failed to send message: Messaging(SendErr)', ractor/examples/monte_carlo.rs:105:14
thread 'tokio-runtime-worker' panicked at 'Failed to send message: Messaging(SendErr)', ractor/examples/monte_carlo.rs:105:14
thread 'thread 'tokio-runtime-workertokio-runtime-worker' panicked at '' panicked at 'Failed to send message: Messaging(SendErr)Failed to send message: Messaging(SendErr)', ', ractor/examples/monte_carlo.rsractor/examples/monte_carlo.rs::105105::1414

It seems that when one child actor finishes, the parent actor stops as well, but several child actors are still running and try to send messages to the parent, which is no longer there.
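To make the suspected sequence concrete, here is a minimal sketch of it. It does not use ractor or tokio: `std::sync::mpsc` stands in for the actor mailbox, and the function name `send_after_parent_stops` is made up for illustration. The "parent" handles the first result and then drops its receiver, so every later send from a "child" fails.

```rust
use std::sync::mpsc;

/// Hypothetical stand-in for the example's parent/child actors.
/// Each "child" holds a sender to the "parent". The parent handles the
/// first result and then stops (drops its receiver).
/// Returns (successful sends, failed sends).
fn send_after_parent_stops(children: usize) -> (usize, usize) {
    let (tx, rx) = mpsc::channel::<usize>();
    let senders: Vec<_> = (0..children).map(|_| tx.clone()).collect();
    drop(tx); // only the children keep senders

    let (mut ok, mut failed) = (0, 0);
    let mut remaining = senders.into_iter().enumerate();

    // The first child reports its result while the parent is still alive.
    if let Some((id, sender)) = remaining.next() {
        if sender.send(id).is_ok() {
            ok += 1;
        }
    }

    // The parent handles that one result and then stops.
    let _ = rx.recv();
    drop(rx);

    // The other children finish later and find the mailbox closed.
    for (id, sender) in remaining {
        match sender.send(id) {
            Ok(()) => ok += 1,
            // Calling `.expect("Failed to send message")` here would panic.
            Err(_) => failed += 1,
        }
    }
    (ok, failed)
}

fn main() {
    let (ok, failed) = send_after_parent_stops(4);
    assert_eq!((ok, failed), (1, 3));
    println!("ok = {ok}, failed = {failed}");
}
```

In this sketch the late senders get an error value back instead of panicking. Whether the example should ignore the error or keep the parent alive until all children are done is a separate design choice.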

To Reproduce: Steps to reproduce the behavior:

  1. Downloaded the repo and ran:

alexromensky@Alexs-MacBook-Pro ractor % cargo run --example monte_carlo

Expected behavior: The example should run without errors.

Additional context: macOS 13.5.2, rustc 1.72.0 (5680fa18f 2023-08-23)

nolmelab commented 1 year ago

I had exactly the same problem on a Windows 10 machine with a 6-core Intel i5-9500. Some more facts are:

I suspect there is a race condition somewhere.

After running it under a debugger, I found that UnboundedSender::send() fails at inc_num_messages():

    pub fn send(&self, message: T) -> Result<(), SendError<T>> {
        if !self.inc_num_messages() {
            return Err(SendError(message));
        }

        self.chan.send(message);
        Ok(())
    }

The only case in which inc_num_messages() returns false is when curr & 1 is 1. At first I did not understand this check: it seemed it should never trigger, since compare_exchange() in inc_num_messages() always adds 2 to the current value and so never sets the lowest bit.

mpsc/chan.rs contains Semaphore impl for bounded and unbounded Semaphores, and unbounded::Semaphore uses 1 to check a closed channel.

impl Semaphore for unbounded::Semaphore {
    // ...

    fn close(&self) {
        self.0.fetch_or(1, Release);
    }

    fn is_closed(&self) -> bool {
        self.0.load(Acquire) & 1 == 1
    }
}

So curr & 1 == 1 means the same thing as is_closed(). Hence, the error comes from calling cast! on an UnboundedSender whose channel has already been closed.
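To illustrate the encoding, here is a small self-contained sketch of a counter that keeps a "closed" flag in bit 0 and the message count in the remaining bits. It is a simplified model written for this comment, not tokio's actual code: the type name `ClosedFlagCounter` and the `num_messages()` helper are made up, and the overflow handling a real implementation needs is left out.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Simplified model (an assumption, not tokio's real type):
/// bit 0 is the "closed" flag, the upper bits count messages,
/// which is why each message adds 2 rather than 1.
pub struct ClosedFlagCounter(AtomicUsize);

impl ClosedFlagCounter {
    pub fn new() -> Self {
        ClosedFlagCounter(AtomicUsize::new(0))
    }

    /// Marks the channel as closed by setting bit 0.
    pub fn close(&self) {
        self.0.fetch_or(1, Ordering::Release);
    }

    pub fn is_closed(&self) -> bool {
        self.0.load(Ordering::Acquire) & 1 == 1
    }

    /// Registers one message. Returns false if the closed bit is set.
    pub fn inc_num_messages(&self) -> bool {
        let mut curr = self.0.load(Ordering::Acquire);
        loop {
            if curr & 1 == 1 {
                return false; // closed: the send must fail
            }
            // Add 2 so the closed bit is left untouched.
            match self.0.compare_exchange(
                curr,
                curr + 2,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => return true,
                Err(actual) => curr = actual, // raced with another thread, retry
            }
        }
    }

    /// Hypothetical helper: the message count without the flag bit.
    pub fn num_messages(&self) -> usize {
        self.0.load(Ordering::Acquire) >> 1
    }
}

fn main() {
    let counter = ClosedFlagCounter::new();
    assert!(counter.inc_num_messages());
    assert!(counter.inc_num_messages());
    assert_eq!(counter.num_messages(), 2);

    counter.close();
    assert!(counter.is_closed());
    assert!(!counter.inc_num_messages()); // rejected after close
    assert_eq!(counter.num_messages(), 2); // count is unchanged
}
```

In this model the only way the lowest bit becomes 1 is a call to close(), which matches the observation that send() starts failing once the receiving side has gone away.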

Hope this helps.

slawlor commented 1 year ago

Thanks for reporting. It is a small edit, which is described in #163. This should be fixed once that is merged.