slawlor / ractor

Rust actor framework
MIT License
1.51k stars 72 forks source link

Typed errors and events #60

Closed tbillington closed 1 year ago

tbillington commented 1 year ago

Have you looked into the possibility of more strongly typed SupervisionEvents and ActorProcessingErrs ?

For example, a supervising actor could receive a ActorTerminated with a typed response as opposed to Option<String>. That way actors can clearly signal what happened in a way more idiomatic, akin to rusts Result handling.

It feels like the exactly erlang model of actors panicing and leaving it to the supervisor to handle does not map 1:1. "Most" rust code will use ? and From some error types into others to surface errors to callers as opposed to the throw exception model. So a library/developer experience focused on panic handling as the primary error communication method doesn't allow rust to work to it's fullest. Perhaps it's me attempting to fit actors into my mental model though, and I should rethink.

My understanding is there's no type level link between two actors, one of which is supervising the other, so implementing this currently would be interesting, though the conversion between ActorCell and ActorRef feels like it could be related.

Just spit balling, but hopefully communicates the idea

use ractor::{Actor, ActorRef};

enum ChildError {
    DatabaseError(String),
    ProcessingError(String),
}

struct Parent;

#[async_trait::async_trait]
impl Actor for Parent {
    type Msg = ();
    type State = ();
    type Arguments = ();

    async fn pre_start(
        &self,
        myself: ractor::ActorRef<Self>,
        _: Self::Arguments,
    ) -> Result<Self::State, ractor::ActorProcessingErr> {
        Actor::spawn_linked(None, Child, (), myself.get_cell()).await?;
        Ok(())
    }
}

impl Supervisor<Child> for Parent {
    type StopReason = ChildError;

    async fn handle_supervisor_evt(
        &self,
        _this_actor: ActorRef<Self>,
        message: SupervisionEvent<Self::StopReason>,
        _state: &mut Self::State,
    ) -> Result<(), ActorProcessingErr> {
        match message {
            SupervisionEvent::ActorTerminated(_, _, reason) => {
                match reason {
                    ChildError::DatabaseError(msg) => {
                        warn!("child experience database error");
                        // restart
                    }
                    ChildError::ProcessingError(msg) => {
                        error!("child experiences processing error");
                        // don't restart
                    }
                }
            }
            _ => {
                // usual Supervision events
            }
        }
    }
}

struct Child;

#[async_trait::async_trait]
impl Actor for Child {
    type Msg = ();
    type State = ();
    type Arguments = ();

    async fn pre_start(
        &self,
        _myself: ractor::ActorRef<Self>,
        _args: Self::Arguments,
    ) -> Result<Self::State, ractor::ActorProcessingErr> {
        Ok(())
    }

    async fn handle(
        &self,
        myself: ActorRef<Self>,
        _message: Self::Msg,
        state: &mut Self::State,
    ) -> Result<(), ractor::ActorProcessingErr> {
        // alternative to manually calling stop would be returning Result<(), ChildError> and implementing From<DbError> and From<ProcessingError> so we could just `.await?` instead
        if let Err(msg) = database_work(state).await {
            myself.stop(ChildError::DatabaseError(msg));
            return Ok(());
        }

        if let Err(msg) = processing_work(state).await {
            myself.stop(ChildError::ProcessingError(msg));
        }

        Ok(())
    }
}
slawlor commented 1 year ago

Yeah I originally thought of how might one separate the supervision aspect from the core actor model, but I haven't found a great way to handle it as of yet. I'm not opposed to a breaking change to this effect, but it will require a decent amount of work, namely

  1. If the actor doesn't implement the Supervisor trait, we can't spawn_link. Or we enable the unstable specialization feature and have to use nightly rust, then all actors implement a default version which kills and bubbles-up, and can override it with more specific behaviour.
  2. Or maybe we don't do (1), but add a strong-type onto the SupervisionEvent with like you suggest, an error type which them means we either have to (a) Box it into an Any in order to use it inside of ActorCell like we do with messages or (b) add a strong type everywhere and somehow remove supervision ports from the ActorCell? I feel like (a) is the only right option down this path if we want to go this route.

In general, I agree it would be great to get more specific information out of the error bubbled to the supervisor, however at least in Erlang, there's very limited handling that don't assume any knowledge of the underlying error event so in that spirit we're compatible. At least in this way the supervisor can get a string message from it. Error handling when tearing down an actor, while cheap, isn't free. We want actor's to be generally long-lived, so for example handling a request that's fallible should have that logic handled by the actor (i.e. all results are handled and a reply sent if necessary rather than the actor dying and being restarted).

Restarts are really for

  1. Whoops we forgot to handle this error kind
  2. Really if a piece of code fails, we can't recover and maybe have bad state so we need to start fresh

Handling "clean" errors with proper types that are expected should be done in RPC reply types imo, or explicitly killing the actor knowing the supervision strategy. All that being said, I'd happily accept any POC that finds a better way! Just want to say "we've thought about this yes" lol