statelyai / xstate

Actor-based state management & orchestration for complex app logic.
https://stately.ai/docs
MIT License

[Feature] Spawning Actors #428

Closed davidkpiano closed 5 years ago

davidkpiano commented 5 years ago

ℹ️ UPDATE

Actors are now in version 4.6. Please see the documentation on actors 📖 to see the updated API (slightly different from this proposal).

Bug or feature request?

Feature

Description:

Statecharts describe the behavior of an individual service (or "Actor"). This means that if developers want to manage multiple invoked statechart services (Actors), they either have to:

The React + XState TodoMVC example highlights this. Although the code is idiomatic React code, the child todoMachine Actors are contained in each <Todo> component, and the parent todosMachine Actor is not aware of the existence of those Actors. It only responds to events from those Actors and keeps track of their state through its context.

This is fine, but not ideal for:

To remedy this (and better mirror actual Actor-model languages like Erlang and Akka), @johnyanarella and I propose this (tentative) API:

(Feature) Potential implementation:

This is an action creator that instantiates an Actor (essentially the same as an invoked service 📖) and keeps an internal reference to it in the service:

import { Machine, spawn } from 'xstate';

const createTodoMachine = (id, message) => Machine({ id, ...todoMachine })
  .withContext({ message });

const todosMachine = Machine({
  id: 'todos',
  // ...
  on: {
    // Creates a new "todo" Actor with todoMachine behavior
    ADD_TODO: spawn((ctx, e) => createTodoMachine(e.id))
  }
});

onUpdate

This is a shorthand for something like Events.ActorTransition (up for bikeshedding) or "xstate.actorTransition" which is an event dispatched from the spawned Actors whenever their state transitions:

const todosMachine = Machine({
  id: 'todos',
  context: { todos: {} },
  // ...
  on: {
    // Creates a new "todo" Actor with todoMachine behavior
    ADD_TODO: spawn((ctx, e) => createTodoMachine(e.id, e.message))
  },
  onUpdate: {
    actions: assign({
      todos: (ctx, e) => ({ ...ctx.todos, [e.id]: e.state })
    })
  }
});

The event structure would look like:

{
  type: Events.ActorTransition, // special XState event
  id: 'todo-123', // ID of todo
  state: State({ ... }), // State instance from transition
}

state.children

This is a reference to all spawned child Actor instances. For example, if we spawned a "todo Actor", a state would look like:

todosMachine.transition('active', {
  type: 'ADD_TODO',
  id: 'todo-1',
  message: 'hello'
});
// => State {
//   value: 'active',
//   context: {
//     todos: {
//       'todo-1': State {
//         value: 'pending',
//         context: { message: 'hello' }
//         // ...
//       }
//     }
//   },
//   // ...
//   children: Map {
//     'todo-1': Interpreter { ... }
//     // ...
//   }
// }

JSON.stringify(...) will not display Interpreter instances.

  1. Invoking a proprietary "Supervisor" service for the lifetime of the machine
  2. spawn(...) is just an action object; i.e.:
    spawn('todo');
    // => {
    //   type: 'xstate.send',
    //   target: '__supervisor',
    //   event: {
    //     type: 'xstate.spawn',
    //     source: 'todo'
    //   }
    // }
  3. The "Supervisor" service will automatically subscribe (.onTransition(...)) to spawned machines, keep a reference of them, call sendParent(Events.ActorTransition, ...) on each change, and .stop() each individual child service when the parent service is stopped.
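The three points above can be sketched without XState at all. What follows is a minimal, hypothetical illustration in plain JavaScript: `createCounterActor`, `createSupervisor`, and the `'actor.transition'` event name are assumptions made for this sketch and are not XState APIs. It only shows the bookkeeping a supervisor would do: keep a reference to each child, forward every child transition to the parent, and stop all children when the parent stops.

```javascript
// Hypothetical sketch: none of these names are part of XState.

// A tiny actor: holds state and notifies subscribers on each change.
function createCounterActor(id) {
  let state = { count: 0 };
  let stopped = false;
  const listeners = new Set();
  return {
    id,
    send(event) {
      if (stopped) return; // a stopped actor ignores all events
      if (event.type === 'INC') {
        state = { count: state.count + 1 };
        listeners.forEach((listener) => listener(state));
      }
    },
    onTransition(listener) {
      listeners.add(listener);
    },
    stop() {
      stopped = true;
      listeners.clear();
    },
    get stopped() {
      return stopped;
    }
  };
}

// The supervisor keeps a reference to every child, forwards each child
// transition to the parent as an event, and stops all children with it.
function createSupervisor(sendToParent) {
  const children = new Map();
  return {
    children,
    spawn(actor) {
      children.set(actor.id, actor);
      actor.onTransition((state) =>
        sendToParent({ type: 'actor.transition', id: actor.id, state })
      );
      return actor;
    },
    stop() {
      children.forEach((child) => child.stop());
      children.clear();
    }
  };
}

const received = [];
const supervisor = createSupervisor((event) => received.push(event));
const todo = supervisor.spawn(createCounterActor('todo-1'));
todo.send({ type: 'INC' });
supervisor.stop();
todo.send({ type: 'INC' }); // ignored: the child was stopped with its parent
```

In this sketch the parent receives exactly one forwarded event, because the second `INC` arrives after the supervisor has stopped the child.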
marcelklehr commented 5 years ago

Your description succinctly captures the essence of the problem and the proposed API is beautifully simple and elegant. I love it :) I'm not sure if onUpdate should better be called onChildUpdate or similar, but I don't think this is too important. Is there any way to interact with a spawned actor from the parent machine in this model?

davidkpiano commented 5 years ago

I'm not sure if onUpdate should better be called onChildUpdate or similar, but I don't think this is too important.

Totally bikesheddable. I would love to choose a name that is unambiguous and clear. And it's just syntax sugar - you can always opt out of the sugar and use the (bikesheddable) event name directly:

on: {
  [actionTypes.childUpdate]: { ... }
}

Is there any way to interact with a spawned actor from the parent machine in this model?

Yes, that's definitely an important part of the proposal:

on: {
  UPDATE_TODO: {
    actions: send((ctx, e) => ({
      type: 'UPDATE',
      message: e.message
    }), { to: (ctx, e) => e.id })
  }
}

That brings up an important point about Actor IDs. According to SCXML (and the Actor model in general), invoked service IDs are randomly generated unless they're specified, so there might need to be a lookup mechanism... something like:

send(event, {
  to: (ctx, e, { children }) => findTodo(children, e.id)
});
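As a rough illustration of what such a lookup could do, here is a plain JavaScript sketch. It assumes, purely for illustration, that `children` is a `Map` keyed by generated actor ID and that each ref carries a `meta.todoId` field; neither assumption comes from the proposal, and `findTodo` itself is only a placeholder name in the snippet above.

```javascript
// Hypothetical sketch: children are keyed by a generated actor ID, so a
// domain ID (the todo's ID) has to be resolved to an actor ID first.
function findTodo(children, todoId) {
  for (const [actorId, ref] of children) {
    if (ref.meta && ref.meta.todoId === todoId) {
      return actorId;
    }
  }
  return undefined; // no child represents this todo
}

// Assumed shape: generated actor IDs mapped to refs with a meta.todoId.
const children = new Map([
  ['x:0', { meta: { todoId: 'todo-1' } }],
  ['x:1', { meta: { todoId: 'todo-2' } }]
]);

const target = findTodo(children, 'todo-2');
const missing = findTodo(children, 'todo-99');
```

Returning `undefined` for an unknown todo makes the "child does not exist" case explicit for the caller.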
flowt-au commented 5 years ago

Curious about the term "bikesheddable". I think I understand what you mean in the context, but what is the derivation, please? Cultural reference gap here. ;-)

davidkpiano commented 5 years ago

Sorry - "bikesheddable" (able to be bikeshed; see https://en.wiktionary.org/wiki/bikeshedding) indicates that we can change the name of something so that it's more agreeable to everyone, even though the naming change is trivial and plays no role in the functionality or implementation.

flowt-au commented 5 years ago

Thanks. Where does the term come from, though? It is not important; it is just that I am interested in metaphors and cross-cultural references.

coodoo commented 5 years ago

It's well explained here:

https://whatis.techtarget.com/definition/Parkinsons-law-of-triviality-bikeshedding

flowt-au commented 5 years ago

Thanks! I had never heard that term. :-)

LunarLanding commented 5 years ago

Coming from 424, this proposal does address the issues I found with keeping a dynamic list of submachines inside the xstate framework :)

In defense of having the children as argument to actions:

Say I need to have the parent "pick teams" for each child. It can do this by getting a list of the existing children (so the action needs the "children" argument made available), splitting them into teams, and sending an event with the respective choice to each child (by using the corresponding interpreter's send method).

About actors vs statecharts

At this point I think statecharts are useful for coding each actor and visualizing their states in a single view, but communication between actors starts going into the domain of actor diagrams, which ideally, in a distant future, would be part of xstate.

davidkpiano commented 5 years ago

Say I need to have the parent "pick teams" for each child

This would make a good example use case. Here's what I'm thinking:

const playerMachine = Machine({/* ... */});

const gameMachine = Machine({
  id: 'game',
  initial: 'registering',
  states: {
    registering: {
      on: {
        REGISTER: spawn(playerMachine, (_, e) => e.id),
        CLOSE_REGISTRATION: 'pickingTeams'
      }
    },
    pickingTeams: {
      onEntry: (ctx, e, { children }) => {
        children.forEach(child => child.send('ASSIGN_TEAM', { team: pickTeam(child.id) }));
      }
    }
  }
});
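The example leaves `pickTeam` undefined. As one possible reading, the following plain JavaScript sketch alternates children between teams in spawn order. `assignTeams` and `createPlayerStub` are hypothetical helpers invented here, and the stub only records events; it is not an XState interpreter.

```javascript
// Hypothetical sketch (not XState API): the parent walks its children
// and sends each one an ASSIGN_TEAM event, alternating between teams.
function assignTeams(children, teams) {
  let index = 0;
  children.forEach((child) => {
    child.send({ type: 'ASSIGN_TEAM', team: teams[index % teams.length] });
    index += 1;
  });
}

// Stand-in for a player actor: it just records the events it receives.
function createPlayerStub(id) {
  const received = [];
  return { id, received, send: (event) => received.push(event) };
}

// A Map preserves insertion order, so teams follow registration order.
const players = new Map(
  ['p1', 'p2', 'p3'].map((id) => [id, createPlayerStub(id)])
);
assignTeams(players, ['red', 'blue']);
```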
LunarLanding commented 5 years ago

Would child ids be local to the parent interpreter?

If a child wants to send an event/message to another child, would it be able to, assuming it has a sibling's id, send a message to it? This could be something like sendToSibling and would avoid having to wire up something like sending to the parent a send-to-child event that the parent would forward to the target child.

davidkpiano commented 5 years ago

Would child ids be local to the parent interpreter?

Theoretically yes. It would have the notion of a "reference" (or "address") which would be globally unique.

If a child wants to send an event/message to another child, would it be able to, assuming it has a sibling's id, send a message to it?

Nothing stopping you from doing so 👍. In the Actor model (Akka, Erlang, etc.) this would be done by the parent passing references of the child actors to each other. It's not recommended to communicate child <-> child because the other child might not exist, whereas the parent will always exist, so you'll be better able to handle situations where the child doesn't exist.

This could be something like sendToSibling and would avoid having to wire up something like sending to the parent a send-to-child event that the parent would forward to the target child.

Let's avoid magic for now 😅. Better to be explicit about this (partially for the above reasons).
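To make the "parent passes references" idea concrete, here is a small plain JavaScript sketch. The `createPlayer` factory and the `INTRODUCE`, `GREET_PEER`, and `HELLO` event names are all made up for this illustration; nothing here is XState API. A child can only message a sibling after the parent has handed it that sibling's reference, and it has to cope with not having one.

```javascript
// Hypothetical sketch of the parent introducing two children to each other.
function createPlayer(id) {
  const inbox = [];
  let peer = null;
  return {
    id,
    inbox,
    send(event) {
      if (event.type === 'INTRODUCE') {
        peer = event.ref; // remember the sibling's reference
      } else if (event.type === 'GREET_PEER') {
        // The peer may never have been introduced, so check first.
        if (peer) peer.send({ type: 'HELLO', from: id });
      } else {
        inbox.push(event);
      }
    }
  };
}

const alice = createPlayer('alice');
const bob = createPlayer('bob');

// The parent owns both refs and decides who may talk to whom.
alice.send({ type: 'INTRODUCE', ref: bob });
alice.send({ type: 'GREET_PEER' });
bob.send({ type: 'GREET_PEER' }); // bob knows no peer: nothing is sent
```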

davidkpiano commented 5 years ago

(copied from Spectrum)

If we allowed a tiny bit of magic, then we could allow developers full control of the spawned Actor refs in context:

on: {
  ADD_TODO: {
    actions: assign({
      todoRefs: (ctx, e) => {
        const newTodo = spawn(createTodo(e.data)); // creates new Actor ref
        return {
          ...ctx.todoRefs,
          [newTodo.id]: newTodo
        };
      }
    })
  },
  UPDATE_TODO: {
    // Send 'UPDATE' event to todo Actor ref
    actions: send((ctx, e) => ({ type: 'UPDATE', id: e.id }), { to: (_, e) => e.id })
  }
}

This means that instead of having to read from state.children, Actor refs would be entirely managed by the developer in state.context.

Also, developers would need to remember that they can't simply delete an Actor ref to "kill" it (it would still be alive); they'd need to send a STOP event to it:

on: {
  DELETE_TODO_PERMANENTLY: {
    // Effectively calls .stop() on child Todo Actor
    actions: stop((_, e) => e.id)
  }
}
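The difference between dropping a ref and stopping an Actor can be shown in a few lines of plain JavaScript. `createTicker` is a hypothetical stand-in for a running Actor, written only for this sketch.

```javascript
// Hypothetical sketch: removing a ref from a map does not stop the actor.
function createTicker() {
  let running = true;
  let ticks = 0;
  return {
    tick() {
      if (running) ticks += 1;
    },
    stop() {
      running = false;
    },
    get ticks() {
      return ticks;
    }
  };
}

const todoRefs = { 'todo-1': createTicker() };
const leaked = todoRefs['todo-1'];

delete todoRefs['todo-1']; // the entry is gone...
leaked.tick(); // ...but the actor still reacts to input
const ticksAfterDelete = leaked.ticks;

leaked.stop(); // an explicit stop is what actually ends it
leaked.tick();
const ticksAfterStop = leaked.ticks;
```

Anything else still holding the reference (a timer, a subscription, another Actor) keeps the deleted Actor alive, which is why an explicit stop is needed.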

The refs would be readable from state.context:

state.context.todoRefs;
// => {
//   'todo-1': Actor { ... },
//   'todo-2': Actor { ... },
//   // ...
// }

// send event externally to a Todo
// NOTE: see note below (no transition in parent component triggered)
state.context.todoRefs['todo-1'].send('TOGGLE_COMPLETE');

Note: this will NOT trigger a state transition in the parent component unless onUpdate explicitly listens for Actor changes:

onUpdate: {
  actions: assign({
    todos: (ctx, e) => {
      // e.state represents the state of the invoked Actor
      return { ...ctx.todos, [e.id]: e.state.context };
    }
  })
}

This is a feature not a bug. Just like if you were managing a chat server, you don't want all of the state of every single child connection to the chat server constantly represented on the parent server (unless you like being DDOS-ed). Instead, you would either:

This makes performance much better, as exemplified in the React TodoMVC example with this proposal:

const Todos = () => {
  const [current, send] = useMachine(todosMachine);

  const { todoRefs } = current.context;

  return (
    <ul>
      {Object.values(todoRefs).map(todoRef =>
        <Todo key={todoRef.id} todo={todoRef} />
      )}
    </ul>
  );
}

// ...

const Todo = ({ todo }) => {
  // todo is a live Actor (service)
  const [current, send] = useService(todo);

  return (<li>
    <span>{current.context.message}</span>
    <button onClick={_ => send('COMPLETE')}>Complete</button>
  </li>);
}

The <Todos /> component will not rerender whenever any todo updates, which is a very good thing.

Edit: To ensure determinism, the state of the Actor refs will not be able to be directly read (just like normal Actors). Instead, .onTransition(...) (or onUpdate: { ... }) needs to be used. This will prevent antipatterns like this:

// ❌ non-deterministic
{
  target: 'allComplete',
  cond: ctx => {
    return Object.values(ctx.todoRefs).every(todoRef => todoRef.state.matches('completed'))
  }
}

// ✅ deterministic - explicit state management!
onUpdate: {
  actions: assign({
    todos: (ctx, e) => {
      // e.state represents the state of the invoked Actor
      return { ...ctx.todos, [e.id]: e.state }
    }
  })
},

// ...
{
  target: 'allComplete',
  cond: ctx => {
    // ctx.todos maps each todo ID to the last State received via onUpdate
    return Object.values(ctx.todos).every(todoState => todoState.matches('completed'))
  }
}
marcelklehr commented 5 years ago

Somehow, I find this last proposal involves too much manual management for simple cases. What are the benefits of managing the children ourselves?

The component will not rerender whenever any todo updates, which is a very good thing.

I assume this is not the case with the original proposal, as the child states are part of the parent state? Still, with the last example of rescuing the state into the parent context onUpdate, wouldn't this re-render the parent component, again? I realize that this is optional, with the latest approach.

What if we give the user a choice which kind of actor management they want, effectively turning this into two features?

johnyanarella commented 5 years ago

I think I missed some context here (no pun intended). What is gained by moving the refs into the context? Is it just to allow for more convenient structuring of related spawned children refs for later consumption?

Could that instead be addressed by allowing a "group" naming strategy - as another alternative to an auto id, explicit name or prefix - for spawn() where the new ref was stored into a correspondingly named map in state.children?

spawn(createTodo(e.data), (_, e) => ({ id: e.id, group: 'todos' }))

// ...

state.children.todos['todo-1'].send('TOGGLE_COMPLETE');

Currently thinking there are better opportunities for tooling if the references are managed in a dedicated location (state.children) with consistent rules.

From an earlier discussion:

you can think of spawn as sending an event to a permanently invoked "supervisor" service

Would the parent machine still play a supervisory role (or have an internal supervisor service for these spawned actors)? Are those references moved exclusively to arbitrary representations in the context (ex. a key of some arbitrary map as above) or would they also still be internally tracked by a supervisor?

Is that parent machine (or its internal supervisor) responsible for automatically stopping spawned children when it is stopped (beyond the cases where you manually stop() as an action)? Or does moving the refs into the context make that the responsibility of the developer?

davidkpiano commented 5 years ago

I assume this is not the case with the original proposal, as the child states are part of the parent state?

This is the case with the original proposal. All updates are done via explicit onUpdate.

Still, with the last example of rescuing the state into the parent context onUpdate, wouldn't this re-render the parent component, again? I realize that this is optional, with the latest approach.

Exactly. But you have more granular control over when those updates happen (if they should happen... e.g., if you want to add debouncing, throttling, batching, etc.)
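As one example of that kind of control, here is a plain JavaScript sketch of batching. `createUpdateBatcher` and the `'CHILDREN.UPDATED'` event type are hypothetical names chosen for this illustration, and the flush is triggered manually rather than by a timer to keep the sketch synchronous.

```javascript
// Hypothetical sketch: collect child updates and hand them to the
// parent in one batch instead of one event per child transition.
function createUpdateBatcher(sendToParent) {
  const latest = new Map(); // child id -> most recent state
  return {
    push(id, state) {
      latest.set(id, state); // later updates overwrite earlier ones
    },
    flush() {
      if (latest.size === 0) return; // nothing to report
      sendToParent({
        type: 'CHILDREN.UPDATED',
        states: Object.fromEntries(latest)
      });
      latest.clear();
    }
  };
}

const parentEvents = [];
const batcher = createUpdateBatcher((event) => parentEvents.push(event));
batcher.push('todo-1', 'pending');
batcher.push('todo-2', 'pending');
batcher.push('todo-1', 'completed'); // replaces the earlier todo-1 update
batcher.flush();
batcher.flush(); // nothing new: no event is sent
```

Three child updates reach the parent as a single event, so the parent transitions (and any parent component re-renders) once instead of three times.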

What if we give the user a choice which kind of actor management they want, effectively turning this into two features?

Hmm... definitely don't want two separate features, but I believe that the two proposals can coexist as one:

// Only adds todo ref in state.children
on: {
  ADD_TODO: spawn(createTodo, (_, e) => e.id)
}

// Adds todo ref in state.children, as well as in the context
on: {
  ADD_TODO: {
    actions: assign({
      todoRefs: (ctx, e) => ({
        ...ctx.todoRefs,
        [e.id]: spawn(createTodo, e.id)
      })
    })
  }
}

What are the benefits of managing the children ourselves?

In Akka, Erlang, etc., Actor refs are values just like anything else. You have full control over where you store them in the parent Actor's (a.k.a. process in Erlang) context. It also makes patterns like "send this event to all Actors of this type" much easier:

// ✅
on: {
  DELETE_ALL_TODOS: (ctx) => {
    Object.values(ctx.todoRefs).forEach(todoRef => todoRef.send('DELETE'));
  }
}

Versus having a special children property:

// 😕
on: {
  DELETE_ALL_TODOS: (_, __, { children }) => {
    [...children.values()].forEach(ref => {
      if (ref.type === 'todo') {
        ref.send('DELETE')
      }
    })
  }
}
johnyanarella commented 5 years ago

Ah! So, state.children remains (with actor state still isolated). The ref is just a variable value with a send() method, available as a building block.

👍

coodoo commented 5 years ago

Is there a way to handle onDone on the parent machine that spawned the actor? Something like this:

invoke: {
  id: 'secret',
  src: secretMachine,
  // Currently only onUpdate works like this
  onDone: {
    target: ...,
    actions: ...
  }
}

It seems the only way to handle onDone is to hook directly onto the interpreter, like this:

const service = interpret(todosMachine)
      .onDone(() => {
        done();
      })
      .start();
davidkpiano commented 5 years ago

Actors can be anything that can send/receive messages, not necessarily just machines. An Actor from a state machine signaling that it's "done" is just it sending a special "done" event, so it's better to handle an actor being "done" with an event. You can even put it in the final state:

success: {
  type: 'final',
  onEntry: sendParent('allDone')
}
ZempTime commented 5 years ago

Unsure if this would be helpful at this point in the discussion, but some Google engineers used xstate to implement an actor-model-based UI:

coodoo commented 5 years ago

@dakom Yep, I get that, but I'm just trying to reproduce the test case here. When I was monitoring the events, I noticed this event coming in when the actor reached its final state:

{type: "done.invoke.WorkerMachine", data: undefined, toString: ƒ}

But for some reason I can't seem to handle that done.invoke.WorkerMachine event anywhere. I tried specifying onDone on the parent machine and onDone on the interpreter (just like your test case), to no avail, hence I'm wondering what the correct approach is to capture that event.

ZempTime commented 5 years ago

Bikeshedding: I'm gonna throw my vote in behind onChildUpdate.

An alternative idea is handleUpdate or handleChildUpdate, but I don't like these:

Storing refs in context: Strongly agree! 👍

My Main Concern: xstate is a truly wonderful way to model the behavior of UI components. This translates really naturally to using them everywhere, across a lot of components, fairly quickly. With spawning actors, large agglomerations of services will begin reflecting the (sometimes complicated) applications which use them. There's a tradeoff which happens when you refactor smaller services out of larger ones. The overhead of tracing an individual message path increases. Instead of a single breakpoint, it's maybe a couple breakpoints, and maybe some good old guess & check (depending on asynchronous happenings and how far the effects of a message get distributed).

Questions/Ideas/Suggestions w/r/t improving traceability:

I think I'm tracking you on the rest of the proposal, and it looks great. 💯

davidkpiano commented 5 years ago

Thanks for the input, @ZempTime!

I'm gonna throw my vote in behind onChildUpdate.

After doing a lot of research on actor-based systems (still more to be done!), I strongly believe onUpdate, onChildUpdate, etc. are footguns. The reason is that they rapidly increase the throughput of events for parent actors (every single state transition on every single child actor is now a sent event), and they go against explicitly modeling the flow of events. This will definitely have a negative effect on performance, analysis, and logging.

The recommended approach would be that child actors should explicitly sendParent(...) events that the parent should care about. It's a natural approach that better models how actor systems behave:

// ❌ This is a footgun
{
  onUpdate: {
    // what kind of spawned actor did the update come from?
    // what does the updated state mean?
    // should the parent even care about this update?
  }
}

// ✅ Recommended approach
const childMachine = Machine({
  id: 'child',
  initial: 'active',
  states: {
    active: {
      on: {
        CHANGE: {
          actions: sendParent((ctx, e) => ({
            type: 'CHILD.CHANGED',
            data: ctx
          }))
        }
      }
    }
  }
});

// In parent...
{
  on: {
    'CHILD.CHANGED': { ... }
  }
}
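The effect on event throughput can be seen in a plain JavaScript sketch. `createChild` and the `TYPE` event are hypothetical and exist only for this illustration; the `sendParent` parameter here is an ordinary callback, not XState's `sendParent` action creator. The child changes state three times but reports to the parent only once.

```javascript
// Hypothetical sketch: the child reports only what the parent cares about.
function createChild(sendParent) {
  let context = { message: '', keystrokes: 0 };
  return {
    send(event) {
      if (event.type === 'TYPE') {
        // Internal detail: changes state but is not reported upward.
        context = { ...context, keystrokes: context.keystrokes + 1 };
      } else if (event.type === 'CHANGE') {
        context = { ...context, message: event.message };
        sendParent({ type: 'CHILD.CHANGED', data: context });
      }
    }
  };
}

const parentInbox = [];
const child = createChild((event) => parentInbox.push(event));
child.send({ type: 'TYPE' });
child.send({ type: 'TYPE' });
child.send({ type: 'CHANGE', message: 'hello' });
```

With a blanket update subscription the parent would have received three events here; with explicit reporting it receives one, and that event has a name that says what happened.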

Would it be possible to opt-in to trace ancestry as the consequences of an event travel through a given set of services?

Yes! Events will carry metadata (per SCXML), such as invokeid and potentially other data like timestamps. I'm still working out the backwards-compatible API for this.

For example, if a 'PING' event originated from a parent 'pinger' machine, and a 'PONG' event was subsequently received from a 'ponger' machine, an event log might look something like this:

type: 'PING'
data: ...
invokeid: 'pinger-13844'
recipientid: 'ponger-10230'
timestamp: ...
---
type: 'PONG'
data: ...
invokeid: 'ponger-10230'
recipientid: 'pinger-13844'
timestamp: ...
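A log of that shape could be produced by routing every send through one function. The plain JavaScript sketch below is hypothetical: `createLoggedChannel`, `deliver`, and the `receive` method are names invented here, the field names simply mirror the log above, and a counter stands in for a real clock so the output is predictable.

```javascript
// Hypothetical sketch: every send goes through one channel that records
// who sent what to whom before delivering it.
function createLoggedChannel(log, now) {
  return function deliver(event, originId, recipient) {
    log.push({
      type: event.type,
      data: event.data,
      invokeid: originId,
      recipientid: recipient.id,
      timestamp: now()
    });
    recipient.receive(event, originId);
  };
}

const log = [];
let clock = 0;
const deliver = createLoggedChannel(log, () => (clock += 1));

const pinger = { id: 'pinger-13844', receive() {} };
const ponger = {
  id: 'ponger-10230',
  receive(event) {
    // Reply to every PING with a PONG sent back to the pinger.
    if (event.type === 'PING') deliver({ type: 'PONG' }, this.id, pinger);
  }
};

deliver({ type: 'PING' }, pinger.id, ponger);
```

Because the log entry is written before delivery, the entries appear in causal order: the `PING` first, then the `PONG` it triggered.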

I know viz is in the works, is there any kind of browser extension or other tooling which might alleviate this? (Or... vscode!??)

Working on it!

ZempTime commented 5 years ago

I understand the hesitation with onUpdate/onChildUpdate - each event spawning its own chain of events is a recipe to destroy all the scalability you're supposed to get.

As long as I can send context in the event from my spawned actors to my parents in sendParent, and don't have to manually track all references, I'm quite happy. :)

hnordt commented 5 years ago

My feedback:

💯💯💯

💯💯💯

Simple and idiomatic API. 👍

pedronauck commented 5 years ago

This is awesome and really useful, and I think the API is clean and easy to understand. I have a question, though: what happens if I want the same child instance working with two parents, or with more than one machine? I don't know if this is a real use case, but would it be possible? Does it make sense? 🤔

I'm not an expert in the Actor model, so although a solution like that would be easy to manage, it may hurt the concept. As I said, I don't know much about it; it's just an idea.

This would be nice if I wanted to make some kind of "broadcast" between machines, and since each one has its own ID, it may be possible. The problem, of course, is that something internal would need to orchestrate these messages.

I made an experiment using a kind of "broadcast". It's very coupled to React, but it's an example: https://codesandbox.io/s/xstate-broadcast-lg6xw

davidkpiano commented 5 years ago

...if I want the same child instance working with two parents or more than one machine?

It's best to have a system of Actors as a hierarchical structure. Your parent todosMachine can have one notificationMachine child Actor and many (zero or more) todoMachine children:

[Image: Screen Shot 2019-05-24 at 5 53 13 AM]

That would look like this in code:

const todosMachine = Machine({
  id: 'todos',
  context: {
    notificationRef: undefined, // don't create yet
    todos: []
  },
  initial: 'initialize',
  states: {
    initialize: {
      entry: assign({
        notificationRef: () => spawn(notificationMachine, 'notifier')
      }),
      // ...
    },
  },
  on: {
    UPDATE_TODO: {
      actions: send('NOTIFY_TODO_UPDATE', { to: 'notifier' })
    }
  }
});

You can even make a forwardTo helper function:

function forwardTo(id) {
  return send((_, e) => e, { to: id });
}

on: {
  UPDATE_TODO: {
    actions: forwardTo('notifier')
  }
}
davidkpiano commented 5 years ago

Actors are now in version 4.6 🎉

Please see the docs 📖 as the API has changed from the original proposal (it's simpler).