kquick / Thespian

Python Actor concurrency library
MIT License

Add ability to initialize actor object with values before starting #64

Closed sakian-old closed 4 years ago

sakian-old commented 4 years ago

I don't believe this is possible currently, but I'm wondering if it would be worthwhile to allow actors to be initialized with values before launching them.

If I want to launch an actor with some configuration I currently need to do this:

    from thespian.actors import Actor, ActorSystem
    from thespian.initmsgs import initializing_messages

    @initializing_messages([('config_string', str)], initdone='init_done')
    class TestActor(Actor):
        def init_done(self):
            print(self.config_string)

    address = ActorSystem().createActor(TestActor)
    ActorSystem().tell(address, "Testing!")

I would like to do something more like this:

    class TestActor(Actor):
        def __init__(self, config_string):
            print(config_string)

    actor = TestActor("Testing!")
    address = ActorSystem().createActor(actor)

This seems to me a cleaner way of initializing when the launching system wants to configure the actor, but maybe there's already a way to do this? (The code above fails with TypeError: 'TestActor' object is not callable.)

kquick commented 4 years ago

Initializing in this manner is only possible if the actor is instantiated in the context of the current process and therefore the data passed to the actor constructor is resident in the memory space where the actor is created.

For Thespian, this is only true for the simpleSystemBase. For all other bases (multiproc....), the target actor is constructed in a different process, thus any arguments passed to the constructor in the latter method would need to be serialized, transferred to the target process, and then reconstituted there. Since this functionality is already provided by the message delivery system in Thespian, it is better (and easier) to utilize that method for initialization (this also conforms better to the Actor model).
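
To make the serialization point concrete, here is a minimal sketch using only the Python standard library. It does not use Thespian or reproduce its transport; `ship_to_other_process` and the `device_lock` entry are hypothetical stand-ins. It assumes a process boundary behaves like a pickle round trip: plain configuration data arrives as an equal copy, while a live resource such as a driver handle cannot cross at all.

```python
import pickle
import threading


def ship_to_other_process(obj):
    """Model a process boundary: only bytes cross it, never live objects."""
    wire_bytes = pickle.dumps(obj)   # serialize on the sending side
    return pickle.loads(wire_bytes)  # reconstitute on the receiving side


# Plain configuration data survives the trip as an equal but distinct copy.
config = {"mode": "Simulation", "poll_seconds": 0.5}
received = ship_to_other_process(config)
copy_is_equal = received == config
copy_is_same_object = received is config

# A live resource (a lock standing in for a driver handle) does not survive.
live_driver = {"mode": "RealWorld", "device_lock": threading.Lock()}
try:
    ship_to_other_process(live_driver)
    live_driver_shipped = True
except TypeError:
    live_driver_shipped = False

print(copy_is_equal, copy_is_same_object, live_driver_shipped)
```

This is why sending a description of the configuration (a string, a small data object) as a message is usually more practical than handing over a pre-built object.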

Also please be aware that in multiproc... Thespian bases, the Actor creation is asynchronous. The address returned from the createActor() call in your first code segment may reference an Actor that is not running yet at the time that the createActor() call returns. The subsequent tell() will buffer the message until the target Actor has been created and the message can be delivered. This is handled by Thespian itself so nothing in your first example is incorrect, but it is something to keep in mind when working in a distributed concurrency model like the Actor model: if you needed to confirm that the TestActor had run and processed your message, you would probably want to use ask() instead of tell() and have TestActor send a completion confirmation message back to the sender.
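
The difference between fire-and-forget delivery and waiting for a confirmation can be modeled with a small standard-library sketch. This is a toy and not Thespian's implementation: `ModelActor` and the module-level `tell` and `ask` helpers are hypothetical names that only mirror the pattern described above, with a thread and a queue standing in for an actor and its mailbox.

```python
import queue
import threading


class ModelActor(threading.Thread):
    """Toy stand-in for an actor: a thread draining its own mailbox."""

    def __init__(self):
        super().__init__(daemon=True)
        self.mailbox = queue.Queue()
        self.processed = []

    def run(self):
        while True:
            message, reply_to = self.mailbox.get()
            if message is None:          # shutdown sentinel
                break
            self.processed.append(message)
            if reply_to is not None:     # sender asked for a confirmation
                reply_to.put(("done", message))


def tell(actor, message):
    """Fire and forget: returns before the actor has handled the message."""
    actor.mailbox.put((message, None))


def ask(actor, message, timeout):
    """Send, then block until the actor replies or the timeout expires."""
    reply_box = queue.Queue()
    actor.mailbox.put((message, reply_box))
    try:
        return reply_box.get(timeout=timeout)
    except queue.Empty:
        return None


actor = ModelActor()
tell(actor, "buffered before start")   # queued; the actor is not running yet
actor.start()
confirmation = ask(actor, "Testing!", timeout=5)
tell(actor, None)                      # ask the toy actor to stop
actor.join(timeout=5)
print(confirmation, actor.processed)
```

The first message is accepted before the actor thread is running and is handled once it starts, while only the `ask` call tells the sender that processing actually happened.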

sakian-old commented 4 years ago

Ok, that makes a lot of sense, thanks for the reply!

sakian-old commented 4 years ago

Just wanted to mention a scenario that I ran into today that is connected to my question here. I'm trying to set up a system of actors using dependency injection. For instance, I have three actors:

  1. A top-level controller actor (controller_actor)
  2. A temperature reading actor (temperature_actor)
  3. A fan controlling actor (fan_actor)

I would like the controller_actor to be able to launch the other two actors and be responsible for passing the temperature readings from the temperature_actor to the fan_actor.

Both the temperature_actor and fan_actor contain driver objects to do their work, but I want to be able to change which implementation of these drivers is used (i.e. simulation vs real-world). In other actor systems I have used, I would create the actor objects in my application launcher, pass them their driver implementation, and then pass the actor objects to the controller actor for launching. Doing it this way allows me to construct an actor architecture within the application launcher without the actors needing to know the implementation details of other actors (i.e. controller_actor only knows it has an actor that feeds it temperature messages, which it must pass to a fan actor).

Using Thespian I can't quite do this. In my example, I would need to create a chain of initialization messages, where I send the driver implementation details first to the controller_actor, which would then forward them on to the proper actors. If I had a deeper hierarchy this method wouldn't be practical.

Another possible solution (that I'll probably go with) is to have the application launcher start and initialize all the actors, and then pass the actor addresses to the proper parent actors to build up the hierarchy that way. This is also not ideal since each parent will then need to send an additional initialization message to its children to provide them the parent's address (or maybe I can do this in the launcher as well).

Anyway, thought I'd put this here just to see if there's another way of thinking about this that I'm missing. Thanks for the great module though!

kquick commented 4 years ago

> Doing it this way allows me to construct an actor architecture within the application launcher without the actors needing to know the implementation details of other actors (i.e. controller_actor only knows it has an actor that feeds it temperature messages, which it must pass to a fan actor).

From a high-level perspective, this is a very appropriate design approach: each Actor should only know of other Actors as agents which receive and optionally respond to messages and the internal implementation of the second Actor should be opaque to the first.

> Another possible solution (that I'll probably go with) is to have the application launcher start and initialize all the actors, and then pass the actor addresses to the proper parent actors to build up the hierarchy that way. This is also not ideal since each parent will then need to send an additional initialization message to its children to provide them the parent's address (or maybe I can do this in the launcher as well).

There are a couple of other approaches (which I'll describe below) but in general the message-passing communications mechanism is part of the isolation benefits of the Actor model; using a different communications mechanism like passing local memory via a function call is a different model and doesn't fit well with the Actor paradigm in general. It's definitely very tempting to think of Actors as objects instantiated from class definitions, but I've found that looking at them through this lens always ends up running into problems. Passing messages can be awkward at times, as you've noted, however I've also found that handling Actor communications in this way helps gain insight and clarity on the sometimes subtle aspects of a distributed parallel application.

A couple of thoughts that might help your approach:

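    from thespian.actors import ActorTypeDispatcher
    from thespian.initmsgs import initializing_messages

    class RealWorld: ...
    class Simulation: ...

    # Each initializing message is an (attribute_name, type) tuple.
    @initializing_messages([('mode', str)], initdone='mode_set')
    class RealWorldActor(ActorTypeDispatcher):
        def mode_set(self):
            # self.mode holds the received string; pick the implementation from it.
            self.impl = (RealWorld if self.mode == 'RealWorld' else Simulation)()
        def receiveMsg_something(self, msg, sender):
            self.send(sender, self.impl.do_something(msg.param1, msg.param2))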

This makes the Actor wrapper around the implementation very thin.

    class RealWorldActor(ActorTypeDispatcher):
        impl = RealWorld
        def receiveMsg_something(self, msg, sender):
            self.send(sender, self.impl.do_something(msg.param1, msg.param2))

    class SimulationActor(RealWorldActor):
        impl = Simulation

Then you can instantiate one or the other more easily to pass the address to the Actors that need them.

    self.createActor('RealWorldActor')

This can be useful when you pass the name of the target Actor as a string in a message to the Actor that is performing the createActor call.

    class Manager(ActorTypeDispatcher):
        def receiveMsg_str(self, strmsg, sender):
            if not hasattr(self, 'managed'): 
                self.managed = dict()
            if strmsg not in self.managed:
                self.managed[strmsg] = self.createActor(strmsg)
            self.send(sender, self.managed[strmsg])
        def receiveMsg_ChildActorExited(self, exitmsg, sender):
            for name in self.managed:
                if self.managed[name] == exitmsg.childAddress:
                    del self.managed[name]
                    return

    class FanActor(ActorTypeDispatcher):
        def receiveMsg_Work(self, workmsg, sender):
            if not getattr(self, 'temperature_actor', None):
                mgr = self.createActor(Manager, globalName='SysManager')
                self.temperature_actor = 'pending'
                self.send(mgr, 'TemperatureActor')
                self.pending = [workmsg]
            elif self.temperature_actor == 'pending':
                self.pending.append(workmsg)
            else:
                self.impl.work(self.temperature_actor, workmsg)
        def receiveMsg_ActorAddress(self, addr, sender):
            self.temperature_actor = addr
            for each in self.pending:
                self.impl.work(self.temperature_actor, each)
            self.pending = list()

    class StartConfig(object):
        def __init__(self, temperature_actor_class: str, fan_actor_class: str):
            self.temp = temperature_actor_class
            self.fan = fan_actor_class

    class controller_actor(ActorTypeDispatcher):
        def receiveMsg_StartConfig(self, cfgmsg, sender):
            self.ta = self.createActor(cfgmsg.temp)
            self.fa = self.createActor(cfgmsg.fan)
            self.send(self.fa, self.ta)

    if __name__ == "__main__":
        asys = ActorSystem(...)
        ctlr = asys.createActor(controller_actor)
        asys.tell(ctlr, StartConfig('SimulationTemperatureActor',
                                    'SimulationFanActor'))
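
The examples above carry the choice of implementation as a class-name string inside a message. Thespian does its own lookup when `createActor` is given a string. The sketch below is not that code, only a standard-library illustration of how a dotted `module.ClassName` string can be turned back into a class on the receiving side. `resolve_class` and the `implementations` table are hypothetical, and standard-library classes stand in for actor classes.

```python
import importlib


def resolve_class(dotted_name):
    """Turn 'package.module.ClassName' into the class object it names."""
    module_name, _, class_name = dotted_name.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'module.ClassName', got {dotted_name!r}")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Standard-library classes stand in for actor classes here.
implementations = {
    "simulation": "collections.OrderedDict",
    "real_world": "collections.Counter",
}

chosen = resolve_class(implementations["simulation"])
instance = chosen()
print(chosen.__name__, isinstance(instance, dict))
```

Because only a string crosses the message boundary, the launcher can select simulation or real-world implementations without constructing any objects itself.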

Hopefully some of the information above is helpful for your situation, and please feel free to respond if you'd like to discuss this further.

sakian-old commented 4 years ago

Thanks, this is very helpful. I'll think about this some more and come back if I have any followup questions.