Migrated from gitlab, originally by @bartv on Jun 27, 2016, 10:30

The reload and requires relations imply an event/action subscription. Below is the content of std::action:

entity Action:
    string name
end
implement Action using std::none

entity Event:
    string name
end
implement Event using std::none

entity Subscribe:
end
implement Subscribe using std::none

Action      actions [1:] -- [0:] Subscribe action_subscriptions
Event        events [1:] -- [0:] Subscribe event_subscriptions
std::Entity sources [0:] -- [0:] Subscribe source_subscriptions
std::Entity targets [0:] -- [0:] Subscribe target_subscriptions

reload = Action(name="reload")
changed = Event(name="changed")

Agent: tbd

I'll take a shot at a more complete proposal of what events should do for us.

State and Event

We currently have a Desired State Configuration (DSC) model, like Puppet. The model describes the desired state, i.e. the state the system should be in. The server makes a plan to reach this state and the agents enforce it.

The advantage of DSC is that there is a clear goal. It is always clear in what state the system should be. The disadvantage is that expressing complex sequences of operations is difficult. Event based configuration is much more intuitive in this regard (like ansible). However, here you don't have a clear overview of the goal.

The bridge between the two paradigms is the state machine. A state machine describes, for each entity, what states it can have (not-installed, installed, configured, running, ...) and how to transition from one state to the next (for not-installed -> installed, execute the installer). Such a transition is an event. A DSC model can describe what the desired state is, the agent can discover what the current state is, and from both we can derive the sequence of actions required to reach the desired state.

The current state machine

Currently, there is already an implicit state machine for all resources

NotUpToDate -> UpToDate: DoUpdate

The agent manages this state machine. It compares the resource with the desired state and, if the resource is not up to date, changes it so that it becomes up to date.

The state machines are also dependent. The requires relation makes sure that DoUpdate is never executed before all required resources are in the state UpToDate.

For Services, the state machine is different.

NotUpToDate -> UpToDate: DoUpdate
NeedsReload -> UpToDate: DoReload

If any of the entities required by this service has reload=True and transitions from NotUpToDate -> UpToDate while the service itself is UpToDate, then the service transitions to NeedsReload.
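The two implicit state machines and their coupling through requires can be sketched in a few lines of Python. This is only an illustration of the semantics described above, not how the agent is implemented; the `Resource` class, its fields and the `deploy` function are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)  # eq=False: compare resources by identity, not by field values
class Resource:
    name: str
    state: str = "NotUpToDate"
    reload: bool = False       # does updating this resource trigger reloads?
    is_service: bool = False   # only services know the NeedsReload state
    requires: list = field(default_factory=list)

def deploy(resources):
    """Run all state machines to a fixed point; return the executed transitions in order."""
    log = []
    progress = True
    while progress:
        progress = False
        for res in resources:
            # requires: nothing happens before all required resources are UpToDate
            ready = all(dep.state == "UpToDate" for dep in res.requires)
            if res.state == "NotUpToDate" and ready:
                res.state = "UpToDate"
                log.append((res.name, "DoUpdate"))
                if res.reload:
                    # an UpToDate service that requires this resource now needs a reload
                    for other in resources:
                        if other.is_service and res in other.requires and other.state == "UpToDate":
                            other.state = "NeedsReload"
                progress = True
            elif res.state == "NeedsReload" and ready:
                res.state = "UpToDate"
                log.append((res.name, "DoReload"))
                progress = True
    return log
```

With a config file that has reload=True and a service that is already UpToDate, `deploy` first updates the file and then reloads the service. If the service itself is still NotUpToDate, it is only updated and never reloaded.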

If a new state machine concept is introduced, it should support at least these cases

Detecting the state

The state in which a resource is, can be determined by different factors:

the environment e.g. the current content of a file can change, independent of the model.
- the agent can detect this
- external systems (monitoring,...) can detect this
current run: e.g. a config file being deployed can cause a service to reload

(more in the next post)

Proposal

What I would propose

each entity type gets an associated state machine
it is possible to extend the state machine of your parent(s)
a state transition corresponds to an event
each entity type can attach an action to each event type
relations can express state-machine-to-state-machine relations (e.g. for requires)
each entity type can define state transition based on events generated by related entities (e.g. for reload)

TODO

[ ] define the basic state machine for std::Entity
[ ] define requires and reload
[ ] define some basic resources (File, Package,....)
[ ] define some entities (that are not resource) (e.g. MySQL)
[ ] define syntax
[ ] think about interaction with snapshots and facts

Extensible State Machine

A first important detail is that state machines must be extensible. Each child entity must be able to modify its parent's state machine without breaking it (in the Liskov sense).

I haven't really researched if there is a standard solution for this. I'll give it a shot.

The base case is easy: add a state and some transitions at the edge of the graph. This never breaks anything.

Take the following example state machine

StateMachine Basic:
Start -> Installed: Install

If I add a state Running, no problem

StateMachine Service extends Basic:
Installed -> Running: Start

Adding states in between is more complicated. The easiest way would be to replace an event, but also add it back.

e.g. If I add a state configure

StateMachine ExtendedService extends Service:
  replace Start:
     Installed -> Configured: Configure
     Configured -> Running: Start
end

We could perhaps add some type checking (or leave it to the user).
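As a rough sketch of what such a check could look like, the Python below models extends and replace. The `StateMachine` class and its methods are hypothetical; the check assumes that a replacement must connect the same two states as the transition it replaces and must add the replaced event back.

```python
class StateMachine:
    def __init__(self, transitions=None, parent=None):
        # event name -> (source state, target state); a child starts from a copy of its parent
        self.transitions = dict(parent.transitions) if parent else {}
        self.transitions.update(transitions or {})

    def replace(self, event, path):
        """Replace `event` by a chain of (source, event, target) steps."""
        old_source, old_target = self.transitions[event]
        if path[0][0] != old_source or path[-1][2] != old_target:
            raise ValueError("replacement must connect the same states")
        for (_, _, target), (source, _, _) in zip(path, path[1:]):
            if target != source:
                raise ValueError("replacement path is not contiguous")
        if event not in [name for _, name, _ in path]:
            raise ValueError("the replaced event must be added back")
        del self.transitions[event]
        for source, name, target in path:
            self.transitions[name] = (source, target)

    def path(self, start, goal):
        """Events leading from start to goal (assumes one outgoing transition per state)."""
        by_source = {src: (name, dst) for name, (src, dst) in self.transitions.items()}
        events, state = [], start
        while state != goal:
            name, state = by_source[state]  # KeyError if the goal is unreachable
            events.append(name)
        return events

basic = StateMachine({"Install": ("Start", "Installed")})
service = StateMachine({"Start": ("Installed", "Running")}, parent=basic)
extended = StateMachine(parent=service)
extended.replace("Start", [("Installed", "Configure", "Configured"),
                           ("Configured", "Start", "Running")])
```

Code that only knows about Service still sees an Install event followed by a Start event on the way to Running, which is the Liskov-style guarantee the check tries to protect.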

Each Entity in the model will have a desired state, to which there is a path of states and events.

Propagation

The second important issue is the dependencies between state machines. A first required aspect is relation types (#107). A relation type can carry the state machine relation.

State to State relations

Requires is a state-to-state relation. It tells us that the state of the requiring object depends on the state of the required object.

For non extensible state machines, this dependency is straightforward.

implementation requires for Requires:
 AlwaysBefore(source.Installed, target.Installed)
end

However, for the ExtendedService, it is less straightforward. If the source is ExtendedService we would expect requires to be

implementation requires for Requires:
 AlwaysBefore(source.Running, target.Installed)
end

This is an open issue: how to keep state-to-state relations sane without having to redefine them every time (that would lead to a cross-product explosion).

Ideas

attach relations to the positions of the events in the type they refer to (so before and after can point to a different point). In this case, it would not help
create more types of requires (requires installed, requires running, can also explode)
add synthetic states to anchor relations (might also explode)
wildcard requires (don't start anything before the goal state is reached) (limited but powerful)
meta states (current and required) (don't move from the current state if the required state is not reached) (same-ish as before)
relate events instead of states (won't help I think)
only allow requires type relation (don't move unless the others are ready)
wildcards with limits (any state before installed, direction depends on the goal/desired state)

Event to State relations

An event can also trigger a state change (e.g. reload).

We could also turn this into: an event can also trigger an action on another resource. But this might create a problem of ordering and efficiency. If a resource must be installed, then started, and it gets 5 reload events, then what? Install, then start and reload 5 times? Install, start and reload? Or just install, start? Or reload, install, start?

Limiting events to triggering state transitions seems cleaner. It does, however, introduce a new kind of state transition: one triggered by other entities.

StateMachine ReloadableService extends Basic:
 Running, requires.Install -> Reload  //Install event propagated over the requires relation
 Reload -> Running: Restart

or we could even add an expression

StateMachine ReloadableService extends Basic:
 Running, requires.Install when requires.target.reload -> Reload  //Install event propagated over the requires relation
 Reload -> Running: Restart

We assume that state changes and events from the same source propagate together, and that event-triggered state transitions are executed before generating events.

In this case, when all required entities are ready, we will have all events. If we are in the Running state, we fall back to the Reload state and reload. If we are in another state, the events have no effect. So, for this case, this would work.

When a dependent entity would change state due to external influence, and this state transition is detected, a new install event is generated. This event is propagated and reload is triggered, as desired.
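The reason events that trigger state transitions avoid the "reload 5 times" problem can be shown with a small Python sketch of the ReloadableService rules. The function names and the shape of the event list are assumptions made for illustration.

```python
def apply_events(state, events, reload_flags):
    """Apply events propagated over requires before executing any transition.

    events: list of (source name, event name)
    reload_flags: source name -> value of requires.target.reload
    """
    for source, event in events:
        # Running, requires.Install when requires.target.reload -> Reload
        if state == "Running" and event == "Install" and reload_flags.get(source, False):
            state = "Reload"
    return state

def run(state, events, reload_flags):
    """Return the final state and the list of actions that were executed."""
    actions = []
    state = apply_events(state, events, reload_flags)
    if state == "Reload":
        # Reload -> Running: Restart
        actions.append("Restart")
        state = "Running"
    return state, actions
```

Five Install events arriving while the service is Running collapse into a single move to Reload, so Restart is executed once. The same events arriving in any other state leave the state untouched.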

Entity Vs Resource

Resources are entities that can be deployed. (File, Service) Higher order entities (httpd, drupal,...) consist of multiple resources.

State machines for Resources are usually quite straightforward.

But for entities, this can be more complex, e.g. MySQL.

It consists of packages, services, files, ... Will it have a simple state machine (start -> done) that fully depends on its constituents (we are done when the service is done), or a more complex lifecycle (started, installed, ...)?

At this point, I would opt for simple: slaved to its children via relations. This requires the least work and has no immediate disadvantage (first make it work, then make it better).

So, for now: only resources do actions, entities have their states bound to their children.

Exposing State

It would be possible to use state to select implementations in the model. This would be interesting, but hard to execute, so we'll leave that out for now.

This system should also have the ability to trigger actions/code that is managed by the server or an other process. For example:

Creating users in keystone (openstack). These users should get an initial password and these users should receive an email about it. Once the user is created, the password attribute should not be handled any more by the handler, nor should it send emails.

Hadn't thought of that.

I think it would be something like

NonExistant -> UpToDate: Create
NotUpToDate -> UpToDate: Converge

Create.actions = [mail, create]
Converge.actions = converge

Detection of state would be:

it doesn't exist => NonExistant
it does and the managed attributes (if any) are not as desired => NotUpToDate
otherwise => UpToDate

create would create the resource and set the password; converge would set the other managed attributes, but not the password.

This system should make handler implementations easier. Instead of a do_change, blocks of code would be required for typical CRUD methods: Create, Read (determine state), Update, Delete (purge)
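A handler written against this scheme could look roughly like the Python sketch below. It is not the existing handler API: `UserHandler`, `detect_state`, the `MANAGED` tuple and the in-memory `store` and `mails` (standing in for keystone and a mail server) are all hypothetical.

```python
MANAGED = ("email", "enabled")  # assumption: the password is deliberately not managed

def detect_state(current, desired):
    """Map the observed resource onto the three states from the text."""
    if current is None:
        return "NonExistant"
    if any(current.get(key) != desired.get(key) for key in MANAGED):
        return "NotUpToDate"
    return "UpToDate"

class UserHandler:
    def __init__(self):
        self.store = {}   # stand-in for the users known to keystone
        self.mails = []   # stand-in for mails that were sent

    def read(self, name):
        return self.store.get(name)

    def create(self, desired):
        # Create event: create the user with its initial password and send the mail
        self.store[desired["name"]] = dict(desired)
        self.mails.append(desired["email"])

    def converge(self, desired):
        # Converge event: only managed attributes are touched, never the password
        current = self.store[desired["name"]]
        for key in MANAGED:
            current[key] = desired[key]

    def deploy(self, desired):
        """Detect the state, run the matching transition, return the detected state."""
        state = detect_state(self.read(desired["name"]), desired)
        if state == "NonExistant":
            self.create(desired)
        elif state == "NotUpToDate":
            self.converge(desired)
        return state
```

Deploying the same user twice sends one mail. A later change to the email address is converged, while a changed password in the model is ignored.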

Final notes on how to execute the proposed model

Current execution mechanism

Compile. After compilation, all entities are there, all relations are present.
Export - serialization. All entities that can have a handler (i.e. resources) are identified. They are serialized to the resource format. In this form, every resource contains all information required to deploy it, i.e. which agent it belongs to and all required attributes. Every resource is given a unique identifier.
Export - requirements resolution. For all resources, the identifiers of the required entities are collected. If any of the required entities are not resources, we follow the requires relations until we find some resources. If the non-resource-entity has no requires relation, we produce a warning.
Deploy - server part: resources are grouped per agent and sent to the agents.
Deploy - agent part: the agents deploy their resources, taking into account requires and reload semantics. If a resource requires a resource on another agent, this produces an error ( #2 )
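The requirements resolution step can be sketched as a graph walk. The Python below is an illustration under assumed data structures (a `requires` dict of identifiers and an `is_resource` predicate), not the exporter's actual code.

```python
import warnings

def resolve_requires(entity, requires, is_resource):
    """Return the identifiers of the resources that `entity` effectively requires.

    requires: identifier -> list of required identifiers
    is_resource: callable telling whether an identifier is a resource
    """
    found, seen = set(), set()
    stack = list(requires.get(entity, []))
    while stack:
        dep = stack.pop()
        if dep in seen:
            continue  # also protects against cycles in the requires graph
        seen.add(dep)
        if is_resource(dep):
            found.add(dep)
        else:
            # not a resource: follow its requires relations until we find resources
            nested = requires.get(dep, [])
            if not nested:
                warnings.warn(f"{dep} is not a resource and has no requires relation")
            stack.extend(nested)
    return found
```

For a service that requires a MySQL entity, which in turn requires a package and a config file, the service ends up requiring the package and the config file directly.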

Execution

Compile. As before, but now we also have all state machines fully resolved and all relation types in place and we know the desired state of each state machine.
Export - serialization. Same as before.
Export - state machine reduction (optional). All state machines are reduced: transitions leading away from the desired state are not required (except, of course, for those that are triggered by external events).
Export - pub-sub. We can now derive, for each resource, which events it has to be notified of. (Because we know all relations and relation types, we can completely resolve this.) We can also attach to each resource which events it must transmit to whom (the reverse relation).
Deploy - server part: Resources are grouped per agent and sent to the agents.
Deploy - agent part: The agents deploy their resources, taking into account the state machines. For all events required by other agents, they transmit the event.
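The pub-sub step can be derived from the requires relations alone. The following Python is a minimal sketch with hypothetical names; it assumes that every relation carries events and that only events crossing an agent boundary need to be transmitted.

```python
from collections import defaultdict

def derive_pubsub(requires, agent_of):
    """Derive who listens to whom, and which events must leave an agent.

    requires: resource -> iterable of required resources
    agent_of: resource -> name of the agent that deploys it
    """
    subscriptions = defaultdict(set)  # resource -> resources whose events it must be notified of
    publications = defaultdict(set)   # resource -> remote agents that must receive its events
    for res, deps in requires.items():
        for dep in deps:
            subscriptions[res].add(dep)
            if agent_of[dep] != agent_of[res]:
                # reverse relation: dep must transmit its events to the agent of res
                publications[dep].add(agent_of[res])
    return dict(subscriptions), dict(publications)
```

Each agent then receives, next to its resources, the subscription entries for those resources and the publication entries telling it which events to forward.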

Notes

For coordination among agents, we could use the server as event broker. Alternatively, we could use mgmt config which can just eat up the state machine and take care of everything.
We should also consider adding debugging info to the state machines. It could be of the form: Z waits for X because Y (/model/x/y.cf:123)
Is it sufficiently usable?
In the current proposal, the detection of the current state depends on the current model (i.e. whether a resource is up to date depends on its attributes). As such, the model should not depend on the current state, or evaluation becomes a crazy mess. If we want to expose state to the model, we need another kind of state, one that doesn't depend on the current model. We already have this in the form of facts, which depend on the latest previous version of the model.
In some cases, the current state can not be determined (pure side effects, like payments) but must be remembered. This is also a different kind of state. (imho)

Ok, clean slate, simpler proposal

Two States only

We currently have two states, and we can chain state machines together via requires. So basically, we can build any state machine we want, from small 2 state segments.

We also have 1 event, but we don't expose it yet.

So, if we just build a mechanism to expose this event we are done, no new syntax, everything dandy.

Slightly more than two states

Another possibility is to create a few more states, to have more meaningful events. (For sending mail and stuff)

                                           Delete                          Delete
                                  +-------------------------------+------------------------------------+
                                  |                               |                                    |
                                  |                               |                                    |
+---------+   Create       +------+-------+   Update       +------+-------+   Delete           +-------v------+
|   New   +----------------> Unconfigured +---------------->  Configured  +-------------------->    Purged    |
+----^-+--+                +------+-------+                +------+-------+                    +----+--^------+
     | |                          |                               |                                 |  |
     | |                          |                               |                                 |  |
     | |                          |                               |                                 |  |
     | |                          |               None            |                                 |  |
     +-------------------------------------------------------------------------------------------------+
       |                          |                               |                                 |
       |                          |                               |                                 |
       |                          |                               |                                 |
       |    Skip                  |        +-------------+        |             Skip                |
       +--------------------------+-------->   Unknown   <--------+---------------------------------+
                                           +-------------+

or in text

New -> UnConfigured: Create
UnConfigured -> Configured: Update
Configured -> Purged: Delete
UnConfigured -> Purged: Delete
New <-> Purged: None
Any -> Unknown: Skip

So, a few more states, a few more events.

Configured and purged are terminal states, so the machine moves towards either of them. When they are reached, the deps are satisfied.
Unknown is the catch-all error state.
Actions can be attached to Events, grouping of actions is determined in their specification.
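The textual form of this state machine maps directly onto a transition table. The Python below is only a sketch of the lifecycle; the names `TRANSITIONS`, `TERMINAL`, `step` and `run` are made up for the example.

```python
TRANSITIONS = {
    ("New", "Create"): "UnConfigured",
    ("UnConfigured", "Update"): "Configured",
    ("Configured", "Delete"): "Purged",
    ("UnConfigured", "Delete"): "Purged",
    ("New", "None"): "Purged",   # New <-> Purged: None
    ("Purged", "None"): "New",
}
TERMINAL = {"Configured", "Purged"}  # once reached, the deps are satisfied

def step(state, event):
    """Return the next state; Skip leads to Unknown from any state."""
    if event == "Skip":
        return "Unknown"
    return TRANSITIONS[(state, event)]  # KeyError for a transition that does not exist

def run(state, events):
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = step(state, event)
    return state
```

A resource that has to exist goes through Create and Update to reach Configured; one that has to be removed before it was ever configured goes from UnConfigured straight to Purged.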

"A designer knows he has achieved perfection not when there is nothing left to add, but when there is nothing left to take away." (Antoine de Saint-Exupéry)

In Code, this would be

entity Action:
    """ 
        An action is executed by the handler of the resource it is attached to
    """
    string name
end
Action.owner [1] -- std::Entity.actions [0:]

entity Event:
    """
        An event is emitted by a handler
    """
    string name
end
implement Event using std::none
Event.owner [1] -- std::Entity.events [0:]

std::Entity.created [1] -- Event
std::Entity.updated [1] -- Event
std::Entity.deleted [1] -- Event
std::Entity.error [1] -- Event

implementation lifeCycle for std::Entity:
    self.created = Event(name="created", owner=self)
    self.updated = Event(name="updated", owner=self)
    self.deleted = Event(name="deleted", owner=self)
    self.error = Event(name="error", owner=self)
end

implement std::Entity using lifeCycle

entity Subscription:
    """
        A subscription attaches an action to the firing of an event by a specific resource.
    """
end
implement Subscription using std::none

Subscription.on [1:] -- Event
Subscription.do [1:] -- Action

entity Reloadable:
    """ new mechanism for reloads """
end

Reloadable.reload [1] -- Action

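implementation reloadable for Reloadable:
    # subscribe our reload action to the updated event of every required entity
    for req in self.requires:
        Subscription(on=req.updated, do=self.reload)
    end
end

implement Reloadable using reloadable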

What is tricky

In this implementation the requires relation has a double use:
- On resources and actions: no resource/action can do anything before all its required resources are in a terminal state (deleted or configured)
- On events: an event fires when any of its required events fires
To make an action execute after another action, you subscribe the second to the updated event of the previous action
Every action belongs to a resource
If an action depends on an entity, the action is delayed until that entity is in a terminal state
By default, an action does not depend on its resource
How to express that a reload must only be done if req.requires_reload is still open

Tradeoffs

not all Entities are Entities any more, perhaps we should introduce a new root object e.g. std::Top and have Event, Subscription, Entity, Action inherit from that, but keep std::Entity as the default parent
requires relation has double use, we could just make Event.also -- Event.any instead
this approach allows fan-out events (one event triggers one or more events and/or actions) but not fan-in (one action is taken when a specific combination of events occurs). This is because the semantics of fan-in are very hard to define
- one special case is a flow where one event triggers multiple actions and these branches fan in again. This would be possible to support.
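Fan-out is easy to define because it is just a walk over the event-to-event links, collecting subscribed actions along the way. The Python sketch below illustrates this; the `also` mapping mirrors the suggested Event.also relation, and the event and action names are invented for the example.

```python
def fire(event, also, subscriptions):
    """Return the actions triggered by `event`, following event -> event links.

    also: event -> events that fire whenever it fires
    subscriptions: event -> actions subscribed to it
    """
    actions, seen, queue = [], set(), [event]
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue  # every event fires at most once per run
        seen.add(current)
        actions.extend(subscriptions.get(current, []))
        queue.extend(also.get(current, []))
    return actions

also = {"config.updated": ["app.updated"]}
subscriptions = {
    "config.updated": ["service.reload"],
    "app.updated": ["notify.mail"],
}
```

Fan-in has no equally simple counterpart: an action waiting for a combination of events needs rules for partial matches, ordering and expiry, which is why it is left out here.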

I am going to archive this issue. The current master contains a fine grained notification system that can be used to achieve the initial goal of this issue.

inmanta / inmanta-core

New event/action system #108