Proposal: States to Flags Refactor

johnsca commented 7 years ago

It has been requested that we rename "states" to "flags" to avoid confusion between set_state and status_set. It's also more accurate of a label. However, I believe this also gives us a good opportunity to move away from RelationBase and the confusing conversation metaphor.

It was noted (I believe by @galgalesh) that another way to look at what the conversation metaphor is attempting to model is that flags have a set of units associated with them. Taking this further, I think it's reasonable to say that, rather than the behavior of an interface layer being associated with a "relation" object independent from the flag, it would make more sense to say that a given flag, if set, enables a certain set of behavior. That behavior can then be tied to the flag itself. Relation flags would have some built-in behavior that allows you to add a unit to the flag and access the relation data in a more natural way.

This would make flags more of a first-order construct and remove the need for the often confusing passing of arguments into handler function, remove the need for interface layers to be a special type of layer, remove the need for the confusing conversation metaphor, and provide a consistent namespace for encapsulating flag operations.

From an implementation standpoint, I'm picturing something like this:

from charms.reactive import flag

@when('relation.mysql.database.available')
def update_database():
    mysql = flag('relation.mysql.database.available')
    render_config(db=mysql.dsn())

# can also pass flag instances to decorators to save typing
admin = flag('config.set.admin')
superuser = flag('config.set.superuser')

@when_any(admin, superuser)
def handle_admin():
    grant_admin_powers()
    if superuser.is_set():
        grand_super_powers()

The main case for adding specialized behavior to a flag would be for interfaces, and would look something like this:

from charms.reactive import Flag

class MySQLAvailable(Flag):
    name = 'relation.{requires:mysql}.database.available'

    def dsn(self):
        for unit in self.units():
            if unit.relation_json('master'):
                return dns_template.format(**unit.relation_json())

mysql_joined = flag('relation.{requires:mysql}.joined')
mysql_available = MySQLAvailable()  # or flag('relation.{requires:mysql}.database.available')

@when(mysql_joined)
def check_for_db():
    for unit in mysql_joined.units():
        if 'database' in unit.relation_json():
            mysql_available.add_unit(unit)

Flag.units() would be a helper that gets all units, de-duped, regardless of the relation. If you need to access the units by relation, something like flag.relations() would return a list of relation objects that would each have their own relation.units() method. I think setting data would have to be done per-relation as well, so the provides might look something like:

provides = flag('relation.{provides:mysql}.joined')

@when(provides)
def create_database():
    for relation in provides.relations():
        db_name = relation.service_name()
        if not db_exists(db_name):
            create_db(db_name)
        relation.set_data({'database': db_name})

Also note that joined and departing flags would be managed automatically by the framework (perhaps handled in the base layer, or perhaps it would be better to fold all states for primary Juju constructs, like leadership, config, storage, etc directly into the framework). So interface layers can depend on them instead of re-implementing the same logic around them every time, and removing their dependency on @hook. This should also open up the path for upgrades from non-reactive charms to reactive, as it would no longer depend on a specific relation hook to fire to get the system into the correct state.

I'd also like to have relation_json() and set_json() methods that would do automatic encoding, since that's something that comes up quite often and is a source for bugs. There would be a relation_data() counterpart for raw access for backwards compatability, but I'd like to start recommending JSON encoding by default.

merlijn-sebrechts commented 7 years ago

Cool, this is going in the right direction!

I think it's better to have to do a bit more boilerplate than having stuff magically done for you. The confusion me and my colleagues had with the reactive framework is that a lot happens in the background and it isn't always clear what exactly happens. The conversations framework for example is really handy but it is hard to "get" because of the magic. For that reason I think that clarity should be one of our main concerns. The reactive framework is pretty powerful already, we just need to make it easier to use and "get".

To illustrate what I mean by "magic".

@when('relation.mysql.database.available')
def update_database(mysql):
    #...

Who passes the mysql argument? Why do some states result in arguments passed and some don't? What is the order of multiple arguments? What is the object in the argument exactly?

@johnsca's proposal removes the magic. Now the developer is in control and it's clear where the flags come from and what the object is.

@when('relation.mysql.database.available')
def update_database():
    mysql = flag('relation.mysql.database.available')

Although I would change the name of the function. flag() looks a bit to much like a constructor. You're not creating a flag, you're getting an already created flag. The terminology should make this clear, for example get_flag() and set_flag().

It might be best to explicitly disallow getting flags in a handler that doesn't use the flag in its preconditions (@when). This way we can save devs from shooting in their own foot.
This will also solve a minor linting issue where pylint complains if you react to a relation state but don't use that state in the handler.

stub42 commented 7 years ago

The examples here make the common assumption that there is a single relation with a given name ('mysql'), whereas there might be several (mysql:1, mysql:2 etc.). This proposal will look very different with this addressed. If the framework is going to automatically map sets of units to the flag, it needs to use the relation id as a namespace. So maybe:

@when('relation.mysql.database.available')
def update_database():
    for relid, mysql in flag('relation.mysql.database.available').items():
        render_config(db=mysql.dsn())
        break  # Just render an arbitrary one.

And for the common case where you aren't writing backend systems or proxies and really do only have a single relation with the given name:

@when('relation.mysql.database.available')
def update_database():
    mysql = flag('relation.mysql.database.available').value  # or maybe 'one', 'any', 'first'
    rebder_config(db=mysql.dsn())

I think the above also addresses some confusion I had with the naming, where I think of flags as booleans, yet the flag() function was returning an object with state and behaviour. Attaching or annotating a flag with one or more relations is conceptually better to me than subclassing a Flag class to implement the object representing my relation.

When I did my branch making RelationBase pluggable, I was thinking that perhaps it was unfair that charms.reactive.bus.set_state was only supposed to accept a RelationBase as a value. Why not accept anything that can be serialized? By doing this, existing code using RelationBase would continue to work (or not), and new code could attach other types of relation object... in this case a Flag... or whatever else they found useful. For example, set_state('app.status', 'active') could attach a string and a @when('app.status') decorated method would get the status string as its argument automatically, instead of the current approach of having to stuff it into unitdata.kv() and retrieve it yourself. However, this proposal is doing away with parameters to handlers. Which is probably a good thing, but the change unlikely to be backwards compatible so should be explicitly called out.

ajkavanagh commented 7 years ago

@johnsca, I think that this is a very interesting approach. The name change from 'state' -> 'flag' is very sensible as 'state' is such a loaded word in the context of software systems!

Unlike @galgalesh I don't really have an objection to:

admin = flag('config.set.admin')

It doesn't look like a constructor to me; rather it's just a function call on flag with a single string parameter. On a literal reading of it, it looks like it returns a flag thing which corresponds to config.set.admin.

I am concerned about backwards/forwards compatibility and hope that we can find someway of supporting the current mechanism/template for writing interfaces and their usage. Otherwise we're forcing a change on currently working interfaces.

@stub42 makes an interesting point about multiple units. If we can add a declaration (like a scope, but call it something else!) that says either "this is 0 or 1 units" or "this is 0 or more units" then the framework can issue a warning if more than one unit, plus disabling of the multiple/single methods on a Flag object, so that unit/functional tests fail early.

stub42 commented 7 years ago

On multiple relations using the same name, I just recalled that you can actually declare a 'limit' on the relation in metadata.yaml. Juju has has always ignored it, but it might be useful to charms.reactive. See https://jujucharms.com/docs/stable/authors-relations.

simonklb commented 7 years ago

Something that I've recently encountered while handling multiple relations of the same relation type is that there is currently no easy way to distinguish between any units being in a state from all units being in that state. I wonder if this could be solved by making relations be made up of multiple flags, one flag per relation. Then it should be possible to do something similar to this:

cluster_connected = relation("cluster.connected")
cluster_available = relation("cluster.available")

@when_not(cluster_connected)
def cluster_missing():
    # Waiting for cluster relation

@when_any(cluster_connected)
@when_not_all(cluster_available)
def cluster_not_ready():
    # Waiting for all cluster members become available

@when(cluster_available)
def cluster_ready():
    # READY!

ajkavanagh commented 7 years ago

@simonklb this is something that you would do in the interface. i.e. if the '.available' state is only 'available' when all of the cluster members are available then the '.available' state should only be set by the interface when a 'cluster' is available. Does that make sense?

simonklb commented 7 years ago

@ajkavanagh I've come to understand that states set in interfaces are based on scope. So if the interface is unit-scoped the states are set on a unit-by-unit basis and not globally (as in "all units are available"). I might be mistaken though.

ajkavanagh commented 7 years ago

@simonklb States are definitely only set at a unit level, but they are not related to scopes in any way. States are never shared between units. i.e. you can't set a state in unit-0 that can be seen by unit-1.

Juju is not aware of states either; they are purely a charms.reactive concept to help with writing a charm. Hope this helps?

simonklb commented 7 years ago

@ajkavanagh Here I'm talking about the related units (i.e. conversations in the current interface implementation) and not the state of the unit. Sorry for the misunderstanding.

So, currently, a state being set in an interface with a unit scope will include multiple "conversations". Basically indicating which unit-relation is in a certain state and which aren't.

Unit A can be related to unit B and unit C. Conversation A-B can be in state connected and available while conversation A-C can be in state connected only.

Am I mistaken here?

ajkavanagh commented 7 years ago

@simonklb Ah, yes, this is a source of confusion.

So, currently, a state being set in an interface with a unit scope will include multiple "conversations". Basically indicating which unit-relation is in a certain state and which aren't.

Unit A can be related to unit B and unit C. Conversation A-B can be in state connected and available while conversation A-C can be in state connected only.

Whilst, conceptually, this is true, it is not handled by the states per se, unless the interface code explicitly handles multiple conversations.

A naive interface may become {relation-name}.available when only one of the connected conversations has all the available data. This may, or may not be, a valid state. It's up to the interface author to ensure that the states set by the interface represent a valid description of the state of the interface or the interface provides functions to discover the more fine granularity of the different conversations (if that's important).

e.g. an interface could set {relation-name}.connected when the first conversations connects, but not set {relation-name}.available until n-from-m conversations are valid (due to the data on those conversations), if a quorum were necessary (say). It might then provide a set of methods to expose which were available (if that's important), or a single method with all of the relevant data that the charm layer might need to configure the managed service.

What charms.reactive doesn't do is manage states by conversation. There is no relationship between a particular conversation and a state. The states are at the interface level rather than the interface/conversation level.

simonklb commented 7 years ago

What charms.reactive doesn't do is manage states by conversation. There is no relationship between a particular conversation and a state. The states are at the interface level rather than the interface/conversation level.

I agree with you that all states are set on the unit. I don't agree that there is not relation between conversations and states. Each Conversation have a set_state method and the conversations are stored in the key-value store as a part of those states. This might just be a matter of semantics though.

While the current interface functionality isn't obvious I do like the idea of having an entity that gathers all of the unit-relations on an interface. I feel that the problem with the current conversation-model is that it masks the fact that one interface handles multiple relations by putting it all in a single state. To improve the usability and clarity of relations I propose:

Make interfaces have their own namespace relation or interface instead of just a normal flag.
Make one interface be a collection of multiple flags (one for each relation) instead of one flag.

I believe this would make the charm developers grasp the relation model much easier and it would make the decorators such as @when_not_all and @when_any intuitive and useful in regards to relations (see my earlier comment)

ajkavanagh commented 7 years ago

I agree with you that all states are set on the unit. I don't agree that there is not relation between conversations and states. Each Conversation have a set_state method and the conversations are stored in the key-value store as a part of those states. This might just be a matter of semantics though.

Indeed, the state is associated with the interface and thus, all of the conversations, but that's just a side effect of being associated with the interface.

One proposal for moving forward is to do essentially the opposite of what you are suggesting; to remove the 'special' feature of an interface state, and instead make the state completely generic. A rationale is as follows.

At present there is a complication in that an interface state is special. This means custom handling in charms.reactive and in charm code. By removing the special nature of the interface state we can simplify the code and the concepts associated with a state (or flag) going forward.

A flag simply represents that the charm code arrived at a particular point that was worth noting down in a flag to simplify future work and that that flag represents a relatively stable (between invocations) condition. Thus {relation-name}.connected would simple represent that something has connected to the relation. I'm even dubious of a {relation-name}.available state as its use is largely dependent on what the author of the interface code means by available. Usually it means that some data (possibly validated) has arrived from the other end of the relation. But as you point out, this clouds the issue around multiple conversations on the relation.

So the other part of the "not making interface states special" is to stop automatically supplying an instance of the RelationBase class when 'using' the interface state(s). The charm code would ask for the interface object needed, and that _might_be better served by just asking for the {relation-name} rather than {relation-name}.<something>.

Finally, interfaces generally become less 'special'; the simply become a layer which happens to have a RelationBase (or similar) derived class (or classes) in them. There's some other stuff around simplifying interface authoring to auto-generate, say, {relation-name}.connected and other boiler-plate code that crops up frequently in RelationBase derived classes.

So I think that fairly represents one of the proposals on the table. How does this map to your comment?

I agree that using singular states to (try to) represent the state of multiple conversations is confusing. I'm not sure that adding states will help. I wonder whether removing them is better. i.e. just having the {relation-base}.connected flag and then doing discovery in the handler to decide what to do next. Use the idea that handlers must be idempotent, and ensure that interfaces are relatively lazy and don't do too much work. If using {relation-name}.available use it to mean that the data available through the interface object is valid to be used, and that it appropriately maps to the number of conversations in play; the charm layer shouldn't really be worried about which or how many conversations are, if that can be delegated to the interface Class.

simonklb commented 7 years ago

I might be missing something that has been discussed offline or somewhere else but completely erasing interfaces sounds a bit drastic. Making them more similar to normal layers might be good, but it should be carefully thought out so that it doesn't muddy the waters. I think some separation between concepts are good to keep them separated in your head.

What sparked the idea of making interfaces a "multi-flag" entity was the proposal from @johnsca, in the first post, that flags could be implemented using specialized classes instead of limiting them to simple on/off toggles. I think that is a great idea since it would make it possible to create abstractions of complex logic (such as relations) while still maintaining the same flag handling using the reactive decorators.

I'm not convinced that the interfaces should hide the fact that there are multiple relations connected in the background. I agree that the relational data should be handled separately so that the data is exposed to the charm in a unified manner. What I don't think should be hidden is the state that each relation is in, hence my proposal.

canonical / charms.reactive

Proposal: States to Flags Refactor #92