Kubernetes storage engine

petervo commented 8 years ago

When running commissaire as a pod in kubernetes/openshift I think it's reasonable to assume that 1) There will only ever be one cluster, and 2) Commissaire's and Kubernetes' view of the cluster should be identical at all times.

This got me thinking that perhaps in this mode we can eliminate the need for a separate etcd instance by having an alternate CherryPyStorePlugin class that instead of writing to etcd, stores and retrieves it's data using annotations on the node objects in kubernetes' API.

I started playing with this idea a bit and ended up getting it sort of working by using the provided key combined with the json data to figure out what kubernetes object to retrieve/store data on. I didn't take this very far, while it does seem that this can work, it also is pretty hacky and likely to be fragile.

So I guess I have two questions here:

1) Do you think that this is a goal worth pursuing. I think it is, if for no other reason than in this common use case you avoid whole classes of problems by not having to worry about commissaire and kubernetes being out of sync. You also avoid any questions about persistent storage for etcd in the pod.

2) Assuming we want to pursue this how do you feel about expanding and formalizing the CherryPyStorePlugin API a little to include a bit more context than just list,get,set,delete. That way we can make the class plugable and support alternative implementations.

ashcrow commented 8 years ago

1) Do you think that this is a goal worth pursuing

I think so as well. @mbarnes, weigh in here as well please.

2) Assuming we want to pursue this how do you feel about expanding and formalizing the CherryPyStorePlugin API a little to include a bit more context than just list,get,set,delete. That way we can make the class plugable and support alternative implementations.

Absolutely. Irregardless to the first point ending up being workable point two opens up data storage to other providers. :+1: Added a card for this at https://trello.com/c/A7kTJqPS/250-multi-host-mgr-make-cherrypystorage-plugin-pluggable

mbarnes commented 8 years ago

Avoiding synchronization problems with (1) sounds like a win to me, though I don't know Kubernetes well enough to comment on the particulars. Agree with @ashcrow on (2): worth doing regardless.

petervo commented 8 years ago

Super, should I start I doc or a wiki page with some notes? Or is this a good place to discuss particulars?

ashcrow commented 8 years ago

@petervo Wiki or here is fine ... I'll leave the decision up to you. We've done one spike via the Wiki and it worked well as well as discussing via issues.

ashcrow commented 8 years ago

One thing that came to mind last night was that we do store more than just host/cluster data in etcd today. This doesn't negate the ability for points 1 and 2, but it's something to keep in mind as we design the updated CherryPyStorePlugin and implement the changes.

petervo commented 8 years ago

What specifically were you thinking about?

ashcrow commented 8 years ago

The main ones that come to mind:

Authentication data (when not using Kubernetes/OpenShift or local file for auth)
General configuration/Logging configuration (when stored external)
Operation State (though we can probably replace data updates with API calls to Kubernetes)
Global operations queue for HA and moving away from in server Operation Queue (Planned but not implemented)

I think they can all be addressed with a new design on the CherryPyStorePlugin moving away from the simplistic KV style to pluggable and more specific to the operations being done.

ashcrow commented 8 years ago

I'm going to create a template for proposals and then start fleshing out a CherryPyStoragePlugin document. @petervo please link the code you've been working with. I think that will help as we design this.

ashcrow commented 8 years ago

petervo commented 8 years ago

Thanks for putting this together. I threw away what i'd done before as it really wasn't the direction i wanted to go.

I guess the question for for me is if we want to keep the same basic structure and just be more explicit about what we are getting / setting with appropriately named methods/arguments. Or if we want to take are more ORMy type approach where we are handing off classes/instances for the store to populate or persist. I've been meaning to put together some some sample code for both approaches but haven't gotten around to it yet.

ashcrow commented 8 years ago

@petervo not a problem. Keep us posted via this issue and we will figure it all out together :smile:

ashcrow commented 8 years ago

cpd-126 is close to being finished. Feel free to start reviewing it early and commenting in this issue.

mbarnes commented 8 years ago

Overall looks pretty good to me so far.

The arguments to container_* vs internal_* methods seem overly tailored for their defaults. For example, what would I pass as the "podname" if I were using the EtcdStorePlugin for `container*methods? Could we combine the "pod_name" and "annotation" such that all the methods just take a single key string? (Or conversely, splitprimary_key` into separate path and base key (no slashes) arguments and then think of more generic argument names to apply to all the methods.)

petervo commented 8 years ago

I'd suggest not having to have the callers choose between internal and container, that can be left as an implementation detail.

What if instead the methods all take a kind argument from a list of constants. primary keys are then specific to that kind. Does that make sense? or should i write up a clearer wiki version of it?

petervo commented 8 years ago

@mbarnes comment overlapped with mine. I think we are saying the same thing. ei second split up the primary key into "Kind" and "key"

mbarnes commented 8 years ago

@petervo Can you give a couple examples of different "kinds"?

petervo commented 8 years ago

Well i was thinking Host and Cluster, mostly. But i see that could be difficult as in the etcd case you need the cluster name to know where to store the host.

Although possibly we can solve that by having the 'key' that gets passed be an instance that would have all the data you would need to save it. ei:

Cluster:
 - name

Host
 - cluster
 - name

Operation:
 - name
 - cluster

There are downsides of course, all stores have be to know about all key types.

The goal I'm trying to get at is to let the store plugin make all the decisions about where the data should go without the rest of the code having to consider it at all.

mbarnes commented 8 years ago

Yeah, I see what you're going for.

The instances you speak of to use as keys could perhaps be the model classes we already have in /src/commissaire/handlers/models.py. They're not quite where we need them to be; for one thing cluster models would need to learn their own cluster name. And we'd likely have to rework their __init__ method.

Haven't fully worked it out yet, but worth noting that the model classes have their own retrieve/save/delete wrapper methods which call the cherrypy.engine.pubish methods. So model instances could just pass themselves there.

petervo commented 8 years ago

Yes that could work, we'd need to make sure the current calls to store outside of any models are something we can switch to use a model.

ashcrow commented 8 years ago

If I am following right it sounds like it would make better sense to:

Collapse the proposal plugin methods down to save, get, delete, and list
Have the plugin methods be able to figure out which kind of data it's working with
Update models to be able to define what kind of each data it's working with

Does this sound right?

mbarnes commented 8 years ago

@ashcrow I think so, but let's first see if passing our existing models to the container_* methods is even viable before we collapse the plugin methods. We'll need to rework model initialization so they can also be used as unique keys with incomplete state.

The 2nd bullet could be just be checking for a specific model type like class InternalConfig(Model).

ashcrow commented 8 years ago

cpd-126 has been significantly updated based on the feedback here and further thoughts. Please take a look.

mbarnes commented 8 years ago

I think the extra level of indirection by way of StoreHandler makes sense.

It's less clear to me now how we specify where Commissaire's internal data (settings, etc.) should go. Or is it just always going to be etcd? (I'm okay with that.)

If the store_handler argument now has a "key=value" syntax, it seems to me we could just merge store_handler_kwargs into it and possibly drop the name field (if it's only for pairing the two).

@petervo and I were kicking around the idea of using model instances as a better abstraction than key_id strings. I'm still interested in trying it. That way each StoreHandler subclass could work out the appropriate etcd key path or Kubernetes annotation from the model data.

ashcrow commented 8 years ago

It's less clear to me now how we specify where Commissaire's internal data (settings, etc.) should go. Or is it just always going to be etcd? (I'm okay with that.)

The idea is that the StoreHandler would handle that by key_id. It would route via regular expressions. So if the key_id a setting of some kind it would route to whatever StoreHandler was registered with a matching regex.

I may not have made it as clear I was hoping in the CPD. The name value is how one would match the store_handler to its store_handler_kwargs. This would let you supply multiple store_handlers for different things. In theory you could mix and match and even spread out thing that in the previous iteration we called container and internal data across multiple handlers.

If the store_handler argument now has a "key=value" syntax, it seems to me we could just merge store_handler_kwargs into it and possibly drop the name field (if it's only for pairing the two).

True. Or maybe we can flatten it even more to have store_handler take all arguments directly. That way we don't need to have a match between the names.

@petervo and I were kicking around the idea...

I believe that would be doable with the above changes. I think it would be a model enhancement CPD to start utilizing the new system defined in cpd-126.

mbarnes commented 8 years ago

True. Or maybe we can flatten it even more to have store_handler take all arguments directly. That way we don't need to have a match between the names.

Yeah, I think that's what I meant. I was thinking of multiple store_handler arguments, each looking something like...

store-handler="class=commissaire.store.EtcdStorePlugin, ...other kwargs..."

If we use model instances instead of key ID strings, then the routing could just be based on the model's type instead of dealing with regular expressions. This seems cleaner to me since it avoids having to come up with a backend-agnostic key ID syntax. register_store_handler could then take a set of model types.

ashcrow commented 8 years ago

store-handler="class=commissaire.store.EtcdStorePlugin, ...other kwargs..."

:+1:

If we use model instances instead of key ID strings, then the routing could just be based on the model's type instead of dealing with regular expressions.

I like that! I'll update the CPD with that shortly at which point we should bug @petervo to take a gander :smile:.

ashcrow commented 8 years ago

Updated the cpd.

ashcrow commented 8 years ago

@mbarnes @petervo PTAL.

ashcrow commented 8 years ago

@mbarnes @petervo bump :smile:

petervo commented 8 years ago

Sry been without for a bit, this seems good to me.

ashcrow commented 8 years ago

No problem @petervo. Thanks for reviewing it!

ashcrow commented 8 years ago

FTR we are at Step 6 in the CPD process. Once we have 75%+ acceptance from the current active development team we will move this to accepted.

@ashcrow 👍 @petervo 👍 @mbarnes ♻️

Current status: 66%

mbarnes commented 8 years ago

Sorry for the delay. Couple little nits:

Perhaps change register_store_handler to take multiple model types for a handler.
- Prototype: register_store_handler ( store_handler, *model_types )
Under StoreHandler Methods / Proposed Usage the prototype needs updated.
For the Commissaire data that doesn't make sense to store in Kubernetes, we'll need a new model class for it but should that class be exposed to the plugin system or hard-coded internally to use EtcdStorePlugin?

No need to hold up acceptance over these nits though. I'm :+1:

ashcrow commented 8 years ago

Perhaps change register_store_handler to take multiple model types for a handler.

Sure, I'll update the CPD with that.

Under StoreHandler Methods / Proposed Usage the prototype needs updated.

Good catch. Will update.

For your last comment I think it should be part of the plugin system. While using etcd will be the most common use case IMO it will be nice to let folks use something else if they want to develop it.

I will update the CPD and then mark it accepted :fireworks:

ashcrow commented 8 years ago

@ashcrow :+1: @petervo :+1: @mbarnes :+1:

Current status: 100%

CPD is accepted.

mbarnes commented 8 years ago

@ashcrow I have a few tweaks to propose to the configuration parts of cpd-126, now that I'm diving into it. Posting here so it's public.

Minor, but I changed to choosing the backend by module name instead of class name like we do for authentication plugins, so it can be implemented similarly. So the config changes from class=commissaire.store.EtcdStoreHandler to name=commissaire.store.etcdstorehandler.
The comma-separated model list embedded in a comma-separated key=value list is difficult to handle, even with quoting. [1] (e.g. '... ,models="Host,Hosts,Cluster,Clusters", ...') I'd rather not write a custom parser since this stuff should really be in a JSON file anyway.

I see two options:
1. As much as I normally hate this, have the --register-store-handler option take a JSON string.
2. More radical but perhaps inevitable, let's abandon trying to configure Commissaire (at least the plugins) from a command-line and just mandate a commissaire.conf file to override the default localhost configuration.
I'm okay with either option in the short-term but I'd ultimately prefer (2).

[1] I already investigated Python's csv module, but it's of no help since the comma-separated value (as it sees it) is only partially quoted. (e.g. ,models="a,b,c", not ,"models=a,b,c",)

ashcrow commented 8 years ago

Minor, but I changed to choosing the backend by module name instead of class name like we do for authentication plugins, so it can be implemented similarly. So the config changes from class=commissaire.store.EtcdStoreHandler to name=commissaire.store.etcdstorehandler.

:+1:

As much as I normally hate this, have the --register-store-handler option take a JSON string.

I'd prefer this so that it is possible to use the CLI. But if it ends up adding a lot more work then we could go with ii as long as we document the heck out of it so it's obvious.

If you haven't already done so, please make updates to the cpd to keep it correct.

mbarnes commented 8 years ago

As much as I normally hate this, have the --register-store-handler option take a JSON string.

I'd prefer this so that it is possible to use the CLI. But if it ends up adding a lot more work then we could go with ii as long as we document the heck out of it so it's obvious.

Easy to implement, just yucky to use. But I'll go that way.

I'd like the auth-plugin CLI to be more consistent with the store handler CLI then. Tech debt?

ashcrow commented 8 years ago

I'd like the auth-plugin CLI to be more consistent with the store handler CLI then. Tech debt?

Found work. Throw a card in the backlog so we don't forget to update it :smile:

mbarnes commented 8 years ago

@ashcrow: Another tweak is I'm considering is to use glob-style wildcards (not regex's) in the models list. Because otherwise each and every model type would have to be listed in the config file, which is painful enough but then us adding a new type of model would break everyone's configuration.

So I'm thinking the idiom would be for the first defined handler to catch everything (models="*" -- and this would be the default if models is omitted), and then any subsequent handlers can override the first registration for specific model types (e.g. models="Host,Hosts").

Sound okay?

ashcrow commented 8 years ago

That works for me.

projectatomic / commissaire-mvp

Kubernetes storage engine #126