wave-k8s / wave

Kubernetes configuration tracking controller
Apache License 2.0
681 stars 81 forks source link

Handle env section in deployment yaml #34

Closed andyedwardsibm closed 5 years ago

andyedwardsibm commented 5 years ago

Changes made:

Testing done:

What is not in this PR:

andyedwardsibm commented 5 years ago

I've made the specific code changes suggested and committed them. For the discussion on the overall architecture of the change, I've tried to respond with my thoughts to each point below.

It's worth saying that for this PR, I tried to maintain the existing architecture that discovery and collection of ConfigMaps/Secrets happens at the point at which we scan the deployment (we "get" the whole ConfigMap/Secret and keep it in memory), and processing that collected data is done separately at a later stage (i.e. where the code calculates the hash value). This is why I record keys (and whether they're optional) in a new type, and then change the hashing code to handle the extra metadata. The fact that both children.go and hash.go have changed is due to trying to maintain that behaviour, and not necessarily because of handling the optional field.

My overriding thought when reading this is that the work done here on optional fields doesn't add any value and rather over-complicates the code. If a field is optional, we report an error when it doesn't exist, but actually, do we care? If the field isn't there, and we just set it to an empty string, the hash would change once the field exists again, therefore having the desired effect? Do you agree or have I missed something?

It's worth saying that I don't think the code reports an error when an optional field does not exist, in all cases. I think it only reports an error when an optional field does not exist because the entire ConfigMap/Secret does not exist.

The benefit I see to knowing if fields are optional is that we wouldn't throw an error if a configmap/secret doesn't exist but all the fields we want from within it are optional, which we are not currently handling (and I'm not suggesting we do within this PR, I'll create an issue to follow up on that)

I suspect that might be trivial to add; Just make getConfigMapWithKey()/getSecretWithKey() return an empty getResult when the object does not exist and the requested key is optional. I think getCurrentChildren() would then just ignore the empty getResult object. Any required keys would still result in an error.

If we remove the requirement to know if fields are optional or not, this whole PR could be massively simplified. getChildNamesByType could be rewritten to return two map[string]map[string]struct{} where the first map is the name of the configmap/secret and the second is the the fields from within the configmap/secret that are being requested.

That is close to what I originally had before adding the handling of optional values (with the caveat that the hashing code was also changed to handle both whole objects and specified keys within objects).

Then this set of fields can be passed into getObjectMap[WithKeys]? and the filtering of the fields could be done there instead, returning the object with a possible subset of the expected data fields. (this also means we would only perform a Get on each object once).

Then no changes are required to be made to hashing algorithm and I think the extra types you've added would no longer be needed either.

If the hash code is going to maintain a simple interface and only accept an array of one type of thing, I think that hashing code has to change either way: the current code expects to find an Object, so to maintain that, at the point I'm scanning the Deployment and pulling in the ConfigMaps/Secrets, for anything that's specifying an individual key (optional or not), would I have to create a fake Object so I can pass it in to the hashing code? Alternatively, we could process the ConfigMaps/Secrets when discovering them and stringify them all then, but would we then change the hashing code to accept the stringified data rather than an array of Objects/configObjects?

I'm thinking that either the hashing code changes to handle key-specific entries, or the ConfigMaps/Secrets discovery code changes to create "spoof" Objects. I chose the first as it seemed more explicit on what it was doing.


Coming back to this after a couple of days, I suspect/agree that you're right that the optional handling can be cleaner. I could, at the point of discovering ConfigMaps/Secrets (i.e. in getChildNamesByType()) also inspect the ConfigMap/Secret. If the field is missing, I decide whether to error or continue based on whether that field is optional. For required fields, I can error (as I think I should - you required a specific field and it isn't there), and for optional fields I silently do nothing (as we surmise we should - it's missing for whatever reason, but it's optional).

That then still gives us an array of configObjects to handle, where that list has been pruned such that there are no duplicates at the ConfigMap/Secret scope (if it's there it's either for the whole object or for a list of keys in that object, but never both).

The hash code would still need to change to accommodate specific keys.

Does my proposed simplification make sense? I think it's the same as yours but with the hash code changing slightly (is it a key, get the key and stringify it, if not stringify the whole thing).

andyedwardsibm commented 5 years ago

I've thought about this a bit more over night and spotted a problem with the proposal to simplify, and to only record an optional field when it is present. Consider the following case:

When I call updateOwnerReferences() I must include the referenced ConfigMap/Secret so that I can attach as an owner and trigger recalculating the hash should the key then be added.

Given that updateOwnerReferences() must be passed a list of all ConfigMaps/Secrets that exist and that I am referencing, irrespective of whether specific fields exist (because they may exist in the future), I cannot filter out an entry when I encounter an optional and missing field.

I think that then goes back to your original idea that: if we record the data at the point of scanning, and for missing optional fields we set the data to an empty string then we can handle that case. But I still think that route involves changing the hash function, and either modifying or build-a-fake-ing an Object (and I'm not sure I'd understand what updateOwnerReference() does with that "fake" Object well enough to not get it wrong).

JoelSpeed commented 5 years ago

I keep re-reading this and thinking about the best approach and I think I'm coming around to where we have got with this already, or at least some middle ground between our two initial suggestions.

Do you think this would work:

  1. If getChildNamesByType returned two map[string]configObject but where configObject was:
    type configObject {
    required bool
    allKeys bool
    keys map[string]struct{}
    }

When parsing the deployment, getChildNamesByType could populate the keys map for any single fields referenced, if no subkeys are referenced it would mark allKeys true and if there is a non-optional field it would mark required true.

  1. getConfigMap/getSecret could then use the required field to determine whether or not to throw an error if the referenced ConfigMap or Secret does not exist.

  2. calculateConfigHash could then use the allKeys field to determine whether to filter the keys it adds to its hashSource and would just set any key referenced which doesn't exist to "" (since changing the value will trigger a rollout anyway, this should be fine). Importantly I don't think calculaterConfigHash needs to know about optional fields, but we do need to know whether to ignore a ConfigMap or Secret if it only contains optional fields which should be sorted by 2.

In this idea, we solve the problem of optional config objects that don't exist; all config objects that exist and are referenced (optionally or not) will get owner refs and the owner ref code won't be changed.

For required fields, I can error (as I think I should - you required a specific field and it isn't there), and for optional fields I silently do nothing (as we surmise we should - it's missing for whatever reason, but it's optional).

I suggest we don't throw an error if required fields don't exist, in fact, my suggestion doesn't track which fields are optional or not at all. The reason for this being that, if a field is required or not does not affect us calculating a hash, if the field becomes present at a later time then the hash will change and Wave's responsibility is still fulfilled. It is not Wave's responsibility to track whether fields are optional or not, it is the deployment controller's. Equally, if a field is non-optional, and doesn't exist, then the deployment will go into a bad state anyway, so whether Wave is adding a hash or not doesn't particularly matter.

I guess to summarise my point here is, required fields not being present won't prevent the hash changing when the configmaps or secrets change, so why does Wave care?

andyedwardsibm commented 5 years ago

I suggest we don't throw an error if required fields don't exist, in fact, my suggestion doesn't track which fields are optional or not at all. ... It is not Wave's responsibility to track whether fields are optional or not, it is the deployment controller's.

Yes, this started to dawn on me halfway through reading point 3 :)

From an initial read through, I think your plan above works. The one thing I'm not clear of is the function interfaces at all points in the chain, as there is a description above for the interface for getChildNamesByType() but not for getCurrentChildren().

I think that I still need the list returned by getCurrentChildren() to contain the ConfigMap/Secret under k8sObject for each entry, so that updateOwnerReferences() / updateOwnerReference() can just update the owner and push it back at k8s core, and so that the hashing can just use the data we've already collected.

I was trying to think how to have both getCurrentChildren() and getChildNamesByType() use configObject, like maybe one builds the list from the deployment, and the other fills in the k8sObject but rapidly started worrying about concurrent map access.

Would something like this be okay?

In types.go:

// This is the main interface for what comes out of children.go, 
// such that owner_references.go and hash.go act on a list of these
type configObject struct {
  k8sObject Object
  required bool
  allKeys bool
  keys map[string]struct{}
}

In children.go:

// Interface for results from getConfigMap()/getSecret()
type getResult struct {
  err error
  obj Object
  required bool
  allKeys bool
  keys map[string]struct{}
}

// Interface for results from getChildNamesByType()
type configMetadata struct {
  required bool
  allKeys bool
  keys map[string]struct{}
}

func (h *Handler) getCurrentChildren(obj *appsv1.Deployment) ([]configObject, error) {
  // getChildNamesByType() to get two maps

  // loop through each map, calling getConfigMap()/getSecret() on a thread

  // Wait for each "get" call and add result to a list of configObject

  // return list of configObject
}

func getChildNamesByType(obj *appsv1.Deployment) (map[string]configMetadata, map[string]configMetadata) {
  // Walk through deployment creating two consolidated maps (keyed on ConfigMap/Secret name)
  // Each map lists the metadata for that ConfigMap/Secret in an instance of configMetadata.
}

// There is only one of these, so no "WithKeys" version any more.
func (h *Handler) getConfigMap(namespace, name string, required allKeys bool, keys map[string]struct{}) getResult {
  ...
}

// There is only one of these, so no "WithKeys" version any more.
func (h *Handler) getSecret(namespace, name string, required allKeys bool, keys map[string]struct{}) getResult {
  ...
}

owner_references.go stays as is in this PR.

hash.go still has the interface func calculateConfigHash(children []configObject) (string, error) but is a bit simpler, and does not return errors when handling optional fields.

I'd like to give this a day in the back of my brain, in case I spot anything else, but if not then are you okay if I start coding to the interfaces in this comment?

JoelSpeed commented 5 years ago

Yep, the above looks pretty good to me, just a couple of comments on it:


I think that I still need the list returned by getCurrentChildren() to contain the ConfigMap/Secret under k8sObject for each entry, so that updateOwnerReferences() / updateOwnerReference() can just update the owner and push it back at k8s core, and so that the hashing can just use the data we've already collected.

Yep agreed, the passing around will be similar to how it is currently implemented.


Any reason not to pass the configMetadata directly into getConfigMap and getSecret?

 // There is only one of these, so no "WithKeys" version any more.
func (h *Handler) getConfigMap(namespace, name string, meta configMetadata) getResult {
  ...
}

I really dislike the k8sObject field inside configObject, can we just call it object instead? (I've seen this done in other Kubernetes objects and interfaces like the unstructured.Unstructured for instance)


Thanks for working with me on this design btw, I appreciate your efforts

andyedwardsibm commented 5 years ago

No worries; if the result is better code then I'm all for it!

Rolling in your latest comment, I'll go with this:

In types.go:

type configObject struct {
  object Object
  required bool
  allKeys bool
  keys map[string]struct{}
}

In children.go:

type getResult struct {
  err error
  obj Object
  required bool
  allKeys bool
  keys map[string]struct{}
}

// Interface for results from getChildNamesByType()
type configMetadata struct {
  required bool
  allKeys bool
  keys map[string]struct{}
}

func (h *Handler) getCurrentChildren(obj *appsv1.Deployment) ([]configObject, error) {
  // getChildNamesByType() to get two maps
  // loop through each map, calling getConfigMap()/getSecret() on a thread
  // Wait for each "get" call and add result to a list of configObject
  // return list of configObject
}

func getChildNamesByType(obj *appsv1.Deployment) (map[string]configMetadata, map[string]configMetadata) {
  // Walk through deployment creating two consolidated maps (keyed on ConfigMap/Secret name)
  // Each map lists the metadata for that ConfigMap/Secret in an instance of configMetadata.
}

func (h *Handler) getConfigMap(namespace, name string, metadata configMetadata) getResult {
  ...
}

func (h *Handler) getSecret(namespace, name string, metadata configMetadata) getResult {
  ...
}

owner_references.go stays as is in this PR.

hash.go still has the interface func calculateConfigHash(children []configObject) (string, error) but is a bit simpler, and does not return errors when handling optional fields.

JoelSpeed commented 5 years ago

Sounds great, thanks 👍

andyedwardsibm commented 5 years ago

All done, I think.

I had to use pointers to configMetadata in the maps that getChildNamesByType() returns as I needed to update the required field as I scanned the deployment, and golang doesn't allow editing the contents of a struct in a map value.

I've built the image locally and run it through a few scenarios as well, testing in minikube, and it seems to work. I've not done any load testing though.

andyedwardsibm commented 5 years ago

Code updated in line with review. I've done my usual manual testing in minikube and the one case that doesn't work is where a field in a ConfigMap is optional, and the whole map does not exist (so we cannot attach an owner). If I then create that ConfigMap, Wave does not trigger calculation of the hash, since the new ConfigMap has no owner. This feels like an edge case, and I'm not sure how to "fix" it without a complete rearchitecture of how Wave works...