flant / addon-operator

A system to manage additional components for Kubernetes cluster in a simple, consistent and automated way.
https://flant.github.io/addon-operator/
Apache License 2.0
483 stars 27 forks source link

Regression: From version 1.3.6 inclusive, it is not possible to patch the global key during an Modified event #474

Closed piotrminkina closed 3 months ago

piotrminkina commented 5 months ago

Expected behavior (what you expected to happen): Patching the global key during the Modified event should be possible, as it was up to version 1.3.4 inclusive.

Actual behavior (what actually happened): Since version 1.3.6 inclusive, an attempt to update the global key ends with an error as below, even though I am not tampering with the key that is the subject of the error. The versions I have tested and have this error: 1.3.12, 1.3.8, 1.3.7, 1.3.6. The versions I have tested without error: 1.3.4, 1.3.3, 1.3.2.

[...] msg="Global hook failed, requeue task to retry after delay. [...]. Error: cannot apply values patch for global values: 2 errors occurred:\n\t* global.enabledModules is a forbidden property\n\t[...]" binding=kubernetes binding.name=monitor-workloads-deployments [...] hook.type=global queue=main [...] watchEvent=Modified

The patch I dump to the file indicated in the VALUES_JSON_PATCH_PATH variable:

[{'op': 'add', 'path': '/global/apps/slo/some-namespace/some-workload', 'value': {}}]

Steps to reproduce:

  1. Implement below hook script as a global hook.
  2. In ConfigMap, prepare global values that will create a nested structure /global/apps/slo/some-namespace/some-workload and apply it.
  3. Create a deployment and apply it.
  4. Force a change in the definition of deployment.

Environment:

Anything else we should know?:

Additional information for debugging (if necessary):

I simplified the script to the minimum necessary.

Hook script ```python #!/usr/bin/env python3 import json import os import sys from benedict import benedict # language=yaml EVENTS_BINDING_CONFIG = r''' --- configVersion: v1 kubernetes: - name: monitor-workloads-deployments apiVersion: apps/v1 kind: Deployment keepFullObjectsInMemory: false jqFilter: | { "name": .metadata.name, "namespace": .metadata.namespace } ''' BINDING_CONTEXT_PATH = os.environ.get('BINDING_CONTEXT_PATH') VALUES_PATH = os.environ.get('VALUES_PATH') VALUES_JSON_PATCH_PATH = os.environ.get('VALUES_JSON_PATCH_PATH') values = {} values_patch_ops = [] def handle_object_update(item_name, object_data): data_name = object_data['name'] data_namespace = object_data['namespace'] if item_name.startswith('monitor-workloads-'): k8s_workload_name = data_name k8s_namespace_name = data_namespace values_patch_ops.append({ 'op': 'add', 'path': f'/global/apps/slo/{k8s_namespace_name}/' f'{k8s_workload_name}', 'value': {} }) def handle_event(): with open(BINDING_CONTEXT_PATH, 'r') as binding_context_file: binding_context = json.load(binding_context_file) for context_item in binding_context: item_name = context_item['binding'] item_type = context_item['type'] if item_type == 'Event': item_event = context_item['watchEvent'] object_data = context_item['filterResult'] if item_event == 'Modified': handle_object_update(item_name, object_data) with open(VALUES_JSON_PATCH_PATH, 'w') as values_json_patch_file: print(values_patch_ops) json.dump(values_patch_ops, values_json_patch_file) if __name__ == '__main__': if len(sys.argv) > 1 and sys.argv[1] == '--config': print(EVENTS_BINDING_CONFIG) exit(0) values = benedict(VALUES_PATH, format='json') handle_event() ```
Logs
time="2024-04-24T17:13:22Z" level=info msg="GlobalHookRun task for 'kubernetes/monitor-workloads-deployments' binding, trigger is Kubernetes" binding=kubernetes binding.name=monitor-workloads-deployments event.id=dbe85418-c2a8-48b4-ac5f-c46e8347656b hook=sync-global-values-hook.py hook.type=global queue=main task.flow=start task.id=11b950ad-efc7-4847-b20a-a9dae4ab4930 watchEvent=Modified
time="2024-04-24T17:13:22Z" level=info msg="[{'op': 'add', 'path': '/global/apps/slo/some-namespace/some-workload', 'value': {}}]" binding=kubernetes binding.name=monitor-workloads-deployments event.id=dbe85418-c2a8-48b4-ac5f-c46e8347656b hook=sync-global-values-hook.py hook.type=global output=stdout queue=main task.id=11b950ad-efc7-4847-b20a-a9dae4ab4930 watchEvent=Modified
time="2024-04-24T17:13:22Z" level=error msg="Global hook failed, requeue task to retry after delay. Failed count is 363. Error: cannot apply values patch for global values: 2 errors occurred:\n\t* global.enabledModules is a forbidden property\n\t* global.apps.slo.some-namespace.some-workload is required\n\n" binding=kubernetes binding.name=monitor-workloads-deployments event.id=dbe85418-c2a8-48b4-ac5f-c46e8347656b hook=sync-global-values-hook.py hook.type=global queue=main task.id=11b950ad-efc7-4847-b20a-a9dae4ab4930 watchEvent=Modified
miklezzzz commented 3 months ago

@piotrminkina Hello! Thanks for such comprehensive information! May I ask to check if any openapi schema is applied to the global values in your case? According to the documentation, you can find it in the $GLOBAL_HOOKS_DIR/openapi directory

piotrminkina commented 3 months ago

Yes, I have the following YAMLs:

# config-values.yaml
type: object
additionalProperties: false
required:
  - dataMaxLength
  - operatorReleaseName
  - apps
properties:
  dataMaxLength:
    type: integer
    minimum: 65536
  operatorReleaseName:
    type: string
  apps:
    type: object
    additionalProperties: false
  [...]
# values.yaml
---
x-extend:
  schema: config-values.yaml
type: object
additionalProperties: false
miklezzzz commented 3 months ago

could you possibly update your values.yaml by adding an additional enabledModules property so that it would look like:

# values.yaml
---
type: object
default: {}
additionalProperties: false
properties:
  enabledModules:
    type: array
    items:
      type: string

and give it a try? It seems we unwittingly enforced sort of a requirement for customers' openapi schemas to contain this enabledModules key by having this global values patch applied in the operator's logic.

piotrminkina commented 3 months ago

@miklezzzz It seems that after this change, object updates are implemented correctly. Thanks!

What do you plan to do about it next? Some kind of update in the documentation or what?

miklezzzz commented 3 months ago

@piotrminkina We'll try to make it seamless for users by adding missing properties to the final values schema.

miklezzzz commented 3 months ago

Fixed in v1.13.3