mayadata-io / d-operators

Declarative patterns to write kubernetes controllers
Apache License 2.0

generate ChaosEngine from a deployment under chaos #25

Open ksatchit opened 4 years ago

ksatchit commented 4 years ago

Requirement:

Considerations:

ksatchit commented 4 years ago

TODO:

AmitKumarDas commented 4 years ago

@ksatchit can the ChaosEngine and schedule resources be specified here? The YAML versions are sufficient.

ksatchit commented 4 years ago

A typical ChaosEngine today looks like the following. Based on actual usage info, the user generally changes only the .spec.appinfo section, while the rest is kept at the recommended values. More info is provided here: https://docs.litmuschaos.io/docs/chaosengine/ (multiple optional fields exist).

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: nginx-chaos
  namespace: default
spec:
  appinfo:
    appns: 'default'
    applabel: 'app=nginx'
    appkind: 'deployment'
  annotationCheck: 'true'
  engineState: 'active'
  chaosServiceAccount: pod-delete-sa
  jobCleanUpPolicy: 'delete'
  experiments:
    - name: pod-delete
      spec:
        components:
          env:
            # set chaos duration (in sec) as desired
            - name: TOTAL_CHAOS_DURATION
              value: '30'

            # set chaos interval (in sec) as desired
            - name: CHAOS_INTERVAL
              value: '10'

            # pod failures without '--force' & default terminationGracePeriodSeconds
            - name: FORCE
              value: 'false'

The schedule is closed source today (and is essentially chaosengine++), but it is planned to be converted to a separate CR to hold that data, i.e., schedule. We can take a shot at it here (i.e., in a separate issue).

AmitKumarDas commented 4 years ago

A deployment is expected to have the following annotations:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deploy
  namespace: my-ns
  annotations:
    # annotation values must be strings, hence the quotes
    chaos.litmus.io/enabled: 'true'
    engine-generate.chaos.litmus.io/enabled: 'true'
    pod-delete.experiment.chaos.litmus.io/enabled: 'true'
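
The opt-in check implied by these annotations could be sketched as follows. This is only an illustration, not code from d-operators: the function names are hypothetical, and two things are assumptions rather than anything stated in the thread, namely that the value 'true' means enabled and that the experiment name is the prefix before `.experiment.chaos.litmus.io/enabled`.

```python
# Hypothetical helpers; annotation keys are copied from the sample above.
CHAOS_ENABLED = "chaos.litmus.io/enabled"
ENGINE_GENERATE_ENABLED = "engine-generate.chaos.litmus.io/enabled"
EXPERIMENT_SUFFIX = ".experiment.chaos.litmus.io/enabled"


def _annotations(deployment):
    """Return the annotations of a deployment dict, or {} if none are set."""
    return (deployment.get("metadata") or {}).get("annotations") or {}


def _is_true(annotations, key):
    """Assumption: an annotation counts as enabled when its value is 'true'."""
    return str(annotations.get(key, "")).lower() == "true"


def should_generate_engine(deployment):
    """True when both the chaos and the engine-generate annotations are enabled."""
    ann = _annotations(deployment)
    return _is_true(ann, CHAOS_ENABLED) and _is_true(ann, ENGINE_GENERATE_ENABLED)


def enabled_experiments(deployment):
    """Sorted experiment names taken from '<name>.experiment.chaos.litmus.io/enabled'."""
    ann = _annotations(deployment)
    return sorted(
        key[: -len(EXPERIMENT_SUFFIX)]
        for key in ann
        if key.endswith(EXPERIMENT_SUFFIX) and _is_true(ann, key)
    )
```

For the deployment above, `should_generate_engine` would be true and `enabled_experiments` would list only `pod-delete`.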

This is a sample metac.yaml config

apiVersion: metac.openebs.io/v1alpha1
kind: GenericController
metadata:
  name: chaosengine-generator-for-deployment
  namespace: doperator
spec:
  watch:
    apiVersion: apps/v1
    resource: deployments
  attachments:
  - apiVersion: litmuschaos.io/v1alpha1
    resource: chaosengines
    advancedSelector:
      selectorTerms:
      - matchReferenceExpressions:
        - key: metadata.annotations.generator\.chaosengine\.litmus\.io/uid
          refKey: metadata.uid

This is a sample desired ChaosEngine

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: <can-this-be-deployment-name?>
  namespace: <can-this-be-deployment-namespace>
  annotations:
    generator.chaosengine.litmus.io/uid: <deployment-under-test-uid>
spec:
  appinfo:
    appns: <will be derived from deployment>
    applabel: <will be derived from deployment's images e.g. 'app=nginx'>
    appkind: deployment
  annotationCheck: 'true'
  engineState: 'active'
  chaosServiceAccount: <how-to-derive? e.g. pod-delete-sa>
  jobCleanUpPolicy: 'delete'
  experiments:
    - name: pod-delete
      spec:
        components:
          env: # <how-to-set? e.g. should it refer to config CR to set below>
            # set chaos duration (in sec) as desired
            - name: TOTAL_CHAOS_DURATION
              value: '30'
            # set chaos interval (in sec) as desired
            - name: CHAOS_INTERVAL
              value: '10'
            # pod failures without '--force' & default terminationGracePeriodSeconds
            - name: FORCE
              value: 'false'
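
As a rough sketch of how the desired ChaosEngine above might be derived from a watched deployment, the function below fills in the placeholders. It answers the open questions in the template with assumptions: the engine reuses the deployment's name and namespace, `applabel` is the first (sorted) entry of `.spec.selector.matchLabels`, and the service account and env values default to the ones in the sample. The function name and its parameters are hypothetical.

```python
def desired_chaos_engine(deployment, service_account="pod-delete-sa",
                         experiment="pod-delete", env=None):
    """Build a ChaosEngine dict for the given deployment dict (sketch only)."""
    meta = deployment["metadata"]
    # Assumption: use the first selector label (sorted by key) as applabel.
    match_labels = (deployment.get("spec") or {}).get("selector", {}).get("matchLabels") or {}
    applabel = ""
    if match_labels:
        key = sorted(match_labels)[0]
        applabel = f"{key}={match_labels[key]}"
    # Defaults copied from the sample ChaosEngine in this thread.
    if env is None:
        env = [
            {"name": "TOTAL_CHAOS_DURATION", "value": "30"},
            {"name": "CHAOS_INTERVAL", "value": "10"},
            {"name": "FORCE", "value": "false"},
        ]
    return {
        "apiVersion": "litmuschaos.io/v1alpha1",
        "kind": "ChaosEngine",
        "metadata": {
            # Assumption: engine shares the deployment's name and namespace.
            "name": meta["name"],
            "namespace": meta["namespace"],
            "annotations": {
                # Matches the refKey (metadata.uid) used by the advancedSelector.
                "generator.chaosengine.litmus.io/uid": meta["uid"],
            },
        },
        "spec": {
            "appinfo": {
                "appns": meta["namespace"],
                "applabel": applabel,
                "appkind": "deployment",
            },
            "annotationCheck": "true",
            "engineState": "active",
            "chaosServiceAccount": service_account,
            "jobCleanUpPolicy": "delete",
            "experiments": [
                {"name": experiment, "spec": {"components": {"env": env}}},
            ],
        },
    }
```

Passing `service_account` and `env` as parameters leaves room for the values to come from a config CR later, which the thread leaves undecided.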