RamenDR / ocm-ramen-samples

OCM Stateful application samples, including Ramen resources
Apache License 2.0

Support for automated testing for k8s and odr #43

Closed nirs closed 8 months ago

nirs commented 8 months ago

Currently we have one application (busybox) and kustomizations to deploy it manually on regional-dr or metro-dr using rbd or cephfs storage.

For automated testing we need a subscription kustomization for each application variant (rdr-rbd, rdr-cephfs, mdr-rbd).

The layout should make it easy to add more applications (e.g. busybox statefulset, busybox daemonset, kubevirt vms with pvc, data-volume, or data-volume-template).

How it should work

Suggested layout

busybox-deployment/
    odr-rdr-rbd/
        kustomization.yaml
    odr-rdr-cephfs/
        kustomization.yaml
    odr-mdr-rbd/
        kustomization.yaml
    k8s-rdr-rbd/
        kustomization.yaml
    deployment.yaml
    kustomization.yaml
    pvc.yaml
subscription/
    odr/
        busybox-deployment-rdr-rbd/
            kustomization.yaml
        busybox-deployment-rdr-cephfs/
            kustomization.yaml
        busybox-deployment-mdr-rbd/
            kustomization.yaml
    k8s/
        busybox-deployment-rdr-rbd/
            kustomization.yaml
    binding.yaml
    channel.yaml
    kustomization.yaml
    namespace.yaml
    placement.yaml
    subscription.yaml
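
For example, each variant directory would hold only a small kustomization that pulls in the shared base and patches the storage details. A minimal sketch for the rdr-rbd variant, assuming the base pvc.yaml defines the claim (the storage class name is illustrative):

    # busybox-deployment/odr-rdr-rbd/kustomization.yaml (sketch)
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    resources:
      - ../                  # shared deployment.yaml and pvc.yaml
    patches:
      - target:
          kind: PersistentVolumeClaim
        patch: |-
          - op: replace
            path: /spec/storageClassName
            value: ocs-storagecluster-ceph-rbd   # illustrative rbd class name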

Usage in OpenShift console

When selecting the application path, use one of:

busybox-deployment/odr-rdr-rbd
busybox-deployment/odr-rdr-cephfs
busybox-deployment/odr-mdr-rbd

Usage in automated tests

When deploying a subscription in automated tests use one of these:

Subscriptions for odr tests:

subscription/odr/busybox-deployment-rdr-rbd
subscription/odr/busybox-deployment-rdr-cephfs
subscription/odr/busybox-deployment-mdr-rbd

Subscriptions for k8s tests:

subscription/k8s/busybox-deployment-rdr-rbd
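
For example, a test could deploy and remove one of these directly with kubectl against the hub (the kubeconfig context name here is just an assumption):

    kubectl apply -k subscription/odr/busybox-deployment-rdr-rbd --context hub
    kubectl delete -k subscription/odr/busybox-deployment-rdr-rbd --context hub
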
raghavendra-talur commented 8 months ago

The proposal looks good to me. This will make adding applications to the samples repository very easy. Thanks!

ShyamsundarR commented 8 months ago

I broadly agree with the scheme, except for actually creating directories for every combination of subscription and workload, as that would leave a lot of directories and hurt repository readability IMHO (more below).

For the actual workloads themselves the scheme above is fine; I have just made it more fine grained, as below:

│   └── workloads
│       └── busybox-deployment
│           ├── base
│           │   ├── busybox-deployment.yaml
│           │   ├── busybox-pvc.yaml
│           │   └── kustomization.yaml
│           └── odf-regional-rwo
│               └── kustomization.yaml

In the above, odf-regional-rwo is for Ceph-RBD, and we can have two or three more directories for the variations (rwx and metro). These are for use with the ACM console as is; the others would be odf-regional-rwx, odf-metro-rwo, and odf-metro-rwx.

Further, I suggest we do not provide any more ACM-console-ready workload directories for other types, to reduce clutter. Instead we can kustomize the workloads as deployed by Subscriptions or ApplicationSets, as below.

Given a workload, we want to potentially kustomize the following for the workload:

  • PVCs StorageClass name
  • PVCs AccessMode
  • Workload namespace
  • Common workload label
  • Workload resources suffix

All of the above for Subscriptions [1] and ApplicationSets [2] can be achieved using the workload kustomization specification in these resources.
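
As a hedged sketch of what [2] enables (Subscriptions have their own override mechanism described in [1]), an Argo CD Application source (an ApplicationSet template carries the same source spec) could set these without extra overlay directories; every concrete value below (repo path, namespace, label, storage class) is illustrative:

    # Sketch only: kustomize overrides in an Argo CD Application source, per [2].
    # Supported fields depend on the Argo CD version; values are examples.
    apiVersion: argoproj.io/v1alpha1
    kind: Application
    metadata:
      name: busybox-sample
      namespace: argocd
    spec:
      project: default
      destination:
        server: https://kubernetes.default.svc
        namespace: busybox-sample
      source:
        repoURL: https://github.com/RamenDR/ocm-ramen-samples.git
        targetRevision: main
        path: workloads/busybox-deployment/base
        kustomize:
          namespace: busybox-sample        # workload namespace
          nameSuffix: -sample              # workload resources suffix
          commonLabels:
            appname: busybox               # common workload label
          patches:                         # PVC StorageClass / AccessMode
            - target:
                kind: PersistentVolumeClaim
              patch: |-
                - op: replace
                  path: /spec/storageClassName
                  value: rook-ceph-block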

Based on the above, other workload form factors can be kustomized when added from the console; IOW, the Subscription or the ApplicationSet YAML can be edited according to the environment. This reduces clutter in the repository, as otherwise these 4 directories would keep repeating themselves for every workload.

Further, as we move forward with clusters that consume ODF-created storage instances from another cluster, the StorageClass names would change and be non-specific, hence providing these values from the Subscription would be more usable than hard-coding them in the repository.

For the Subscriptions themselves the structure laid out is fine (with the change to add a base):

│   ├── subscriptions
│   │   ├── base
│   │   │   ├── binding.yaml
│   │   │   ├── kustomization.yaml
│   │   │   ├── subscription.yaml
│   │   │   └── placement.yaml
│   │   └── busybox-deployment
│   │       └── kustomization.yaml
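
A minimal sketch of such a per-workload overlay, assuming the base holds the generic hub resources; the namespace, prefix, and label are illustrative:

    # subscriptions/busybox-deployment/kustomization.yaml (sketch)
    apiVersion: kustomize.config.k8s.io/v1beta1
    kind: Kustomization
    resources:
      - ../base
    namespace: busybox-sample          # workload namespace on the hub
    namePrefix: busybox-deployment-    # keeps per-workload hub resources unique
    commonLabels:
      appname: busybox                 # common workload label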

Again here I suggest we do not provide overlays for every combination that includes partial hard-coded paths, and instead provide a base kustomization, in subscriptions/busybox-workloads for example, that contains rules to kustomize the resources deployed to the hub, and to kustomize the workload resources as above.

For hub resources we would want:

  • Workload namespace
  • Common workload label
  • Workload resources suffix

Now, using this from automated tools could be as follows:

  • e2e or basic-test
    • Instead of creating config files for all combinations, let the tools provide options to choose:
      • --workload
        • The values for these are already known (i.e. workloads are all the workloads in the samples, and so on)
      • --storageclassname --PVCModes
    • Assume or default the namespaces, labels and suffix, or provide options
    • IOW, let there be known configs in code rather than as files
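
For illustration only, such an invocation might look like this (the tool name and flag spellings come from the list above; everything else is assumed):

    # Hypothetical invocation; defaults and values are illustrative only
    basic-test --workload busybox-deployment \
        --storageclassname rook-ceph-block --PVCModes ReadWriteOnce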

The DRPC itself needs:

  • Policy: make this a user input
  • PVC label selector: Can be formed based on earlier inputs from labels

The DRPC being a part of the repository is useful, as it serves as an example to keep the hub resources declarative as well.

I think we should discuss this a little more and close on it.

[1] Subscription workload kustomization: https://github.com/open-cluster-management-io/multicloud-operators-subscription/blob/main/docs/gitrepo_subscription.md#kustomize

[2] ApplicationSets workload kustomization: https://argo-cd.readthedocs.io/en/stable/user-guide/kustomize/

nirs commented 8 months ago

I broadly agree with the scheme, except for actually creating directories for every combination of subscription and workload, as that would leave a lot of directories and hurt repository readability IMHO (more below).

But this is the goal of this work - making it easy to test.

Enabling DR for an application is different: the drpc requires too much customization to adapt to the application, subscription or applicationset, cluster names, etc. I don't plan to provide a working drpc for every sample. This is best done by a tool, and it is currently implemented in the drenv.test module: https://github.com/nirs/ramen/blob/7aae6e1d7af362efbc6a1f28a6340444b7594d6a/test/drenv/test.py#L167

We need to agree on this goal - if you want to keep this repository clean, then this is not the right place to keep the testing resources, and we need another repo.

For the actual workloads themselves the scheme above is fine; I have just made it more fine grained, as below:

│   └── workloads
│       └── busybox-deployment
│           ├── base
│           │   ├── busybox-deployment.yaml
│           │   ├── busybox-pvc.yaml
│           │   └── kustomization.yaml
│           └── odf-regional-rwo
│               └── kustomization.yaml

Looks nicer this way.

In the above, odf-regional-rwo is for Ceph-RBD, and we can have two or three more directories for the variations (rwx and metro). These are for use with the ACM console as is; the others would be odf-regional-rwx, odf-metro-rwo, and odf-metro-rwx.

Why use the pvc access mode instead of the storage class name? This makes it harder to use for testing. We know that we have rbd and cephfs on ocp, and rbd on drenv. It is easy to pick the right configuration when you want to run a test. With the access mode, I don't know which variant can be used on which cluster.

Maybe this is again something that is better for the samples use case and not for managing a set of testing configurations?

Further, I suggest we do not provide any more ACM-console-ready workload directories for other types, to reduce clutter. Instead we can kustomize the workloads as deployed by Subscriptions or ApplicationSets, as below.

But this means we don't have a way to test deployment without OCM. I think this is the wrong trade-off, optimizing for a cleaner repository instead of for ease of use for developers.

Given a workload, we want to potentially kustomize the following for the workload:

  • PVCs StorageClass name
  • PVCs AccessMode
  • Workload namespace
  • Common workload label
  • Workload resources suffix

Also the pvc selector (and later the volume snapshot selector and imperative app selectors). This is the current configuration for a workload: https://github.com/nirs/ramen/blob/test-path/test/basic-test/config.yaml#L5

All of the above for Subscriptions [1] and ApplicationSets [2] can be achieved using the workload kustomization specification in these resources.

Based on the above, other workload form factors can be kustomized when added from the console; IOW, the Subscription or the ApplicationSet YAML can be edited according to the environment. This reduces clutter in the repository, as otherwise these 4 directories would keep repeating themselves for every workload.

If we need to edit yamls manually at deploy time, we have failed to provide a good way to test. My goal is to eliminate these manual steps, so it is easy to reproduce the same workload using a shared configuration.

Further, as we move forward with clusters that consume ODF-created storage instances from another cluster, the StorageClass names would change and be non-specific, hence providing these values from the Subscription would be more usable than hard-coding them in the repository.

This is a big usability issue if you cannot have working workloads and need to customize them manually for every deployment. If this is only about the storage class name, it can be solved by forking the repo and creating a version with the right storage class for your specific setup. If this is something we test regularly, I expect to keep a ready configuration for testing this variant.

For the Subscriptions themselves the structure laid out is fine (with the change to add a base):

│   ├── subscriptions
│   │   ├── base
│   │   │   ├── binding.yaml
│   │   │   ├── kustomization.yaml
│   │   │   ├── subscription.yaml
│   │   │   └── placement.yaml
│   │   └── busybox-deployment
│   │       └── kustomization.yaml

Looks better like this.

Again here I suggest we do not provide overlays for every combination that includes partial hard-coded paths, and instead provide a base kustomization, in subscriptions/busybox-workloads for example, that contains rules to kustomize the resources deployed to the hub, and to kustomize the workload resources as above.

For hub resources we would want:

  • Workload namespace
  • Common workload label
  • Workload resources suffix

Now, using this from automated tools could be as follows:

  • e2e or basic-test
    • Instead of creating config files for all combinations, let the tools provide options to choose:
      • --workload
        • The values for these are already known (i.e. workloads are all the workloads in the samples, and so on)
      • --storageclassname --PVCModes
    • Assume or default the namespaces, labels and suffix, or provide options
    • IOW, let there be known configs in code rather than as files

Having to customize using a tool means we cannot test resource using kustomize build and we cannot apply the resources using kubectl apply -k. The only way to use them will be via the tool.
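
For reference, these are the plain workflows that stop working if a tool is the only entry point (the paths follow the layout above, and the context name is illustrative):

    # Render locally to review the resources
    kustomize build workloads/busybox-deployment/odf-regional-rwo

    # Apply directly to a cluster without any extra tooling
    kubectl apply -k workloads/busybox-deployment/odf-regional-rwo --context dr1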

The DRPC itself needs:

  • Policy: make this a user input
  • PVC label selector: Can be formed based on earlier inputs from labels

The drpc needs more - this is the current implementation: https://github.com/nirs/ramen/blob/7aae6e1d7af362efbc6a1f28a6340444b7594d6a/test/drenv/test.py#L190

This is the reason I don't want to depend on a static drpc resource in the repo, and instead generate it for every deploy.
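
For context, this is roughly the shape of the drpc such a tool would generate; the field names follow the DRPlacementControl spec, but every value below is deployment specific and purely illustrative:

    # Sketch of a generated DRPlacementControl (all values deployment specific)
    apiVersion: ramendr.openshift.io/v1alpha1
    kind: DRPlacementControl
    metadata:
      name: busybox-drpc
      namespace: busybox-sample
    spec:
      drPolicyRef:
        name: dr-policy                # user input
      placementRef:
        kind: PlacementRule            # must match the subscription's placement
        name: busybox-placement
      preferredCluster: dr1            # depends on the clusters in the policy
      pvcSelector:
        matchLabels:
          appname: busybox             # must match the workload's PVC labels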

The DRPC being a part of the repository is useful, as it serves as an example to keep the hub resources declarative as well.

Agree, but maybe one example is good enough, one that needs to be modified to match the application and cluster.

We can also add drpolicy and drcluster samples to match the sample drpc. They will also have to be adjusted to the actual clusters (e.g. managed cluster names).

nirs commented 8 months ago

Notes from discussion with Shyam and Talur: