kubeflow / testing

Test infrastructure and tooling for Kubeflow.
Apache License 2.0

[AWS] Infrastructure as Code #900

Closed PatrickXYS closed 3 years ago

PatrickXYS commented 3 years ago

This issue anticipates the following scenarios:

  1. Community members see the test infra as a black box and lack a deeper understanding of it.
  2. As we grow, manually configuring AWS resources (note: not K8s objects) through the AWS CLI or Console could become a burden, and there is no way to keep track of changes.
  3. Replicating the test infra may be impossible within 1 day; setting up IaC can help us replicate it within a few hours.

There are some popular solutions, including Terraform and AWS CDK. My preference is CDK because it is developed by AWS and should have good support for newly released AWS resources.

PatrickXYS commented 3 years ago

Let's divide the question into two layers, an AWS resources layer and a K8s resources layer, and decouple the resources of the two layers.

Of course the two layers' resources overlap; for example, EFS will serve as NFS storage for K8s. But we can start minimal and resolve issues step by step.

PatrickXYS commented 3 years ago

After a bit of research: for K8s resources, we can use CDK8s (which generates K8s resource YAML files from an object-oriented language) + Flux/ArgoCD (tools for GitOps).
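
To make the "generate manifests from code" idea concrete, here is a minimal sketch in plain Python. It is not the CDK8s API; it uses only the standard library to show the general pattern of building a manifest as an object and serializing it. The function names, the `test-worker` name, and the image are hypothetical. The output is JSON, which is also valid YAML.

```python
import json


def deployment(name: str, image: str, replicas: int = 1) -> dict:
    """Build an apps/v1 Deployment manifest as a plain dict (illustrative only)."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": name, "image": image}]},
            },
        },
    }


def render(manifest: dict) -> str:
    """Serialize the manifest; JSON is a subset of YAML, so kubectl accepts it."""
    return json.dumps(manifest, indent=2)


if __name__ == "__main__":
    # Hypothetical workload name and image, for illustration only.
    print(render(deployment("test-worker", "example.com/test-worker:latest", replicas=2)))
```

A GitOps tool such as Flux or ArgoCD would then sync the rendered files from a Git repository to the cluster.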

PatrickXYS commented 3 years ago

I feel like using CDK8s adds complexity without really simplifying our work.

Another proposal for K8s resource management: Helm + ArgoCD

The reason I propose Helm is that it provides a simple way to inject parameters into manifests without large changes.
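
As a rough illustration of parameter injection, the sketch below mimics the idea of default values plus overrides using Python's standard `string.Template`. It is a stand-in, not Helm itself (Helm uses Go templates and a values file), and the ConfigMap name, keys, and values are all hypothetical.

```python
from string import Template

# Manifest with placeholders; in Helm these would be template expressions
# that read from the chart's values.
MANIFEST = Template("""\
apiVersion: v1
kind: ConfigMap
metadata:
  name: $name
data:
  region: $region
  clusterName: $cluster_name
""")

# Hypothetical defaults, playing the role of a chart's default values.
DEFAULTS = {
    "name": "test-infra-config",
    "region": "us-west-2",
    "cluster_name": "kubeflow-test",
}


def render(overrides=None) -> str:
    """Merge overrides onto the defaults and substitute them into the manifest."""
    values = {**DEFAULTS, **(overrides or {})}
    return MANIFEST.substitute(values)


if __name__ == "__main__":
    # Only the region changes; the manifest text itself is untouched.
    print(render({"region": "us-east-1"}))
```

The point of the design is that the manifest stays fixed while each environment supplies only the values that differ.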