Closed rossf7 closed 8 months ago
This is great, thank you for opening the issue & listing the initial components!
As an example, I used Flux to deploy Kepler in this repo: https://github.com/nikimanoledaki/sustainability-journey-with-gitops
I ran the following commands to install and bootstrap Flux and to specify that it should reconcile the repo's clusters/ dir:
curl -s https://fluxcd.io/install.sh | sudo bash
flux bootstrap github --owner=$GITHUB_USER --repository=green-reviews-tooling --path=clusters
Thankfully past me documented the steps in the README! 👍
At the time there wasn't a Helm chart, so I used Kustomize to deploy the k8s manifests, but we should switch to the Helm chart, as you said :)
Docs for bootstrapping Flux with a GitHub repo: https://fluxcd.io/flux/installation/bootstrap/github/
Note: We'll also need to export a GitHub token before running the flux bootstrap github command - I will create one and send it to you privately!
export GITHUB_TOKEN=<gh-token>
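A small guard can catch a missing variable before the bootstrap command runs. This is a minimal sketch, not part of the documented steps: `require_env` is a hypothetical helper name, and the bootstrap line in the trailing comment only repeats the command shown earlier.

```shell
# Hypothetical helper: return non-zero if any named variable is unset or empty.
require_env() {
  for name in "$@"; do
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      return 1
    fi
  done
}

# Usage (not executed here):
#   require_env GITHUB_USER GITHUB_TOKEN && \
#     flux bootstrap github --owner="$GITHUB_USER" \
#       --repository=green-reviews-tooling --path=clusters
```

The check only confirms that the variables are non-empty; it does not validate the token's scopes.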
We should also think about having multiple environments. I'm looking at the Flux docs on structuring repositories for guidance. Here are some ideas - they might not all be viable 🤔
Here is an initial idea that we can iterate on to deploy the individual components/apps:
├── apps
│   ├── production
│   └── development
apps/production would include:
apps/development could be for the manual pipeline that includes the above as well as Falco & the demo workload. In production, this would ideally be configured and maintained by the project maintainers.
We could potentially add the cluster and/or infrastructure provisioning as well:
├── infrastructure
│   └── equinix-metal
├── clusters
│   └── production
I'm not sure how well that would work with Ansible and/or OpenTofu. Previously, Terraform worked with the Flux TF Controller, but I don't know if there is a similar integration with OpenTofu. I'm also not sure if Flux would be necessary with Ansible since that is already an IaC tool (but I have not worked with Ansible before so I'm not sure). Lots of questions here.
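To make the proposal concrete, the layout discussed so far could be scaffolded as below. This is only a sketch of the idea under discussion: `scaffold_layout` is a hypothetical helper, and the directory names are the ones from the trees above, which may still change.

```shell
# Hypothetical helper: create the proposed directories under a root
# directory (defaults to the current directory).
scaffold_layout() {
  root="${1:-.}"
  for dir in apps/production apps/development \
             infrastructure/equinix-metal clusters/production; do
    mkdir -p "$root/$dir" || return 1
    # Git does not track empty directories, so add a placeholder file.
    touch "$root/$dir/.gitkeep" || return 1
  done
}

# Usage (not executed here):
#   scaffold_layout /path/to/green-reviews-tooling
```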
An idea for how we could deploy CNCF Projects:
├── cncf-projects
│   ├── falco
│   └── <next-project>
Each project could use Kustomize to point to the upstream configuration that is maintained by the CNCF project maintainers. However, I'm not sure how/if that works with Ansible configuration. 🤔 The alternative would be to set up self-hosted GitHub Actions runners that project maintainers can use directly.
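One way the "point to upstream" idea might look is a per-project kustomization.yaml that lists a remote directory as a resource. This is a sketch under assumptions: `write_project_overlay` is a hypothetical helper, the upstream URL is passed in by the caller rather than known here, and whether remote resources are acceptable in this cluster is still an open question.

```shell
# Hypothetical helper: write cncf-projects/<project>/kustomization.yaml
# that references an upstream Kustomize directory as a remote resource.
#   $1 = project name, $2 = upstream URL, $3 = repo root (default: .)
write_project_overlay() {
  project="$1"
  upstream="$2"
  root="${3:-.}"
  mkdir -p "$root/cncf-projects/$project" || return 1
  cat > "$root/cncf-projects/$project/kustomization.yaml" <<EOF
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - $upstream
EOF
}

# Usage (not executed here; the URL is a placeholder):
#   write_project_overlay falco "https://github.com/<org>/<repo>//<path>?ref=main"
```

Kustomize resolves such a URL at build time, so the project's manifests stay in the upstream repo and only the pointer lives here.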
I would be up for taking this one on. Can I get it assigned to me? @nikimanoledaki Should I ask you for the GitHub token? I also wanted to ask what the final output should be: would a pull request with the needed folder structure and the steps followed to install Flux do?
@nikimanoledaki I like that directory structure with the environments and cncf-projects.
Also +1 for having the IaC code under infrastructure. I'll add a note to #1. The IaC code will need to bootstrap Flux, so we might run into a chicken-and-egg problem, but it would be nice to use Flux if we can.
@AntonioDiTuri Thanks, I think a pull request would be good and then depending on where we are with the IaC issue we can see how to integrate both workstreams.
> Should I ask you for the GitHub token?
This is a good question. I'm not sure how we should manage this! There are risks if we use our own personal access tokens since the token needs repo-wide access. Any leak or sharing with other folks could give access to private repos that the user has access to.
A bot account could be an option. We would need to request this from the CNCF. Maybe there is one already.
Do you have any other ideas? 🤔
> We should also think about having multiple environments
I would advocate for keeping everything as dev for now (there is no production yet).
We don't use personal access tokens in this project. We will go through the org instead. I will take a look at this after KubeCon.
> > We should also think about having multiple environments
> I would advocate for keeping everything as dev for now (there is no production yet).
Regarding this - we currently have the manual testing workflow (dev) and we will have the automated process later (prod). If dev/prod is misleading, we could rename these environments to something like manual/automated. I think it's worth planning for both in our repository structure. What do you all think? :) Let me know if I'm missing or misunderstanding something.
Created this issue to request a PAT and unblock this: https://github.com/cncf-tags/green-reviews-tooling/issues/7
We have a fine-grained PAT - anyone who needs this can message @leonardpahlke or me (and the new leads soon!) 👍
Heads-up that there is some progress on the Falco side thanks to @incertum, who is working on creating the repo that will contain the DaemonSet/ConfigMaps needed to deploy Falco: https://github.com/falcosecurity/evolution/issues/345
After that, we can add ./clusters/falco.yaml with the following:
---
apiVersion: source.toolkit.fluxcd.io/v1
kind: GitRepository
metadata:
  name: falco-cncf-green-reviews-testing
  namespace: flux-system
spec:
  interval: 1m0s
  ref:
    branch: main
  url: https://github.com/falcosecurity/cncf-green-review-testing
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: falco-cncf-green-reviews-testing
  namespace: flux-system
spec:
  interval: 30m0s
  path: ./kustomize
  prune: true
  retryInterval: 2m0s
  sourceRef:
    kind: GitRepository
    name: falco-cncf-green-reviews-testing
  targetNamespace: falco
  timeout: 3m0s
  wait: true
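Once a manifest like this is committed, the sync could be checked from the CLI. The following is a minimal sketch, assuming the flux CLI is installed and pointed at the cluster; `check_falco_sync`, `FALCO_SYNC_NAME` and the `FLUX` override variable are hypothetical names introduced here. The resource name is taken from the manifest above, and flux-system is the CLI's default namespace, so no namespace flag is passed.

```shell
# Name shared by the GitRepository and the Kustomization in the manifest.
FALCO_SYNC_NAME=falco-cncf-green-reviews-testing
# Command to run; set FLUX=echo to print the commands instead (dry run).
FLUX="${FLUX:-flux}"

check_falco_sync() {
  # Fetch the latest commit from the source, then re-apply the Kustomization.
  "$FLUX" reconcile kustomization "$FALCO_SYNC_NAME" --with-source || return 1
  # Show the status of the Git source and of the Kustomization.
  "$FLUX" get sources git "$FALCO_SYNC_NAME" || return 1
  "$FLUX" get kustomizations "$FALCO_SYNC_NAME"
}

# Usage (not executed here):
#   check_falco_sync
```

Because the Kustomization sets wait: true, the reconcile step should only report success once the applied resources are ready or the 3m0s timeout is hit.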
Cluster Management
We want to use a GitOps approach for the components running in the cluster using Flux. This is for the minimal set of components that should always be running to support the pipeline.
This is so that it is:
- Clear to all participants which components and versions are running in the cluster
- Easier to contribute to technical tasks by submitting pull requests
The pipeline is responsible for installing the applications that are to be measured, e.g. Falco.
Requirements
The components to be installed are listed in the design doc
Phase 1: Base-level cluster components (MVP)
- [x] Cilium: Provision cluster and bootstrap flux #6
- [x] Kepler: [Automated] installing kepler using flux #15
- [x] Prometheus: test: installing prometheus with flux #12
- [x] Grafana: [Automated/Action] Install Kepler dashboard #16
Phase 2: Gather idle metrics for Falco
Phase 3: Gather load-test metrics
- [ ] Synthetic workload: Microservices demo #13
- [ ] Load generation tool e.g. k6
More may be added as we continue to develop the pipeline.
Documentation
We should document this process as we go.
@rossf7 you might want to update the issue description for Cilium.
We can close this since it is mostly completed. We have the base cluster environment, which is our MVP.
There is an open PR for the microservice demo workload, but we are holding off on it since we're going to do idle measurements first. Lastly, we can revisit the need for a load-testing tool later on.