cncf-tags / green-reviews-tooling

Project Repository for the WG Green Reviews which is part of the CNCF TAG Environmental Sustainability
https://github.com/cncf/tag-env-sustainability/tree/main/working-groups/green-reviews
Apache License 2.0

feature: add release fetcher and dispatcher #104

Closed · dipankardas011 closed this 3 months ago

dipankardas011 commented 3 months ago

What type of PR is this?

kind/feature

What this PR does / why we need it:

TBD

Which issue(s) this PR fixes:

Fixes #98

Special notes for your reviewer (optional):

dipankardas011 commented 3 months ago

Questions

Plans

Will use `${{ vars.<> }}` to get the GitHub variable. Once we have the version, we will do a curl request to the project's repo to get the current latest version. If we need to update it, we will update the variable via the GitHub REST API, and once done we need to trigger the pipeline for the respective project.

For the dispatch we need to call the API directly, and then we can add concurrency with a group which can be the same for the subsequent project benchmarking tests.
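Rough sketch of that flow with curl (just a sketch: GH_TOKEN, the repos, the variable name, and the workflow file are placeholders):

#!/usr/bin/env bash
set -euo pipefail

api="https://api.github.com"
hdr=(-H "Authorization: Bearer ${GH_TOKEN}" -H "Accept: application/vnd.github+json")

# 1. Latest upstream release of the project under test.
latest=$(curl -s "${hdr[@]}" "$api/repos/falcosecurity/falco/releases/latest" | jq -r '.tag_name')

# 2. Version currently stored as a repository variable.
stored=$(curl -s "${hdr[@]}" "$api/repos/cncf-tags/green-reviews-tooling/actions/variables/FALCO_VERSION" | jq -r '.value')

if [ "$latest" != "$stored" ]; then
  # 3. Update the variable via the REST API.
  curl -s -X PATCH "${hdr[@]}" \
    "$api/repos/cncf-tags/green-reviews-tooling/actions/variables/FALCO_VERSION" \
    -d "{\"name\":\"FALCO_VERSION\",\"value\":\"$latest\"}"
  # 4. Trigger the benchmark pipeline for the project.
  curl -s -X POST "${hdr[@]}" \
    "$api/repos/cncf-tags/green-reviews-tooling/actions/workflows/falco.yaml/dispatches" \
    -d "{\"ref\":\"main\",\"inputs\":{\"version\":\"$latest\"}}"
fi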

rossf7 commented 3 months ago

How are we planning to store which projects are supported? (Is it some environment variable, or a file in the repo?)

I think we should use a file. I included an example file in the proposal in the subscribing to releases section, but it may need more data such as the GitHub repo URL. https://github.com/cncf-tags/green-reviews-tooling/blob/main/docs/proposals/proposal-001-trigger-and-deploy.md#subscribing-to-releases

For the location I think it should be /projects/project.yaml? As underneath the projects dir we will have the per-project manifests, e.g. /projects/falco/ebpf.yaml. WDYT?
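e.g. something like this (just a sketch, exact fields TBD):

projects:
  - name: falco
    organization: falcosecurity
    repo: https://github.com/falcosecurity/falco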

Found a hard limit: GitHub allows only 500 variables per repository.

Good to check this. However, if we use one variable per project I don't think we'll ever hit this limit. Or if we do hit it, our automation will need to be more sophisticated and later we can switch to something else for managing state.

How do we update the GitHub repo variable?

Can you use the REST API?

dipankardas011 commented 3 months ago

Can you use the REST API?

Forgot to update: I was able to get it working with that; we just need to use the same curl command to retrieve the variable.

Planning to use a Python script so that, as much as possible, no dependency install is needed.

dipankardas011 commented 3 months ago

For the location I think it should be /projects/project.yaml? As underneath the projects dir we will have the per project manifests. e.g. /projects/falco/ebpf.yaml WDYT?

SGTM, I also want to specify the project's organization as well.

dipankardas011 commented 3 months ago

@rossf7 Can you create a PAT scoped to only this repo so that we can see if it works or not?

dipankardas011 commented 3 months ago

Here are the variables I am looking for:

We can still change the naming ;)

- secrets.REPO_ACCESS_TOKEN: for read and write of the repo variables
- vars.FALCO_VERSION: some random string as value
- vars.KEPLER_VERSION: some random string as value

rossf7 commented 3 months ago

Planning to use a Python script so that, as much as possible, no dependency install is needed.

@dipankardas011 Reasoning makes sense, but as discussed we need agreement that Python is the higher-level language we want to use. I'll discuss with the other leads.

Can you create a PAT scoped to only this repo so that we can see if it works or not?

I don't have permission to create tokens. Can you test with your fork for now? Once we're aligned on the python question we can request the token.

dipankardas011 commented 3 months ago

I don't have permission to create tokens. Can you test with your fork for now? Once we're aligned on the python question we can request the token.

Tested it works 😁

rossf7 commented 3 months ago

Here are the variables I am looking for:

secrets.REPO_ACCESS_TOKEN: for read and write of the repo variables
vars.FALCO_VERSION: some random string as value

For REPO_ACCESS_TOKEN we have an existing secret called FLUX_GITHUB_TOKEN. I think we should rename this to GITHUB_TOKEN and we can use it for both tasks.

https://github.com/cncf-tags/green-reviews-tooling/blob/main/.github/workflows/tofu.yaml#L23

FALCO_VERSION: yes, we will need to add this variable. I checked the endpoint and unfortunately there is no "upsert" support, so this will be a setup task when we onboard new projects, which we'll need to document.
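e.g. a sketch of an onboarding helper (variable name and value are placeholders):

# Create the variable; if it already exists, fall back to updating it.
create_or_update_var() {
  local name="$1" value="$2"
  local api="https://api.github.com/repos/cncf-tags/green-reviews-tooling/actions/variables"
  curl -sf -X POST -H "Authorization: Bearer ${GH_TOKEN}" "$api" \
    -d "{\"name\":\"$name\",\"value\":\"$value\"}" ||
  curl -sf -X PATCH -H "Authorization: Bearer ${GH_TOKEN}" "$api/$name" \
    -d "{\"name\":\"$name\",\"value\":\"$value\"}"
}

create_or_update_var FALCO_VERSION "some-random-string"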

dipankardas011 commented 3 months ago

Design-wise my main question is how the deploy step will be triggered? I think we should use the REST API as it's just another endpoint to call.

Yes, I thought about it, but then another problem can arise: as we are directly calling a different workflow, we can't know for sure which parent called the child workflow. So I thought to use the workflow_call method, as I have used it previously and it works nicely. Not sure if that answers your question :)

dipankardas011 commented 3 months ago

I checked the endpoint and unfortunately there is no "upsert" support so this will be a setup task when we onboard new projects which we'll need to document.

📓 Keep a note

dipankardas011 commented 3 months ago

For REPO_ACCESS_TOKEN we have an existing secret called FLUX_GITHUB_TOKEN. I think we should rename this to GITHUB_TOKEN and we can use it for both tasks.

Does FLUX_GITHUB_TOKEN contain repo:variable:write perms?

I would say let's keep the perms and role specific for that PAT so that we know what it is used for.

Also, a general aka all-in-one token doesn't seem correct, as each token should have a very limited scope.

rossf7 commented 3 months ago

Yes, I thought about it, but then another problem can arise: as we are directly calling a different workflow, we can't know for sure which parent called the child workflow. So I thought to use the workflow_call method, as I have used it previously and it works nicely.

@dipankardas011 This should be implemented via a workflow_dispatch event as was agreed in the proposal. https://github.com/cncf-tags/green-reviews-tooling/blob/main/docs/proposals/proposal-001-trigger-and-deploy.md#trigger

The 3 inputs mean we know the project, version and which falco driver to deploy.
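Sketch of what that trigger could look like (input names here are placeholders; the proposal has the agreed ones):

on:
  workflow_dispatch:
    inputs:
      project:
        description: CNCF project to benchmark
        required: true
        type: string
      version:
        description: Release version to deploy
        required: true
        type: string
      config:
        description: Variant, e.g. which Falco driver to deploy
        required: false
        type: string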

Does FLUX_GITHUB_TOKEN contain repo:variable:write perms? I would say let's keep the perms and role specific for that PAT so that we know what it is used for. Also, a general aka all-in-one token doesn't seem correct, as each token should have a very limited scope.

Good point. The current token doesn't have permission, and it's better to scope it to the needed permissions.

So we will need to request a new token.

dipankardas011 commented 3 months ago

Yes, I thought about it, but then another problem can arise: as we are directly calling a different workflow, we can't know for sure which parent called the child workflow. So I thought to use the workflow_call method, as I have used it previously and it works nicely.

@dipankardas011 This should be implemented via a workflow_dispatch event as was agreed in the proposal. https://github.com/cncf-tags/green-reviews-tooling/blob/main/docs/proposals/proposal-001-trigger-and-deploy.md#trigger

The 3 inputs mean we know the project, version and which falco driver to deploy.

Does FLUX_GITHUB_TOKEN contain repo:variable:write perms? I would say let's keep the perms and role specific for that PAT so that we know what it is used for. Also, a general aka all-in-one token doesn't seem correct, as each token should have a very limited scope.

Good point. The current token doesn't have permission, and it's better to scope it to the needed permissions.

So we will need to request a new token.

Okay, so you meant that instead of calling it like `uses: ./.github/......`, we should use the GitHub API call to trigger the workflow dispatch?

Maybe it's not that different; still, if we do it, we might be better off calling the workflows from the Go script itself.

rossf7 commented 3 months ago

Okay, so you meant that instead of calling it like `uses: ./.github/......`, we should use the GitHub API call to trigger the workflow dispatch?

Maybe it's not that different; still, if we do it, we might be better off calling the workflows from the Go script itself.

Yes, it's similar, but we can call the workflow via this endpoint:

https://docs.github.com/en/rest/actions/workflows?apiVersion=2022-11-28#create-a-workflow-dispatch-event

So all the logic is contained in the go script.

dipankardas011 commented 3 months ago

@rossf7 should we add context as well to each fmt.Errorf, so we can tell which component failed?

I mean error wrapping
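e.g. a small sketch of what I mean (function and sentinel names are made up):

package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("release not found")

// fetchLatestRelease stands in for the real fetcher.
func fetchLatestRelease(repo string) error {
	return fmt.Errorf("fetching latest release for %s: %w", repo, errNotFound)
}

func main() {
	err := fetchLatestRelease("falcosecurity/falco")
	fmt.Println(err) // fetching latest release for falcosecurity/falco: release not found
	// errors.Is still matches the wrapped sentinel, so callers can branch on it.
	fmt.Println(errors.Is(err, errNotFound)) // true
}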

leonardpahlke commented 3 months ago

We need to add some quality checks to the repo; this can be done in another PR. golint, gosec, etc.

leonardpahlke commented 3 months ago

@dipankardas011 I will propose some edits in a PR and open it against your repository.

leonardpahlke commented 3 months ago

@dipankardas011 what is the expected output of the program? Can you write an outline? My assumption: it takes a list of GitHub repositories and outputs a list of latest releases? A release has a bunch of information; which kind of information is required?

leonardpahlke commented 3 months ago

Alright, I was looking at the code, and also at the plan we worked on so far.

My understanding, looking at the code:

  1. Load a project manifest JSON from the green-reviews repo which contains the information about all the projects that we would like to deploy. The file reference can be passed as a flag: --repo-manifest https://github.com/cncf-tags/green-reviews-tooling/blob/main/repo-manifests.json
{
  "falco": {
    "repo": "https://github.com/falcosecurity/falco"
  }
}
  2. Loop over all repositories and ?

We need the Helm charts or some other commands that we can execute to deploy Falco. Falco's Helm charts seem to be located here: https://github.com/falcosecurity/charts/tree/master/charts/falco. If there is a new Falco release, we would deploy the Helm chart. So the information we need is a bit more:

{
  "falco": {
    "v0.38": { // config used for version v0.38 or newer unless a higher version is specified. in case the deployment changes by release
      "repo": "https://github.com/falcosecurity/falco",
      "deployment": [
        "helm repo add falcosecurity https://falcosecurity.github.io/charts",
        "helm repo update",
        "helm install falco falcosecurity/falco --create-namespace --namespace falco"
      ],
      "verify": [],          // to check if the deployment succeeded
      "collectdata": [],     // idk how this is done. maybe automatically
      "destroy": []          // delete any deployed resources
    }
  }
}

You are right, @dipankardas011, we need some kind of storage to keep track of the latest release measured per project. A repository variable could do the trick: "falco:v0.38,xyz:v0.1".

Right now, if I read the current plan right, we have one pipeline per project (falcosecurity/falco) and use the workflow-dispatch action to trigger our deployment pipeline. Perhaps it's easier to trigger the deployment pipeline once a day: don't have multiple pipelines, don't scatter pipelines across repositories. We do not need to measure the minute a release is cut. This would make it easier. We can always manually trigger the pipeline if we wish to do so, or increase the cadence. IMO, I would try to reduce complexity as much as possible and involve more folks in a process only if that is really necessary.
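e.g. (sketch):

on:
  schedule:
    - cron: "0 6 * * *" # once a day
  workflow_dispatch: {} # keep the manual trigger available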

leonardpahlke commented 3 months ago

This could be done in a bash script.

rossf7 commented 3 months ago

@leonardpahlke Thank you for reviewing. Adding my thoughts on these design changes.

Step 1: yes, the release trigger is to check if there is a new release of Falco. It should call the deploy workflow being added in #105

This was added so we could subscribe to releases without projects needing to call our deploy workflow from their CI/CD. Although they can do that if they wish.

Step 2: The Falco team wished to deploy using Kustomize and created this repo. https://github.com/falcosecurity/cncf-green-review-testing

There are 3 Falco drivers they wish to test. For #100 my plan is to create an overlay per driver using kustomize.
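e.g. a per-driver overlay could look like this (sketch, paths are placeholders):

# /projects/falco/ebpf/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../base                 # shared Falco manifests
patches:
  - path: driver-ebpf.yaml  # driver-specific settings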

This could be done in a bash script.

Yes, when I talked with Niki and Antonio, our preference would be to use bash and only use Go if absolutely necessary.

I like your idea of always triggering the pipeline once a day for the latest release. This would be useful for testing, since Falco releases do not happen often, and means we don't need to store state.

I think simplifying the design so we can use bash is the best way to move forward.

rossf7 commented 3 months ago

@leonardpahlke @dipankardas011 I've been thinking more about how we could simplify the design, as I think I made it more complex than necessary :(

Instead of the projects.json I think we could use inputs and default to the falco values.

Dipankar, I think your previous suggestion to use workflow_call instead of workflow_dispatch to call the deploy workflow is better, and means a PAT is not needed to trigger the deploy.

WDYT?
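Something like this (sketch; deploy.yaml is a placeholder for the workflow in #105):

on:
  workflow_dispatch: {}

jobs:
  check-release:
    runs-on: ubuntu-latest
    steps:
      - run: echo "check for a new release here"

  deploy:
    needs: check-release
    # Same-repo reusable workflow: it runs in the caller's context,
    # so the default GITHUB_TOKEN is enough and no PAT is required.
    uses: ./.github/workflows/deploy.yaml
    with:
      version: v0.0.0-test # placeholder
    secrets: inherit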

dipankardas011 commented 3 months ago

Dipankar, I think your previous suggestion to use workflow_call instead of workflow_dispatch to call the deploy workflow is better, and means a PAT is not needed to trigger the deploy.

Sol 1> For that, we can make the Go script write a JSON file, then parse it and set environment variables.

Something like this (previous impl):

jobs:
  get-latest-version:
    runs-on: ubuntu-latest
    outputs:
      is-falco-updated: ${{ steps.check-updates.outputs.FALCO }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v5
        with:
          go-version: '^1.22.3'

      - name: build the binary
        # env:
        #   GH_TOKEN: ${{ secrets.REPO_ACCESS_TOKEN }}
        working-directory: cmd
        run: |
          go get -d
          go build -v -o ../proj-trigger .

      - name: trigger updated projects
        # env:
        #   GH_TOKEN: ${{ secrets.REPO_ACCESS_TOKEN }}
        run: ./proj-trigger # run the binary built above; writes /tmp/updates.json

      - name: Dispatch
        id: check-updates
        env:
          file: /tmp/updates.json # generated by go program
        run: |
          if jq -e --arg val "falco" '.proj_names | index($val) != null' "$file" > /dev/null; then
            echo "Element 'falco' found in the proj_names array."
            echo "FALCO=true" >> $GITHUB_OUTPUT
          else
            echo "Element 'falco' not found in the proj_names array."
            echo "FALCO=false" >> $GITHUB_OUTPUT
          fi

  falco-proj:
    # reusable workflows need the ./.github prefix, and the gate reads the
    # job-level output from get-latest-version inside the expression
    uses: ./.github/workflows/falco.yml
    needs: ["get-latest-version"]
    if: ${{ needs.get-latest-version.outputs.is-falco-updated == 'true' }}
    secrets: inherit
    with:
      version: ${{ vars.FALCO_VERSION }} # need to check if it gets the updated value

Sol 2> We can instead have a separate bash script. Another option is that Go writes a bash script containing these `echo "FALCO=false" >> $GITHUB_OUTPUT` lines, and we just `source` it and call a command like `set_updated_projects_envs` in bash:

 go run ......             # writes the script sourced below
 # next step
 source ~/${some source}   # written by go
 set_updated_projects_envs

By this, the envs for these will be written, I believe, to $GITHUB_OUTPUT.

Other than that, I'm not sure if we should move entirely to bash scripts (not very comfortable [would need help]) and remove the Go script. The decision is still up to you both.

dipankardas011 commented 3 months ago

Also, I have made more files than necessary (actually I was trying to understand Leo's inputs); maybe that looks too complicated.

dipankardas011 commented 3 months ago

@leonardpahlke I have closed this PR and created a new one, as this one would potentially cause review problems. It now uses a bash script, drops the idea of storing project versions in GitHub variables, and instead runs every day for all projects, regardless of whether there is a new version.

#105