Add advanced/extended CI for gocql

scylladb / gocql

Package gocql implements a fast and robust ScyllaDB client for the Go programming language.

https://docs.scylladb.com/stable/using-scylla/drivers/cql-drivers/scylla-go-driver.html

BSD 3-Clause "New" or "Revised" License

188 stars 59 forks source link

Add advanced/extended CI for gocql #204

Open sylwiaszunejko opened 4 months ago

sylwiaszunejko commented 4 months ago

Epic to track activities aiming to make the CI more reliable and efficient.

### Tasks
- [ ] make every PR to build [scylla-bench ](https://github.com/scylladb/scylla-bench) from a PR branch and run simple scylla-bench test (https://github.com/scylladb/gocql/pull/255)
- [ ] for more complex PRs run more advanced tests from scylla-bench
- [x] add mechanism for removing old images (https://github.com/scylladb/gocql/pull/269)

sylwiaszunejko commented 4 months ago

@roydahan

sylwiaszunejko commented 2 months ago

Summary on current findings:

We need to login to dockerhub to be able to push modified scylla-bench images. To login we need access to repo secrets.
When creating a PR from a branch in a fork and using pull_request github workflow event we are not able to access credentials - makes sense from a security point of view, if PR originates from a repo branch it is possible to use secrets
One way would be for every PR that needs extended CI to move its branch to scylladb/gocql - not really convenient to use. So we need a way to access secrets from forks but add also some security to prevent users from using secrets in other way than intended.
There is a pull_request_target event and when using it secrets are available from a fork.
The idea is to add a requirement that every time someone tries to run extended CI it needs to be approved by a maintainer. Unfortunately pull_request_target event does not respect rules set up in the settings approval section of the repo (e.g. in gocqlx every CI run has to be approved) so this does not work. There is a different way to handle that by using github environment and add maintainers as a required reviewers there (source: https://datachain.ai/blog/testing-external-contributions-using-github-actions-secrets) (I tried to test it with @dkropachev but he was not able to set drivers-team team as a reviewers due to him not being a member of this team)
That sounded like a good enough solution but there is a second problem. We found out that when using pull_request_target it is impossible to edit workflow file and run changed CI in the same PR - which is good when we consider outside contributors, but can be a trouble for a maintainer wanting to edit this files and not being able to actually test it.
As a alternative solution we were considering having extended CI workflow as a part of a merge queue, but it will require having a solution ready to merge before testing it FYI: @roydahan @fruch @dkropachev please let me know if I missed something important from out discussion in this summary.

fruch commented 2 months ago

Summary on current findings:

We need to login to dockerhub to be able to push modified scylla-bench images. To login we need access to repo secrets.

When creating a PR from a branch in a fork and using pull_request github workflow event we are not able to access credentials - makes sense from a security point of view, if PR originates from a repo branch it is possible to use secrets

One way would be for every PR that needs extended CI to move its branch to scylladb/gocql - not really convenient to use. So we need a way to access secrets from forks but add also some security to prevent users from using secrets in other way than intended.

There is a pull_request_target event and when using it secrets are available from a fork.

The idea is to add a requirement that every time someone tries to run extended CI it needs to be approved by a maintainer. Unfortunately pull_request_target event does not respect rules set up in the settings approval section of the repo (e.g. in gocqlx every CI run has to be approved) so this does not work. There is a different way to handle that by using github environment and add maintainers as a required reviewers there (source: https://datachain.ai/blog/testing-external-contributions-using-github-actions-secrets) (I tried to test it with @dkropachev but he was not able to set drivers-team team as a reviewers due to him not being a member of this team)

That sounded like a good enough solution but there is a second problem. We found out that when using pull_request_target it is impossible to edit workflow file and run changed CI in the same PR - which is good when we consider outside contributors, but can be a trouble for a maintainer wanting to edit this files and not being able to actually test it.

I would say this is relatively rare case, and most change would be in code, or in the tests (i.e. sct) And if needed on change temporary change it to pull_request, or maybe keep a copy of the workflow for such testing purposes. if most of the logic of each step would be kept outside of the workflow, it would lower the fraction around this.

As a alternative solution we were considering having extended CI workflow as a part of a merge queue, but it will require having a solution ready to merge before testing it

FYI: @roydahan @fruch @dkropachev please let me know if I missed something important from out discussion in this summary.

One option you didn't mentioned, is to ditch github actions, and use Jenkins completely for building the extended CI. (at the end some parts of it would be in Jenkins anyhow, for example SCT test runs), anyhow some part might be that visible to external contributors (at least for now)

dkropachev commented 2 months ago

Summary on current findings:

We need to login to dockerhub to be able to push modified scylla-bench images. To login we need access to repo secrets.

When creating a PR from a branch in a fork and using pull_request github workflow event we are not able to access credentials - makes sense from a security point of view, if PR originates from a repo branch it is possible to use secrets

One way would be for every PR that needs extended CI to move its branch to scylladb/gocql - not really convenient to use. So we need a way to access secrets from forks but add also some security to prevent users from using secrets in other way than intended.

There is a pull_request_target event and when using it secrets are available from a fork.

The idea is to add a requirement that every time someone tries to run extended CI it needs to be approved by a maintainer. Unfortunately pull_request_target event does not respect rules set up in the settings approval section of the repo (e.g. in gocqlx every CI run has to be approved) so this does not work. There is a different way to handle that by using github environment and add maintainers as a required reviewers there (source: https://datachain.ai/blog/testing-external-contributions-using-github-actions-secrets) (I tried to test it with @dkropachev but he was not able to set drivers-team team as a reviewers due to him not being a member of this team)

That sounded like a good enough solution but there is a second problem. We found out that when using pull_request_target it is impossible to edit workflow file and run changed CI in the same PR - which is good when we consider outside contributors, but can be a trouble for a maintainer wanting to edit this files and not being able to actually test it.

As a alternative solution we were considering having extended CI workflow as a part of a merge queue, but it will require having a solution ready to merge before testing it FYI: @roydahan @fruch @dkropachev please let me know if I missed something important from out discussion in this summary.

You forgot to mention that github action does hot have built-in solution to trigger certain workflow based on labels it has. Which means that in order to trigger Advanced Integration Tests workflow if certain label is set we will have to have another workflow that checks if label is there and then run Advanced Integration Tests via gh worflow run <workflow-file>.yml

sylwiaszunejko commented 2 months ago

Summary on current findings:

We need to login to dockerhub to be able to push modified scylla-bench images. To login we need access to repo secrets.

When creating a PR from a branch in a fork and using pull_request github workflow event we are not able to access credentials - makes sense from a security point of view, if PR originates from a repo branch it is possible to use secrets

One way would be for every PR that needs extended CI to move its branch to scylladb/gocql - not really convenient to use. So we need a way to access secrets from forks but add also some security to prevent users from using secrets in other way than intended.

There is a pull_request_target event and when using it secrets are available from a fork.

The idea is to add a requirement that every time someone tries to run extended CI it needs to be approved by a maintainer. Unfortunately pull_request_target event does not respect rules set up in the settings approval section of the repo (e.g. in gocqlx every CI run has to be approved) so this does not work. There is a different way to handle that by using github environment and add maintainers as a required reviewers there (source: https://datachain.ai/blog/testing-external-contributions-using-github-actions-secrets) (I tried to test it with @dkropachev but he was not able to set drivers-team team as a reviewers due to him not being a member of this team)

That sounded like a good enough solution but there is a second problem. We found out that when using pull_request_target it is impossible to edit workflow file and run changed CI in the same PR - which is good when we consider outside contributors, but can be a trouble for a maintainer wanting to edit this files and not being able to actually test it.

As a alternative solution we were considering having extended CI workflow as a part of a merge queue, but it will require having a solution ready to merge before testing it FYI: @roydahan @fruch @dkropachev please let me know if I missed something important from out discussion in this summary.

You forgot to mention that github action does hot have built-in solution to trigger certain workflow based on labels it has. Which means that in order to trigger Advanced Integration Tests workflow if certain label is set we will have to have another workflow that checks if label is there and then run Advanced Integration Tests via gh worflow run <workflow-file>.yml

Not exactly we can have

on:
  pull_request_target:
    types: [labeled, synchronize]

And check if the right label is present, if it is not the workflow will be visible as skipped and I think that is acceptable

dkropachev commented 2 months ago

Summary on current findings:

We need to login to dockerhub to be able to push modified scylla-bench images. To login we need access to repo secrets.

When creating a PR from a branch in a fork and using pull_request github workflow event we are not able to access credentials - makes sense from a security point of view, if PR originates from a repo branch it is possible to use secrets

One way would be for every PR that needs extended CI to move its branch to scylladb/gocql - not really convenient to use. So we need a way to access secrets from forks but add also some security to prevent users from using secrets in other way than intended.

There is a pull_request_target event and when using it secrets are available from a fork.

The idea is to add a requirement that every time someone tries to run extended CI it needs to be approved by a maintainer. Unfortunately pull_request_target event does not respect rules set up in the settings approval section of the repo (e.g. in gocqlx every CI run has to be approved) so this does not work. There is a different way to handle that by using github environment and add maintainers as a required reviewers there (source: https://datachain.ai/blog/testing-external-contributions-using-github-actions-secrets) (I tried to test it with @dkropachev but he was not able to set drivers-team team as a reviewers due to him not being a member of this team)

That sounded like a good enough solution but there is a second problem. We found out that when using pull_request_target it is impossible to edit workflow file and run changed CI in the same PR - which is good when we consider outside contributors, but can be a trouble for a maintainer wanting to edit this files and not being able to actually test it.

As a alternative solution we were considering having extended CI workflow as a part of a merge queue, but it will require having a solution ready to merge before testing it FYI: @roydahan @fruch @dkropachev please let me know if I missed something important from out discussion in this summary.

You forgot to mention that github action does hot have built-in solution to trigger certain workflow based on labels it has. Which means that in order to trigger Advanced Integration Tests workflow if certain label is set we will have to have another workflow that checks if label is there and then run Advanced Integration Tests via gh worflow run <workflow-file>.yml

Not exactly we can have
on:
  pull_request_target:
    types: [labeled, synchronize]
And check if the right label is present, if it is not the workflow will be visible as skipped and I think that is acceptable

In this case every PR will have action result record Advanced Integration Tests, which is going to be confusin.

sylwiaszunejko commented 2 months ago

Summary on current findings:

We need to login to dockerhub to be able to push modified scylla-bench images. To login we need access to repo secrets.

When creating a PR from a branch in a fork and using pull_request github workflow event we are not able to access credentials - makes sense from a security point of view, if PR originates from a repo branch it is possible to use secrets

One way would be for every PR that needs extended CI to move its branch to scylladb/gocql - not really convenient to use. So we need a way to access secrets from forks but add also some security to prevent users from using secrets in other way than intended.

There is a pull_request_target event and when using it secrets are available from a fork.

The idea is to add a requirement that every time someone tries to run extended CI it needs to be approved by a maintainer. Unfortunately pull_request_target event does not respect rules set up in the settings approval section of the repo (e.g. in gocqlx every CI run has to be approved) so this does not work. There is a different way to handle that by using github environment and add maintainers as a required reviewers there (source: https://datachain.ai/blog/testing-external-contributions-using-github-actions-secrets) (I tried to test it with @dkropachev but he was not able to set drivers-team team as a reviewers due to him not being a member of this team)

That sounded like a good enough solution but there is a second problem. We found out that when using pull_request_target it is impossible to edit workflow file and run changed CI in the same PR - which is good when we consider outside contributors, but can be a trouble for a maintainer wanting to edit this files and not being able to actually test it.

As a alternative solution we were considering having extended CI workflow as a part of a merge queue, but it will require having a solution ready to merge before testing it FYI: @roydahan @fruch @dkropachev please let me know if I missed something important from out discussion in this summary.

You forgot to mention that github action does hot have built-in solution to trigger certain workflow based on labels it has. Which means that in order to trigger Advanced Integration Tests workflow if certain label is set we will have to have another workflow that checks if label is there and then run Advanced Integration Tests via gh worflow run <workflow-file>.yml

Not exactly we can have
on:
  pull_request_target:
    types: [labeled, synchronize]
And check if the right label is present, if it is not the workflow will be visible as skipped and I think that is acceptable
In this case every PR will have action result record Advanced Integration Tests, which is going to be confusin.

Still better than running it from the other workflow I believe, but maybe as @fruch said it is worth to consider just using Jenkins and that would solve problems with labeling as well

fruch commented 2 months ago

Summary on current findings:

We need to login to dockerhub to be able to push modified scylla-bench images. To login we need access to repo secrets.

When creating a PR from a branch in a fork and using pull_request github workflow event we are not able to access credentials - makes sense from a security point of view, if PR originates from a repo branch it is possible to use secrets

One way would be for every PR that needs extended CI to move its branch to scylladb/gocql - not really convenient to use. So we need a way to access secrets from forks but add also some security to prevent users from using secrets in other way than intended.

There is a pull_request_target event and when using it secrets are available from a fork.

The idea is to add a requirement that every time someone tries to run extended CI it needs to be approved by a maintainer. Unfortunately pull_request_target event does not respect rules set up in the settings approval section of the repo (e.g. in gocqlx every CI run has to be approved) so this does not work. There is a different way to handle that by using github environment and add maintainers as a required reviewers there (source: https://datachain.ai/blog/testing-external-contributions-using-github-actions-secrets) (I tried to test it with @dkropachev but he was not able to set drivers-team team as a reviewers due to him not being a member of this team)

That sounded like a good enough solution but there is a second problem. We found out that when using pull_request_target it is impossible to edit workflow file and run changed CI in the same PR - which is good when we consider outside contributors, but can be a trouble for a maintainer wanting to edit this files and not being able to actually test it.

As a alternative solution we were considering having extended CI workflow as a part of a merge queue, but it will require having a solution ready to merge before testing it FYI: @roydahan @fruch @dkropachev please let me know if I missed something important from out discussion in this summary.

You forgot to mention that github action does hot have built-in solution to trigger certain workflow based on labels it has. Which means that in order to trigger Advanced Integration Tests workflow if certain label is set we will have to have another workflow that checks if label is there and then run Advanced Integration Tests via gh worflow run <workflow-file>.yml

Not exactly we can have
on:
  pull_request_target:
    types: [labeled, synchronize]
And check if the right label is present, if it is not the workflow will be visible as skipped and I think that is acceptable
In this case every PR will have action result record Advanced Integration Tests, which is going to be confusin.
Still better than running it from the other workflow I believe, but maybe as @fruch said it is worth to consider just using Jenkins and that would solve problems with labeling as well

In the context of labeling Jenkins work identically it not triggered by label, it's checking labels

We can do a label condition on a whole workflow, and it won't show on the PR, see the post to pypi on python-driver

sylwiaszunejko commented 2 months ago

Update about triggering Jenkins job: I tried these actions: https://github.com/mickeygoussetorg/trigger-jenkins-job and https://github.com/appleboy/jenkins-action without success. Finally I manged to trigger Jenkins job using this action: https://github.com/geokats7/jenkins_client. It supports providing parameters, only problem is that waiting for a job result seems to be broken. https://jenkins.scylladb.com/job/scylla-staging/job/sylwia.szunejko/job/longevity-twcs-3h/9/ https://github.com/scylladb/gocql/actions/runs/10791000998/job/29927447954?pr=255

Lorak-mmk commented 1 month ago

Regarding pull-request trigger and secrets: we did have similar problems in Rust Driver when integrating cargo-semver-checks. The job was supposed to add / remove label to PR (which requires permissions), but also work on user code (which could be unsafe).

You can see the current workflow here: https://github.com/scylladb/scylla-rust-driver/blob/main/.github/workflows/semver_checks.yml

The PR that introduced it and comments in the code explain how it works and why it is secure: https://github.com/scylladb/scylla-rust-driver/pull/909

The idea is that we can have several jobs in a workflow (which is triggered by pull-request-target). Some of those jobs operate on untrusted user code by manually checking out head of PR:

    - name: Checkout
      uses: actions/checkout@v3
      with:
        fetch-depth: "2"
        ref: "refs/pull/${{ env.PR_ID }}/merge"

They don't have write permissions (permissions: {}), because otherwise malicious PR could use write permissions in unintended way. Output of such job is passed to $GITHUB_OUTPUT, which is then used by job with write privileges to perform some privileged actions (in Rust Driver: post comment, add or remove label).

Regarding the testability issue of CI when using pull-request-target: when we need to make some changes to this workflow, we create a branch, let's call it X, in the main repo, with the changes for this workflow, and then make some PR to X from a fork - this PR runs the updated CI.