google / kctf

kCTF is a Kubernetes-based infrastructure for CTF competitions. For documentation, see
https://google.github.io/kctf/
Apache License 2.0
666 stars 73 forks source link

Need to recreate the cluster during every test run #163

Open sroettger opened 3 years ago

sroettger commented 3 years ago

Some changes force us to recreate the cluster we're using in the workflow test.

For example the recent change from kube-system to kctf-system. Not recreating the cluster will mean we're having stale resources that can make it misbehave (e.g. a 2nd copy of the operator kube-system).

When running a test on one cl on branch A and then another on branch B, the two branches might have conflicting cluster changes and the 2nd test might either be a cluster upgrade or a cluster downgrade.

It sounds we would have to delete and recreate the cluster with an option to skip the step if the cluster has a version from which we know how to upgrade.

Similarly, there might be other objects in GCE left over from a different run (service account, load balancer) that could lead to issues. But I don't know if there's a good way to completely clean the project. But this might be less likely to make issues (?)

@sirdarckcat thoughts? so far it was okayish to handle this manually. But it can lead to issues in the future especially if two people develop at the same time.

sirdarckcat commented 3 years ago

you mean if you two people develop on the same cluster at the same time? (like two sysadmins?) or you and me.

if it's you and me, I think this wouldn't be a problem if we run on our own fork?

sroettger commented 3 years ago

If two kCTF devs work on different branches, the github workflow will use the same cluster to test both changes. For example, if you would push a change to alpha now, the tests will probably fail since the running cluster is created using the beta scripts.

sroettger commented 3 years ago

Even if we recreate the cluster every time, that could lead to race conditions if the actions run in parallel. I guess we would have to create a temporary cluster every time?

sirdarckcat commented 3 years ago

Right, could we do this on our own forks instead of branches?

sroettger commented 3 years ago

You mean do a new fork for every change and set up a cluster for the workflow?

sirdarckcat commented 3 years ago

Yea, you would have a cluster and I would have one on our own forks. And we just keep our forks up to date to master.

sroettger commented 3 years ago

That doesn't help if you want to make a change to alpha (e.g. an important backport) while you're working on a beta commit.

On Fri, Dec 11, 2020 at 11:24 AM sirdarckcat notifications@github.com wrote:

Yea, you would have a cluster and I would have one on our own forks. And we just keep our forks up to date to master.

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/google/kctf/issues/163#issuecomment-743110782, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEM2JHRAC6V36BQ4EOFMFLDSUHXPBANCNFSM4UTTXJWA .

sirdarckcat commented 3 years ago

Ah, we can make another project for alpha then. I don't think we need to maintain it for too long though, right?

sroettger commented 3 years ago

Alpha is just an example. This happens for any cluster change that is breaking either forward or backwards

On Fri, Dec 11, 2020 at 11:43 AM sirdarckcat notifications@github.com wrote:

Ah, we can make another project for alpha then. I don't think we need to maintain it for too long though, right?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/google/kctf/issues/163#issuecomment-743119729, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEM2JHW6SCKMPTAGYB25MITSUHZVBANCNFSM4UTTXJWA .

sirdarckcat commented 3 years ago

I mean, to solve this problem:

When running a test on one cl on branch A and then another on branch B, the two branches might have conflicting cluster changes and the 2nd test might either be a cluster upgrade or a cluster downgrade.

We could stop using branches for development and use forks instead. This would leave us with just two branches, alpha and beta, and for alpha, the proposal would be to create another project for it.

For solving the general problem of updates that make backwards-incompatible changes to the cluster, then we need to delete and recreate the cluster indeed. It would be nice if we had an upgrade path, but I think that's a lot of work and unlikely to be useful to maintain (because the normal usecase is people have a cluster for a CTF, and can create a brand new one with the next release of kCTF for the next CTF)

sroettger commented 3 years ago

On Fri, Dec 11, 2020, 12:06 sirdarckcat notifications@github.com wrote:

I mean, to solve this problem:

When running a test on one cl on branch A and then another on branch B, the two branches might have conflicting cluster changes and the 2nd test might either be a cluster upgrade or a cluster downgrade.

We could stop using branches for development and use forks instead. This would leave us with just two branches, alpha and beta, and for alpha, the proposal would be to create another project for it.

But you have the same problem just that you changed t the word branch to the word fork/cluster. Unless if you spawn a new cluster for every workflow run.

For solving the general problem of updates that make backwards-incompatible changes to the cluster, then we need to delete and recreate the cluster indeed. It would be nice if we had an upgrade path, but I think that's a lot of work and unlikely to be useful to maintain (because the normal usecase is people have a cluster for a CTF, and can create a brand new one with the next release of kCTF for the next CTF)

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub https://github.com/google/kctf/issues/163#issuecomment-743130437, or unsubscribe https://github.com/notifications/unsubscribe-auth/AEM2JHUIAOWJIL2NBFAVMXDSUH4LVANCNFSM4UTTXJWA .

sirdarckcat commented 3 years ago

Yes, if we don't support updates, we have to create a new cluster on every push :-). I imagine we need to have forks to support that dev workflow.

sroettger commented 3 years ago

It's not just updates. You need to handle downgrades. See my initial issue description:

When running a test on one cl on branch A and then another on branch B, the two branches might have conflicting cluster changes and the 2nd test might either be a cluster upgrade or a cluster downgrade.

It sounds we would have to delete and recreate the cluster with an option to skip the step if the cluster has a version from which we know how to upgrade.

sirdarckcat commented 3 years ago

When running a test on fork A and then another on fork B, the two forks might have conflicting cluster changes but it doesn't matter because they run on different forks, and so, on different clusters. The 2nd test might either be a cluster upgrade or a cluster downgrade, which doesn't matter too much because they are running on different clusters.

When we commit these to github.com/google/kctf then we have to create a new cluster :)

It sounds we would have to delete and recreate the cluster every time.

sroettger commented 3 years ago

Btw, how do you create a cluster for your fork? Isn't that lots of manual work? And keeping one cluster per fork running also seems a bit wasteful :)

sroettger commented 3 years ago

I like your idea of just running tests in KIND on a push. And only run the full expensive test when merging to the main branch. I think that's a good compromise between testing and speed.

sirdarckcat commented 3 years ago

you would have to generate a project, key and such "secrets", and then it uses that instead of yours. GKE_PROJECT, GKE_EMAIL, GKE_KEY, GCR_EMAIL, GCR_KEY, and GCR_PROJECT. you probably don't need the GCR ones, though (I think those are for pushing the canonical images)

it is wasteful, indeed.. and probably expensive for volunteers :/