eshepelyuk opened this issue 1 year ago
Hello @opalmer
I would like to propose a different view on the solution to the issue you've described. It's my personal preference because I assume it's easier to implement.
That's pretty much exactly what I was going for, the only difference would be a step 6: the `kube-mgmt` sidecar will start but not run the `kube-mgmt` binary until OPA is up and has been loaded with policies. This could maybe be accomplished by sharing a volume between the two containers and having `kube-mgmt` in the OPA container write out a file after the rules are loaded. Or `kube-mgmt` could simply attempt to POST the rules to OPA until it's ready (all the while `kube-mgmt`'s health check would fail until that's successful).
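To make that last option concrete, here's a rough Go sketch (placeholder paths and URLs, not code that exists in kube-mgmt today) of a sidecar retrying the policy load until OPA accepts it, using OPA's `PUT /v1/policies/<id>` REST API:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"os"
	"time"
)

// loadPolicyWithRetry keeps trying to push a Rego policy into OPA until the
// request succeeds, i.e. until OPA is actually up and accepting policies.
// The OPA URL, policy ID, and file path are placeholders.
func loadPolicyWithRetry(opaURL, id string, rego []byte) {
	for {
		req, err := http.NewRequest(http.MethodPut,
			fmt.Sprintf("%s/v1/policies/%s", opaURL, id), bytes.NewReader(rego))
		if err != nil {
			panic(err) // only fails on a malformed URL
		}
		req.Header.Set("Content-Type", "text/plain")

		resp, err := http.DefaultClient.Do(req)
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return // OPA is up and the policy is loaded
			}
		}
		// OPA is not reachable yet (or rejected the request); wait and retry.
		time.Sleep(time.Second)
	}
}

func main() {
	rego, err := os.ReadFile("/policies/example.rego") // placeholder path
	if err != nil {
		panic(err)
	}
	loadPolicyWithRetry("http://127.0.0.1:8181", "example", rego)
}
```

The sidecar's own health check could simply report unhealthy until this loop returns, which is the behaviour described above.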
This is already implemented in #210, i.e. `kube-mgmt` is not started until the OPA readiness check has passed.
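For context, the gating described there boils down to something like the following Go sketch (only an illustration of the idea, with a placeholder URL, not the actual #210 change): block until OPA's `/health` endpoint answers 200 before starting the sync loop.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// waitForOPA polls OPA's /health endpoint until it reports 200 OK.
func waitForOPA(opaURL string) {
	for {
		resp, err := http.Get(opaURL + "/health")
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return
			}
		}
		time.Sleep(500 * time.Millisecond)
	}
}

func main() {
	waitForOPA("http://127.0.0.1:8181")
	fmt.Println("OPA is ready; start syncing policies")
}
```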
@opalmer, are you able / willing to contribute?
I can try and give it a shot, yeah. It may not be immediate, but given this is an issue I've come across in production, I should be able to make time for it.
I'll dig into the code base and see if I've got questions about where to start. If you have any pointers though in terms of the bits I'll want to look at first, that would be helpful.
Hello,
I would very much appreciate it if you could try to address this.
I am not a Golang dev, so I won't be able to suggest anything useful; I usually work in trial-and-error mode. From a testing perspective, take a look at the justfile; you can find commands there to set up a local k8s cluster, build, and test the project.
Also, on the Helm side you will need to create unit tests for chart rendering (test/unit) and e2e tests (test/e2e); we can discuss scenarios later.
No worries, that's enough to start with, thanks!
I dove into some of the code yesterday while trying to track down exactly what the management container was doing, so I'm familiar enough that I can start with that, I think. If you had some specifics that would be great, but I spend most of my day in Go so I can dig around. The pointers to those tests are useful though!
Hi @opalmer, I guess you are working on this issue. We are also waiting for this fix.
Hello @opalmer, created this issue from your comment in #211
I came to this issue from https://github.com/open-policy-agent/kube-mgmt/issues/207, which was marked as a duplicate of https://github.com/open-policy-agent/kube-mgmt/issues/189. After reading through the initial description, I'm not sure that this issue (#211) is going to address #207. I do agree that liveness probes that automatically restart the management container would be an improvement, but I don't think it will solve #207.
Some specific scenarios in which this approach might not work:
Even if opa went without policies loaded for less than a second, hundreds of requests could get through without being run through the proper policies. For something like opa, where a policy could be blocking privileged containers, ensuring images can't come from an unknown registry, ensuring pods end up on the right nodes, etc., this can have major side effects from both a security and an operational perspective.
Now, I've tried thinking of ways to work around the current state:
The extra argument path might work except the docs state:
.. so that would mean you can't really use remote bundles and the current approach with the REST API? Looking around a bit more, I found #76 which is talking about adding a bundle api. Then after reading the ensuring operational readiness docs:
So.... that got me thinking. What if there was an init container, or an entrypoint for the opa container, that either runs a subcommand of the management container command or reaches out to the management container (may have to wait for it to be up) and pulls down the policies and drops those on disk before opa starts?
This way, when opa comes up, all the policies are pre-loaded and it can't serve a request without those policies. The management container could then continue as normal, pushing policies as they are updated in Kubernetes. This would also have the advantage that you still have a single source of truth (config maps), and if there's something else wrong (k8s API issues, kubelet problems, etc.) your pod will never be in a ready state. This should also tightly couple the chain of events leading up to opa starting, so no matter how the pod dies or how the chart is configured, it should always load up the policies you've defined every time.
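A hypothetical sketch of such an init helper in Go, assuming kube-mgmt's default `openpolicyagent.org/policy=rego` ConfigMap label and a shared `/policies` emptyDir (both placeholders here), might look like:

```go
package main

import (
	"context"
	"fmt"
	"os"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// Hypothetical init-container helper: list the policy ConfigMaps and write
// their contents to a directory shared with the OPA container, so OPA can be
// started with those files and never serves a request without them.
func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	ns := os.Getenv("POD_NAMESPACE") // assumed to be injected via the downward API
	outDir := "/policies"            // assumed shared emptyDir mounted into the OPA container

	cms, err := client.CoreV1().ConfigMaps(ns).List(context.Background(), metav1.ListOptions{
		// the label kube-mgmt watches by default, if I'm reading the docs right
		LabelSelector: "openpolicyagent.org/policy=rego",
	})
	if err != nil {
		panic(err)
	}

	for _, cm := range cms.Items {
		for key, rego := range cm.Data {
			path := filepath.Join(outDir, fmt.Sprintf("%s.%s", cm.Name, key))
			if err := os.WriteFile(path, []byte(rego), 0o644); err != nil {
				panic(err)
			}
		}
	}
}
```

OPA could then be started pointing at that directory (e.g. `opa run --server /policies`), so the policies are on disk before it answers anything.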
... at least that's the idea. I'll admit, I'm not an OPA expert so I could be missing a glaring issue with this idea somewhere. If that's the case, happy to learn something new haha :smile: