hennersz opened this issue 3 years ago
I like the idea, so long as the code path can be turned off to avoid the performance hit for those who don't use it.
I'm assuming that the call to the logger would be blocking?
Would this be required to be a unix domain socket? Or could it also take a URI? I could also see people wanting to use something like gRPC to lower serialization costs.
What the configuration for this looks like is the most vital part IMO, as the rest can likely be changed silently if need be.
+1
This approach was also discussed briefly when we were looking at metrics, events, and crds for violations. Definitely interested in seeing a more detailed design for what this might look like, deployment model, and the user experience. Note this should work for both admission denies and audit violations.
One question I have is: what drawbacks/issues did you hit with using something like a fluentbit sidecar to stream the GK logs?
Yea, this of course would be something that is not enabled by default, much like emitting events or logging denies to stdout must be explicitly enabled via a cli flag.
Initially it would probably be a blocking call for simplicity, but I would like to try streaming decisions to the sidecar in a non-blocking way so this logging method does not impact performance too much.
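As a minimal sketch of what a non-blocking hand-off could look like on the gatekeeper side (every name here is illustrative, nothing that exists in gatekeeper today): the webhook/audit code pushes onto a buffered channel and a single background goroutine drains it to the sidecar, so the admission path normally never waits on the transport.

	package exporter

	// Sketch only: DecisionLog, transport, Enqueue and drain are illustrative names.

	type DecisionLog struct {
		// decision log fields (constraint, resource, message, ...)
	}

	type transport interface {
		export(dl DecisionLog)
	}

	// A buffered channel decouples the webhook/audit path from the sidecar write.
	var decisionCh = make(chan DecisionLog, 1024)

	// Enqueue normally returns immediately; if the buffer ever fills, the caller
	// blocks (back-pressure) rather than silently losing decisions. Dropping plus
	// a "dropped decisions" metric would be the alternative trade-off.
	func Enqueue(dl DecisionLog) {
		decisionCh <- dl
	}

	// drain runs in a single background goroutine and forwards to the sidecar.
	func drain(t transport) {
		for dl := range decisionCh {
			t.export(dl)
		}
	}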
I don't want to restrict it to only working over unix domain sockets. That means if people don't want to deploy their exporter as a sidecar, but say as a separate deployment, that is possible. In Go at least it is really easy to swap a unix connection for a network connection, since they use the same interface. However, the transport implementation is supposed to be offloaded to the sidecar. The connection between GK and the exporter only has to be performant (low latency, low CPU/RAM), provide data integrity, and support a variable data shape. Unix sockets provide this. A network connection could be swapped in in place of the unix connection since they provide the same interface, but you don't get the same guarantees around performance or data integrity, depending on whether you are using a packet- or stream-based connection. So our focus would be on running the exporter as a sidecar, and design decisions would be based around this assumption.
Between GK and the sidecar I'm not sure we even need something like gRPC, since we control both ends of the connection and are only sending one datatype. So if we say "use a datagram unix socket" for GK -> exporter, then if we want REST, gRPC, or some other wire protocol (Kafka, Mongo), these are implemented in the exporter. The transport abstraction is intended to be the exporter. The unix socket is essentially a way for GK and the exporter to share memory, without having the exporter logic in the gatekeeper codebase.
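To illustrate the point about swapping connections (a sketch, not gatekeeper code; the socket path and address are placeholders): in Go, net.Dial returns a net.Conn for both unix and TCP endpoints, so the code writing decision logs would not change if the exporter moved out of the pod.

	package main

	import (
		"fmt"
		"net"
	)

	// dialExporter returns a net.Conn whether the exporter is a sidecar listening
	// on a unix socket or a separate deployment reachable over TCP.
	func dialExporter(scheme string) (net.Conn, error) {
		switch scheme {
		case "unix":
			return net.Dial("unix", "/var/run/gatekeeper/exporter.sock") // placeholder path
		case "tcp":
			return net.Dial("tcp", "exporter.gatekeeper-system.svc:9443") // placeholder address
		default:
			return nil, fmt.Errorf("unsupported scheme %q", scheme)
		}
	}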
Regarding the fluentd sidecar, it's not an option I have explored extensively, but a couple of issues I can see are that with the decision logs going to stdout they are mixed in with the application logs from gatekeeper, and really we would want to handle these separately. So we would then also need to start filtering and forwarding the logs to different places, in different manners.
Here are some further details around what we plan to implement
An important aspect of the design is how the client can be configured, such that the client can be integrated into the gatekeeper codebase and then easily reconfigured at runtime with, for example, different transports, without the interface of the client having to change at all. By default this exporter would not be used at all, so as not to affect current gatekeeper functionality. In Go/pseudocode it would look roughly like this:
type DecisionLog struct {
	// decision log properties
}

type ExporterClient struct {
	transport exporterTransport
}

type Config struct {
	// config properties, e.g. SidecarURL
}

type exporterTransport interface {
	export(dl DecisionLog)
}

func NewExporterClient(config Config) *ExporterClient {
	switch config.SidecarURL.Scheme {
	case "unix":
		return &ExporterClient{transport: newUnixTransport(config)}
	case "grpc":
		return &ExporterClient{transport: newGRPCTransport(config)}
	default:
		return nil // unsupported scheme; real code would return an error
	}
}

func (ec *ExporterClient) Log(ar AdmissionReview, result OPAResult) {
	dl := newDecisionLog(ar, result)
	ec.transport.export(dl)
}
This gives us the flexibility to reconfigure or add new transports to the client without having to change how the client gets used in the rest of the codebase. So initially we would add one transport for the sidecar, but if people want, say, one exporter shared between multiple gatekeeper instances, with the communication between them over gRPC, it shouldn't be too hard to add that as a new transport along with the required configuration options.
The config can be mounted into the gatekeeper container as a config map, and the path of the mounted config is passed to gatekeeper as a CLI arg. If the arg is supplied, gatekeeper knows to initialize the exporter client and call the exporter's log function during the webhook or audit processes. If the arg is not supplied, these code paths are skipped entirely.
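A rough sketch of how that wiring could look, building on the pseudocode above (the flag name and the use of JSON for the config are placeholders, not a decided interface):

	import (
		"encoding/json"
		"flag"
		"log"
		"os"
	)

	// Hypothetical flag name; the real name and config format are still open.
	var exporterConfigPath = flag.String(
		"decision-log-exporter-config", "",
		"path to the mounted exporter config; empty disables decision log exporting")

	// setupExporter returns nil when the flag is unset, so callers can skip the
	// exporter code paths entirely when the feature is not enabled.
	func setupExporter() *ExporterClient {
		if *exporterConfigPath == "" {
			return nil
		}
		raw, err := os.ReadFile(*exporterConfigPath)
		if err != nil {
			log.Fatalf("reading exporter config: %v", err)
		}
		var cfg Config
		if err := json.Unmarshal(raw, &cfg); err != nil {
			log.Fatalf("parsing exporter config: %v", err)
		}
		return NewExporterClient(cfg)
	}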
The initial API will consist of one endpoint. We will define a new type as part of gatekeeper that includes the fields below.
{
  "timeStamp": "2009-11-10T23:00:00Z",  // RFC3339 string
  "decisionId": "3058baaa-7c60-4b57-a634-77f7d9ee7b0f",  // UUID string
  "message": "Deployment nginx-web-server allows the following containers to run as root: web-server",  // string
  "eventType": "audit_violation",  // string, "audit_violation" | "webhook_violation"
  "constraintKind": "RunAsNonRoot",  // string
  "constraintName": "run-as-non-root",  // string
  "constraintAction": "dryrun",  // string, "dryrun" | "deny"
  "resourceKind": "Deployment",  // string
  "resourceNamespace": "website",  // string
  "resourceName": "nginx-web-server",  // string
  "input": { ... }  // object
}
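For illustration, the corresponding Go type might look something like the sketch below; the field names, json tags, and use of json.RawMessage for the input are our assumptions, not a committed gatekeeper API.

	import "encoding/json"

	// DecisionLog is a sketch of the type described above.
	type DecisionLog struct {
		Timestamp         string          `json:"timeStamp"`        // RFC3339
		DecisionID        string          `json:"decisionId"`       // UUID
		Message           string          `json:"message"`
		EventType         string          `json:"eventType"`        // "audit_violation" | "webhook_violation"
		ConstraintKind    string          `json:"constraintKind"`
		ConstraintName    string          `json:"constraintName"`
		ConstraintAction  string          `json:"constraintAction"` // "dryrun" | "deny"
		ResourceKind      string          `json:"resourceKind"`
		ResourceNamespace string          `json:"resourceNamespace"`
		ResourceName      string          `json:"resourceName"`
		Input             json.RawMessage `json:"input"`            // arbitrary object
	}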
The way we plan to deploy the export server, as a sidecar, means this is not a scenario where we expect many reasons for failure. Running as a sidecar, the server will bind to a unix domain socket and gatekeeper will connect to this socket. Because this is all done over a unix socket, which is in memory, there is no chance of things like packet loss that you would get if the server was deployed remotely. The main area where we could see errors is if gatekeeper overwhelms the server with too many decision log events. To combat this we will design the receiving server to be performant enough to receive at the rate gatekeeper can produce events, and we can also tune container limits within k8s. Although gatekeeper is limited by the network for the rate at which objects are submitted to it, a particular object could fail multiple policies and so generate multiple events. This means gatekeeper can generate events faster than it receives input, which is why we do have to consider the performance of the server relative to gatekeeper.
Pods get rescheduled from time to time. When this happens the pod will be killed and restarted on another node, which means we lose anything in memory. Pods have a grace period, so when they get terminated they have some time to finish up. When pods are scheduled for termination they are removed from a service's endpoint list, which gives our pods time to finish transferring any events from gatekeeper to the exporter sidecar without having to worry about new events coming in. It is the sidecar's job to ensure these events are persisted in a way that will survive restarts, i.e. to disk.
If a node is suddenly powered off, killing all pods immediately, we will lose any events that were created by gatekeeper but not yet transferred to the sidecar or written to disk in the sidecar. The number of events in this state is likely to be very small, assuming our sidecar is running correctly, and this is also a very unlikely event. For the audit pod this is not a problem, as it will get rescheduled and can continue its audit. For the webhook, some events can be lost where the webhook completes, sending a deny back to the API server, but the deny reasons don't all get transferred to the sidecar; assuming sufficient performance from the sidecar, the window of time for this is small enough not to be a problem.
If the external system goes down for short periods of time our exporter will be able to handle this. The decision logs themselves will continue to be stored on disk, and the thread that connects to the external system will continually retry sending the head of the queue with exponential backoff. Depending on the size of disk allocated to the exporter and the volume of events that come in, this could be fine for hours or even days, but eventually the disk will fill up. Our goal in this case should be to allow enough time to diagnose and fix why the external system has gone down, by setting up alerting around failures to send the decision logs and around disk usage.
Our plan for the first exporterTransport is to open a single unix socket stream between gatekeeper and the exporter sidecar. The data will be encoded as JSON, and each object will be delimited by a newline, i.e. ndjson. This allows us to easily encode, send, and decode the decision logs as one stream between gatekeeper and the sidecar, without the overhead of sending something like an HTTP header as well. Since we control both ends of the connection, it is not going over the network (so auth is not required), and we do not need to multiplex multiple endpoints over one connection, there seems to be little reason to use a higher-level RPC protocol. We can just stream the data over the connection and the other end will know what to do with it.
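A sketch of what that transport could look like on the gatekeeper side (the socket path parameter and error return are our additions relative to the pseudocode above): json.Encoder terminates each value with a newline, which gives us ndjson essentially for free.

	import (
		"encoding/json"
		"net"
	)

	// unixTransport streams newline-delimited JSON decision logs to the sidecar.
	type unixTransport struct {
		enc *json.Encoder
	}

	// newUnixTransport dials the sidecar's unix domain socket.
	func newUnixTransport(socketPath string) (*unixTransport, error) {
		conn, err := net.Dial("unix", socketPath)
		if err != nil {
			return nil, err
		}
		return &unixTransport{enc: json.NewEncoder(conn)}, nil
	}

	// export writes one JSON object per decision; Encode appends the trailing
	// newline itself, so each call produces exactly one ndjson record.
	func (t *unixTransport) export(dl DecisionLog) error {
		return t.enc.Encode(dl)
	}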
Once gatekeeper has transferred a decision log to the exporter, we expect our exporter to guarantee at-least-once delivery of each decision log to the external system eventually. It must be resistant to both the external system being down and the pod itself being restarted. To facilitate this, decision logs will immediately be stored in an on-disk queue when they are received. A separate thread will then peek at the head of the queue and attempt to forward it to the external system, retrying as necessary until it succeeds, at which point it will pop the exported decision log off the queue and start trying to export the next one.
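On the sidecar side, the receive/forward split could look roughly like the sketch below; the queue and sink interfaces are hypothetical, and a real implementation would add batching, larger scanner buffers, and corruption handling.

	import (
		"bufio"
		"encoding/json"
		"net"
		"time"
	)

	// queue is a hypothetical durable FIFO backed by disk.
	type queue interface {
		Push(raw []byte) error
		Peek() ([]byte, error) // oldest record, without removing it
		Pop() error
	}

	// sink is whatever external system the exporter forwards to (REST, Kafka, ...).
	type sink interface {
		Send(raw []byte) error
	}

	// receive reads ndjson records from the gatekeeper connection and persists
	// each one to disk before anything else is done with it.
	func receive(conn net.Conn, q queue) error {
		scanner := bufio.NewScanner(conn) // real code would raise the buffer limit for large inputs
		for scanner.Scan() {
			line := append([]byte(nil), scanner.Bytes()...) // copy: Scanner reuses its buffer
			if !json.Valid(line) {
				continue // skip malformed records; real code would log/count this
			}
			if err := q.Push(line); err != nil {
				return err
			}
		}
		return scanner.Err()
	}

	// forward retries the head of the queue with exponential backoff until the
	// external system accepts it, giving at-least-once delivery.
	func forward(q queue, s sink) {
		backoff := time.Second
		for {
			raw, err := q.Peek()
			if err != nil {
				time.Sleep(time.Second) // queue empty or unreadable; try again shortly
				continue
			}
			if err := s.Send(raw); err != nil {
				time.Sleep(backoff)
				if backoff < time.Minute {
					backoff *= 2
				}
				continue
			}
			backoff = time.Second
			_ = q.Pop()
		}
	}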
Thanks for this! I don't have time to give this a close reading today, but def. want to follow up. I'm wondering if we could turn this into a google doc or a PR to make commenting easier (we've generally been using Google docs).
I'm also wondering if maybe we can make this exporter model generic and use it as a way to implement semantic logging, which is currently just logging via the standard log library.
Using this as a general intermediary would make it easier to manage policy-relevant data -- controlling which information gets logged where. It would also make it clear which messages are "special" and subject to backwards compatibility requirements.
Hi, apologies for the delay. I have moved my comments above into this google doc: https://docs.google.com/document/d/1_5fv_pIxWkBNP-lbWQ6LxeyKUaeMqNUnxjd7CZXEhMc/edit?usp=sharing
Regarding your point about semantic logging, are you suggesting moving parts of the codebase like this: https://github.com/open-policy-agent/gatekeeper/blob/master/pkg/audit/manager.go#L770 to the exporter code, to make sure the stdout logs are consistent both with what is exported and between, say, the webhook and the audit controllers?
Hi @hennersz, this is an interesting discussion but one point is not clear to me: the goal behind this ticket is to find a flexible and reliable way to export violation audits from GK to an external API endpoint or to stream them, e.g. to Kafka, correct?
If so, I would not consider getting data from GK logs a reliable option, because these logs have no contract for the logged violations and there is no guarantee they won't change; rather, I'd consider exporting the violation events from the constraint custom resources.
I've been thinking about building an operator to track constraint custom resource violations and export them periodically, but I still don't feel it's a flexible approach, due to the different constraint CRDs, or even a reliable one, because there could be a limit on how many violations a constraint custom resource can hold before ignoring the rest.
One correction, there is a contract WRT certain log lines:
https://open-policy-agent.github.io/gatekeeper/website/docs/audit#audit-logs
@halkalaldeh really the idea is not to take the data from the stdout logs, but to have a separate bit of code in gatekeeper sending decision logs to a sidecar container. The logic for exporting to an API or Kafka etc. is implemented in the sidecar, so that logic lives in a separate project to gatekeeper itself and the gatekeeper team isn't having to maintain exporters for every different sink people may want to use. We are currently working on a PoC of this internally, and are planning to release it back after further testing.
One issue with reading the constraint violations is that you have to set a limit on the number of violations that will be recorded in the constraint, and for our use case we don't want to lose violations.
Thanks @hennersz and @maxsmythe for your replies, yep it makes sense now.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
Some suggestions:
1) I'd personally prefer a file on a filesystem over a pubsub at the start, as discussed in #2193 – it's simpler to implement and to use, and at least in my case audit reports are of more concern than webhook controller logs.
2) These messages should include a details field, whatever format they are in.
I'd personally prefer a file on a filesystem over a pubsub at the start, as discussed in https://github.com/open-policy-agent/gatekeeper/issues/2193 – it's simpler to implement and to use, and at least in my case audit reports are of concern more than webhook controller logs
No objections to starting with a file system... I don't think this is incompatible with adding pubsub later.
these messages should include details field, whatever format they are in
"details" as in the violation[{"msg": "some bad thing", "details": {"thing": "bad"}}] {}
signature that constraint templates reports? If so, 100%
@maxsmythe yup, details as in violation details :)
@stek29 are you still interested in writing the audit report to the filesystem? If so, do you want to start with a design proposal to get the work started?
past issues: #1532 #897 #2193 #1041
@ritazh I am still interested, but I don't know how to proceed with that.
@stek29 That's great! Perhaps start with a design doc with your proposal, considerations, options. You can take a look at existing design docs: https://github.com/open-policy-agent/gatekeeper/tree/master/docs/design
This is sort of a follow-up to some of the issues discussed in #898. We were discussing the need for a reliable decision log channel and the challenges that brings up. The two existing ways of getting this information out of gatekeeper are events and stdout logs. Both of these are susceptible to loss of events, often caused by systems other than gatekeeper itself; e.g. the kube API will drop events if it is overloaded. Instead we need a reliable method for exporting decision logs that does not have to contend with, and cannot be affected by, other workloads in a cluster.
One concern that was brought up in the previous issue was having too many ways of reporting decision logs, such that it becomes unsupportable. People will likely have different requirements around performance, resource use, reliability and will want to send the decision logs to different sinks e.g. a REST endpoint or Kafka. Baking all this functionality into gatekeeper itself seems unwise.
Instead we define an API and a client in gatekeeper for sending decision logs. People can then develop their own exporter programs to handle forwarding decision logs to wherever they want. The exporter is deployed as a sidecar to gatekeeper. Gatekeeper connects to the sidecar through a unix domain socket and sends the decision logs to the sidecar.
We plan to add these changes to gatekeeper and produce a reference sidecar implementation, but wanted to hear your thoughts on this idea as well.