metal-stack / gardener-extension-provider-metal

Implementation of the gardener-extension-controller for metal-stack
MIT License

Audit to splunk #191

Closed mreiger closed 3 years ago

mreiger commented 3 years ago

Pass a config file to the auditforwarder fluent-bit that forwards copies of all audit events to Splunk. (Also set the kube-apiserver service's externalTrafficPolicy to Local so that the clients' real IP addresses appear in the audit events.)

majst01 commented 3 years ago

Why do we need another audit forwarding mechanism that targets one very specific log backend? Does the existing log forwarding not work, and if not, why not fix it?

mreiger commented 3 years ago

The mechanism for forwarding the audit data to Splunk was changed: we now pass a config file to the fluent-bit in the auditforwarder, which sends copies of the audit events directly from the seed to Splunk. We need the Splunk connection because of compliance requirements for our gardener-operated clusters.

majst01 commented 3 years ago

After some internal discussion I would like to clarify the following aspects:

mreiger commented 3 years ago

After some internal discussion I would like to clarify the following aspects:

* logging to Splunk should be configurable per shoot in terms of:

  * enabled/disabled
  * target host
  * credentials

* this must be possible also for already created shoots

* by default we should only consider production clusters (selected by the shoot's purpose)
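The requirements above can be sketched as a small resolution function. All type and function names here (`SplunkTarget`, `SplunkOverride`, `Shoot`, `EffectiveSplunk`) are hypothetical and only illustrate the intended semantics; the purpose value `"production"` is assumed to be the one Gardener uses for prod shoots.

```go
package main

// SplunkTarget describes where audit events are sent (hypothetical type).
type SplunkTarget struct {
	Host       string
	SecretName string // name of a secret holding the credentials
}

// SplunkOverride is an optional per-shoot setting. Nil fields mean
// "not set on this shoot, use the default".
type SplunkOverride struct {
	Enabled *bool
	Target  *SplunkTarget
}

// Shoot is a stripped-down stand-in for the real shoot resource.
type Shoot struct {
	Purpose string
	Splunk  *SplunkOverride
}

// EffectiveSplunk returns the target to use and whether forwarding is on.
// Without an override, only shoots with purpose "production" are enabled.
func EffectiveSplunk(s Shoot, def SplunkTarget) (SplunkTarget, bool) {
	enabled := s.Purpose == "production"
	target := def
	if s.Splunk != nil {
		if s.Splunk.Enabled != nil {
			enabled = *s.Splunk.Enabled
		}
		if s.Splunk.Target != nil {
			target = *s.Splunk.Target
		}
	}
	return target, enabled
}
```

Because the result is computed from the shoot's current state, calling such a function on every reconcile would also cover shoots that already exist.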

Current state of discussion:

* what happens with the logs if the Splunk endpoint is not reachable? Are the events spooled to a local PV or dropped? If they are spooled, how much space should be reserved for spooling? What happens to audit logs if the spooling area is full?

The auditforwarder buffers in memory; the current version implements a memory buffer limit, and once it is full, further audit events are dropped.

The default memory buffer limit is 200 MB, which is enough for some hours of logs.

Also, the auditforwarder container now gets deployed with resource limits, so it cannot grow indefinitely anyway.
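The drop-when-full behaviour described above can be illustrated with a small bounded buffer. This is only a sketch of the stated semantics and says nothing about how fluent-bit implements its memory limit internally; `BoundedBuffer` and its methods are hypothetical names.

```go
package main

// BoundedBuffer keeps events in memory up to a fixed number of bytes.
// Events that would exceed the limit are dropped and counted.
type BoundedBuffer struct {
	limit   int // maximum buffered bytes
	used    int // bytes currently buffered
	dropped int // number of events dropped so far
	events  [][]byte
}

// NewBoundedBuffer creates a buffer with the given byte limit.
func NewBoundedBuffer(limitBytes int) *BoundedBuffer {
	return &BoundedBuffer{limit: limitBytes}
}

// Add stores the event and returns true, or drops it and returns false
// if storing it would exceed the limit.
func (b *BoundedBuffer) Add(event []byte) bool {
	if b.used+len(event) > b.limit {
		b.dropped++
		return false
	}
	b.events = append(b.events, event)
	b.used += len(event)
	return true
}

// Flush returns all buffered events and frees the space, as would happen
// once the backend is reachable again.
func (b *BoundedBuffer) Flush() [][]byte {
	out := b.events
	b.events = nil
	b.used = 0
	return out
}

// Dropped reports how many events were lost because the buffer was full.
func (b *BoundedBuffer) Dropped() int {
	return b.dropped
}
```

Exposing the drop counter matters for audit data in particular: it lets an operator see that events were lost during an outage rather than silently missing them.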

I hope this addresses most concerns.

Gerrit91 commented 3 years ago

Today we had a discussion on how we could achieve logging on a per-shoot configuration basis while falling back to a default audit logging setup. I want to keep these ideas for future reference.

Technically possible:

Not possible:

mreiger commented 3 years ago

Record of the decisions made today: