openshift / origin-aggregated-logging


Fluentd send audit logs to Elasticsearch #1226

Closed skunkr closed 6 years ago

skunkr commented 6 years ago

Hello, I am trying to figure out how to parse the audit log file and send it to Elasticsearch, considering the log is now in the new JSON format. I have added the following to master-config.yaml:

auditConfig:
  auditFilePath: /var/log/audit-ocp.log
  enabled: true

In 3.7 it worked by following this procedure: https://github.com/rbo/openshift-examples/tree/master/efk-auditlog. In 3.9 the filter does not work anymore. What should the fluentd config file look like so that it handles this new format? Thank you, have a nice day!

richm commented 6 years ago

Not sure. You may have to ask the author of https://github.com/rbo/openshift-examples/tree/master/efk-auditlog to port the config to work with 3.9.

StevenBarre commented 6 years ago

These changes got it working for me.

# diff -u input-auditlog.conf.orig input-auditlog.conf
--- input-auditlog.conf.orig    2018-07-09 16:13:42.798648052 -0700
+++ input-auditlog.conf 2018-07-09 16:14:06.830314628 -0700
@@ -4,7 +4,7 @@
   pos_file /var/log/auditlog.pos
   time_format %Y-%m-%dT%H:%M:%S
   tag auditlog.requests
-  format /^(?<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}).*AUDIT:\s*id="(?<auditlog.id>.*?)"\s*(ip="(?<auditlog.ip>.*?)")?\s*(method="(?<auditlog.medthod>.*?)")?\s*(user="(?<auditlog.user>.*?)")?\s*(groups="(?<auditlog.groups>.*?)")?\s*(as="(?<auditlog.as>.*?)")?\s*(asgroups="(?<auditlog.asgroups>.*?)")?\s*(namespace="(?<auditlog.namespace>.*?)")?\s*(uri="(?<auditlog.uri>.*?)")?\s*(response="(?<auditlog.response>.*?)")?$/
+  format /^(?<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}).*AUDIT:\s*id="(?<auditlog.id>.*?)"\s*(stage="(?<auditlog.stage>.*?)")?\s*(ip="(?<auditlog.ip>.*?)")?\s*(method="(?<auditlog.medthod>.*?)")?\s*(user="(?<auditlog.user>.*?)")?\s*(groups="(?<auditlog.groups>.*?)")?\s*(as="(?<auditlog.as>.*?)")?\s*(asgroups="(?<auditlog.asgroups>.*?)")?\s*(namespace="(?<auditlog.namespace>.*?)")?\s*(uri="(?<auditlog.uri>.*?)")?\s*(response="(?<auditlog.response>.*?)")?$/
 </source>

 <filter auditlog**>
@@ -12,7 +12,7 @@
   enable_ruby
   <record>
     @timestamp ${record['@timestamp'].nil? ? Time.at(time).getutc.to_datetime.rfc3339(6) : Time.parse(record['@timestamp']).getutc.to_datetime.rfc3339(6)}
-    auditlog.hostname ${_HOSTNAME.eql?('localhost') ? (begin; File.open('/etc/docker-hostname') { |f| f.readline }.rstrip; rescue; _HOSTNAME; end) : _HOSTNAME}
+    auditlog.hostname ${(begin; File.open('/etc/docker-hostname') { |f| f.readline }.rstrip; rescue; "localhost"; end)}
   </record>
 </filter>
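For reference, a hypothetical 3.9-style legacy audit line that the updated pattern is meant to match; all values here are illustrative, and the new stage= field is what the old regex could not handle:

2018-07-09T16:20:01.000000Z AUDIT: id="00000000-0000-0000-0000-000000000000" stage="ResponseComplete" ip="10.0.0.1" method="get" user="system:admin" groups="\"system:cluster-admins\"" as="<self>" asgroups="<lookup>" namespace="default" uri="/api/v1/namespaces/default/pods" response="200"
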
skunkr commented 6 years ago

Thank you very much, Steven! Can you please post your actual input-auditlog.conf file? I made the changes, but I am getting some fluentd parsing errors. Thank you very much!


StevenBarre commented 6 years ago

/etc/origin/master/master-config.yaml

auditConfig:
  auditFilePath: "/var/log/audit-ocp.log"
  enabled: true
  maximumFileRetentionDays: 7
  maximumFileSizeMegabytes: 50
  maximumRetainedFiles: 100
  logFormat: legacy
  policyFile: /etc/origin/master/audit-policy.yaml
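The policy file referenced above is not shown here; a minimal, purely illustrative policy (using the audit.k8s.io/v1beta1 schema) that records metadata for every request could look like this - adjust the rules to your needs:

apiVersion: audit.k8s.io/v1beta1
kind: Policy
rules:
- level: Metadata
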

input-auditlog.conf

<source>
  @type tail
  path /var/log/audit-ocp.log
  pos_file /var/log/auditlog.pos
  time_format %Y-%m-%dT%H:%M:%S
  tag auditlog.requests
  format /^(?<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}).*AUDIT:\s*id="(?<auditlog.id>.*?)"\s*(stage="(?<auditlog.stage>.*?)")?\s*(ip="(?<auditlog.ip>.*?)")?\s*(method="(?<auditlog.medthod>.*?)")?\s*(user="(?<auditlog.user>.*?)")?\s*(groups="(?<auditlog.groups>.*?)")?\s*(as="(?<auditlog.as>.*?)")?\s*(asgroups="(?<auditlog.asgroups>.*?)")?\s*(namespace="(?<auditlog.namespace>.*?)")?\s*(uri="(?<auditlog.uri>.*?)")?\s*(response="(?<auditlog.response>.*?)")?$/
</source>

<filter auditlog**>
  @type record_transformer
  enable_ruby
  <record>
    @timestamp ${record['@timestamp'].nil? ? Time.at(time).getutc.to_datetime.rfc3339(6) : Time.parse(record['@timestamp']).getutc.to_datetime.rfc3339(6)}
    auditlog.hostname ${(begin; File.open('/etc/docker-hostname') { |f| f.readline }.rstrip; rescue; "localhost"; end)}
  </record>
</filter>

<match auditlog**>
  @type copy
  <store>
    @type elasticsearch_dynamic
    log_level debug
    host "#{ENV['OPS_HOST']}"
    port "#{ENV['OPS_PORT']}"
    scheme https
    index_name .operations.auditlog.${record['@timestamp'].nil? ? Time.at(time).getutc.strftime(@logstash_dateformat) : Time.parse(record['@timestamp']).getutc.strftime(@logstash_dateformat)}

    user fluentd
    password changeme

    client_key "#{ENV['OPS_CLIENT_KEY']}"
    client_cert "#{ENV['OPS_CLIENT_CERT']}"
    ca_file "#{ENV['OPS_CA']}"

    type_name com.redhat.ocp.auditlog

    # there is currently a bug in the es plugin + excon - cannot
    # recreate/reload connections
    reload_connections false
    reload_on_failure false
    flush_interval 5s
    max_retry_wait 300
    disable_retry_limit true
    buffer_queue_limit "#{ENV['BUFFER_QUEUE_LIMIT'] || '1024' }"
    buffer_chunk_limit "#{ENV['BUFFER_SIZE_LIMIT'] || '1m' }"
    # the systemd journald 0.0.8 input plugin will just throw away records if the buffer
    # queue limit is hit - 'block' will halt further reads and keep retrying to flush the
    # buffer to the remote - default is 'exception' because in_tail handles that case
    buffer_queue_full_action "#{ENV['BUFFER_QUEUE_FULL_ACTION'] || 'exception'}"
  </store>
  #<store>
  #  @type stdout
  #</store>
</match>
skunkr commented 6 years ago

Thank you so much, Steven!


duritong commented 5 years ago

With OCP 3.11 (and likely earlier as well) the log format is now pure JSON, and the following config works:

<source>
  @type tail
  path /var/lib/origin/oscp-audit/oscp-audit.log
  pos_file /var/log/auditlog.pos
  tag auditlog.requests
  format json
  time_key timestamp
  time_format %iso8601
</source>

<filter auditlog**>
  @type record_transformer
  enable_ruby
  <record>
    @timestamp ${record['@timestamp'].nil? ? Time.at(time).getutc.to_datetime.rfc3339(6) : Time.parse(record['@timestamp']).getutc.to_datetime.rfc3339(6)}
    auditlog.hostname ${(begin; File.open('/etc/docker-hostname') { |f| f.readline }.rstrip; rescue; "localhost"; end)}
  </record>
</filter>

<match auditlog**>
  @type copy
  <store>
    @type elasticsearch_dynamic
    log_level debug
    host "#{ENV['OPS_HOST']}"
    port "#{ENV['OPS_PORT']}"
    scheme https
    index_name .operations.auditlog.${record['@timestamp'].nil? ? Time.at(time).getutc.strftime(@logstash_dateformat) : Time.parse(record['@timestamp']).getutc.strftime(@logstash_dateformat)}
    user fluentd
    password changeme
    client_key "#{ENV['OPS_CLIENT_KEY']}"
    client_cert "#{ENV['OPS_CLIENT_CERT']}"
    ca_file "#{ENV['OPS_CA']}"
    type_name com.redhat.ocp.auditlog
    # there is currently a bug in the es plugin + excon - cannot
    # recreate/reload connections
    reload_connections false
    reload_on_failure false
    flush_interval 5s
    max_retry_wait 300 
    disable_retry_limit true
    buffer_queue_limit "#{ENV['BUFFER_QUEUE_LIMIT'] || '1024' }"
    buffer_chunk_limit "#{ENV['BUFFER_SIZE_LIMIT'] || '1m' }"
    # the systemd journald 0.0.8 input plugin will just throw away records if the buffer
    # queue limit is hit - 'block' will halt further reads and keep retrying to flush the 
    # buffer to the remote - default is 'exception' because in_tail handles that case
    buffer_queue_full_action "#{ENV['BUFFER_QUEUE_FULL_ACTION'] || 'exception'}"
  </store>
  #<store>
  #  @type stdout
  #</store>
</match>
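For reference, each line of the 3.11 audit log is a single JSON event roughly of this shape (field names follow the audit.k8s.io/v1beta1 event schema; the values below are purely illustrative), which is what the format json parser above consumes:

{"kind":"Event","apiVersion":"audit.k8s.io/v1beta1","level":"Metadata","timestamp":"2019-01-15T10:00:00Z","auditID":"00000000-0000-0000-0000-000000000000","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/default/pods","verb":"list","user":{"username":"system:admin","groups":["system:cluster-admins","system:authenticated"]},"sourceIPs":["10.0.0.1"],"responseStatus":{"code":200},"requestReceivedTimestamp":"2019-01-15T10:00:00.000000Z","stageTimestamp":"2019-01-15T10:00:00.001000Z"}

Note the top-level timestamp field, which is why time_key timestamp is used in the source above.
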
jefflee416 commented 5 years ago

Hi duritong,

I tried your config on OCP 3.11; however, I found there is an OCP system issue that prevents this from working.

Basically, when we enable auditConfig, the API audit logs have to be written to a file. At the moment the master-api pod can only write to /var/lib/origin, because that is the only directory mounted into the master-api pod. This explains why we can't write to other directories on the host.

Then, in order to send this log to EFK, the directory has to be accessible to fluentd. Unfortunately, fluentd only mounts the /var/log and /var/docker directories from the host, so fluentd can't see the API log files.

So how do you solve this challenge?

duritong commented 5 years ago

You can either add the subpath under /var/lib/origin to the fluentd daemonset as well, or - which is what I am doing in other places - deploy another daemonset dedicated to the masters that runs fluent-bit to ship that logfile. This is really simple to implement.
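For the first option, a minimal sketch of the daemonset change (assuming the collector daemonset is called logging-fluentd, the container is named fluentd-elasticsearch, and the audit log lives under /var/lib/origin/oscp-audit - names and paths will differ per installation):

# hypothetical strategic-merge patch for the logging-fluentd daemonset
spec:
  template:
    spec:
      containers:
      - name: fluentd-elasticsearch   # match the container name in your daemonset
        volumeMounts:
        - name: ocp-audit
          mountPath: /var/lib/origin/oscp-audit
          readOnly: true
      volumes:
      - name: ocp-audit
        hostPath:
          path: /var/lib/origin/oscp-audit

Because containers and volumes are merged by name in a strategic-merge patch, listing only the audit volume adds the mount without replacing the existing ones.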