splunk / splunk-connect-for-kubernetes

Helm charts associated with kubernetes plug-ins
Apache License 2.0

Feature: Ignore logs from containers/namespaces #152

Closed: vidarw closed this issue 5 years ago

vidarw commented 5 years ago

What would you like to be added: We need a setting to ignore certain containers and/or namespaces.

Why is this needed: We are using a default AKS setup, and certain built-in containers/namespaces are flooding our Splunk index. In particular, kube:container:redirector is currently pushing 70% of the log data, and its log information is almost useless under normal operation.

matthewmodestino commented 5 years ago

Hi vidarw,

Can I assume you are deploying with Helm?

In the meantime, I can help you achieve this by simply updating your logging pod's configmap.

<source>
  @id containers.log
  @type tail
  @label @SPLUNK
  tag tail.containers.*
  path /var/log/containers/*.log
  pos_file /var/log/splunk-fluentd-containers.log.pos
  path_key source
  read_from_head true
  <parse>
    @type json
    time_key time
    time_type string
    time_format %Y-%m-%dT%H:%M:%S.%NZ
    localtime false
  </parse>
</source>

The path setting in the fluentd source configuration controls which files are tailed, so you can use it to filter out anything you don't want or need.

https://docs.fluentd.org/input/tail#path
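For reference, here is a minimal sketch of how that source block could be adjusted. The exclude_path globs below are illustrative assumptions (exclude_path is a standard fluentd in_tail option), so adjust them to the pods and namespaces you actually want to drop:

<source>
  @id containers.log
  @type tail
  @label @SPLUNK
  tag tail.containers.*
  # Keep tailing everything under /var/log/containers ...
  path /var/log/containers/*.log
  # ... but skip files matching these globs (example noisy built-ins)
  exclude_path ["/var/log/containers/kube-svc-redirect*.log", "/var/log/containers/*_kube-system_*.log"]
  pos_file /var/log/splunk-fluentd-containers.log.pos
  path_key source
  read_from_head true
  <parse>
    @type json
    time_key time
    time_type string
    time_format %Y-%m-%dT%H:%M:%S.%NZ
    localtime false
  </parse>
</source>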

A PR for allowing the helm chart to set this path should be pretty straightforward.

Will make sure we have a look.

chaitanyaphalak commented 5 years ago

Fixed in https://github.com/splunk/splunk-connect-for-kubernetes/pull/174
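With that change, the same filtering can be expressed through the chart's values.yaml instead of hand-editing the configmap. A minimal sketch, assuming the fluentd exclude_path option added by that PR (the globs are examples only):

fluentd:
  path: /var/log/containers/*.log
  exclude_path:
    - /var/log/containers/kube-svc-redirect*.log
    - /var/log/containers/*_kube-system_*.log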

yogeshbidari commented 5 years ago

Hi Vidarw,

We are using exclude_path in the values.yaml file, but it is still not restricting the logs. Below is the code we are using to exclude the files:

fluentd:
  exclude_path:

We are also trying to completely restrict logs from the kube-system namespace, but we are not able to achieve it. We used a match block in the output.conf section of the configMap.yaml file to drop the kube-system logs:

output.conf: |-
  <match kubernetes.var.log.containers.**kube-system**.log>
    @type null
  </match>

Is any further configuration needed to restrict the kube-system namespace logs? Please assist.

PrabuddhaRaj commented 4 years ago

@yogeshbidari were you able to resolve this issue? I am also not able to restrict the various container logs.

matthewmodestino commented 4 years ago

@yogeshbidari your use of double asterisks is incorrect. It should be single asterisks!
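For reference, a sketch of that match block with single asterisks, assuming the tag format used in the earlier comment (confirm the prefix against the tags your deployment actually emits):

<match kubernetes.var.log.containers.*kube-system*.log>
  @type null
</match>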

@PrabuddhaRaj More details, please! Can we see your config? What does your configmap look like in the cluster once deployed?

PrabuddhaRaj commented 4 years ago

Hi @matthewmodestino

fluentd:
  path: /var/log/containers/analytics.log, /var/log/containers/c4c-mock.log, /var/log/containers/commerce-mock.log, /var/log/containers/marketing-mock.log
  exclude_path:

matthewmodestino commented 4 years ago

OK, but what version of the chart, and what does the resulting configmap look like, specifically the source section for containers.log?

Maybe it is better to open your own issue with more details.

PrabuddhaRaj commented 4 years ago
<source>
  @id containers.log
  @type tail
  @label @CONCAT
  tag tail.containers.*
  path /var/log/containers/*analytics*.log, /var/log/containers/*c4c-mock*.log, /var/log/containers/*commerce-mock*.log, /var/log/containers/*marketing-mock*.log
  exclude_path ["/var/log/containers/kube-svc-redirect*.log","/var/log/containers/tiller*.log","/var/log/containers/*_kube-system_*.log (to exclude `kube-system` namespace)","/var/log/containers/*splunklogging*.log"]
  pos_file /var/log/splunk-fluentd-containers.log.pos
  path_key source
  read_from_head true
  <parse>
    @type json
    time_format %Y-%m-%dT%H:%M:%S.%NZ
    time_key time
    time_type string
    localtime false
    refresh_interval 60
  </parse>
</source>
source.files.conf:
----
# This fluentd conf file contains sources for log files other than container logs.
<source>
  @id tail.file.kube-audit
  @type tail
  @label @CONCAT
  tag tail.file.kube:apiserver-audit
  path /var/log/kube-apiserver-audit.log
  pos_file /var/log/splunk-fluentd-kube-audit.pos
  read_from_head true
  path_key source
  <parse>
    @type regexp
    expression /^(?<log>.*)$/

    time_key time
    time_type string
    time_format %Y-%m-%dT%H:%M:%SZ
  </parse>
</source>

@matthewmodestino: I think that the splunklogging side pod is not part of the container logs; that is the reason it is not getting excluded.

PrabuddhaRaj commented 4 years ago

@matthewmodestino the chart version is 1.4.2.

matthewmodestino commented 4 years ago

@PrabuddhaRaj

Can you elaborate on the source of these logs? Did you add them as a custom log source?

PrabuddhaRaj commented 4 years ago

@matthewmodestino maybe I didn't get your question, but I added them in the values.yaml file, in the path section of fluentd. I am currently running two pods on my node: c4c-mock-** and splunklogging-splunk-kubernetes-logging. I want logs only from the c4c-mock pod, but I am getting them from both. c4c-mock is my application. Below is the logs section of the values.yaml file that I am using:

logs:
  analytics:
    from:
      pod: analytics
      container: analytics
    timestampExtraction:
      regexp: (?
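For comparison, a complete logs entry in this chart's values.yaml usually looks roughly like the sketch below; the timestamp regexp, format, and sourcetype here are illustrative assumptions, not values taken from this thread:

logs:
  c4c-mock:
    from:
      pod: c4c-mock
      container: c4c-mock
    timestampExtraction:
      # Assumed ISO-8601-style timestamp at the start of each log line
      regexp: (?<time>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})
      format: "%Y-%m-%dT%H:%M:%S"
    sourcetype: kube:c4c-mock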