splunk / splunk-connect-for-kubernetes

Helm charts associated with kubernetes plug-ins
Apache License 2.0

unable to remove logs of specific container #456

Closed PrabuddhaRaj closed 4 years ago

PrabuddhaRaj commented 4 years ago

What happened: I wanted to exclude the logs of my sidecar pod running on my Kubernetes node, but even after excluding it from path, I still see the logs in Splunk. Secondly, I want to receive only application logs from the c4c-mock pod, which is my main application. Below is the fluentd section of my values.yaml file:

fluentd:
  path: /var/log/containers/analytics.log, /var/log/containers/c4c-mock.log, /var/log/containers/commerce-mock.log, /var/log/containers/marketing-mock.log
  exclude_path:

What you expected to happen: Below is my deployed configMap; it contains the exclude path, but Splunk still shows logs from that pod.

<source>
  @id containers.log
  @type tail
  @label @CONCAT
  tag tail.containers.*
  path /var/log/containers/*analytics*.log, /var/log/containers/*c4c-mock*.log, /var/log/containers/*commerce-mock*.log, /var/log/containers/*marketing-mock*.log
  exclude_path ["/var/log/containers/kube-svc-redirect*.log","/var/log/containers/tiller*.log","/var/log/containers/*_kube-system_*.log (to exclude `kube-system` namespace)","/var/log/containers/*splunklogging*.log"]
  pos_file /var/log/splunk-fluentd-containers.log.pos
  path_key source
  read_from_head true
  <parse>
    @type json
    time_format %Y-%m-%dT%H:%M:%S.%NZ
    time_key time
    time_type string
    localtime false
    refresh_interval 60
  </parse>
</source>
source.files.conf:
----
# This fluentd conf file contains sources for log files other than container logs.
<source>
  @id tail.file.kube-audit
  @type tail
  @label @CONCAT
  tag tail.file.kube:apiserver-audit
  path /var/log/kube-apiserver-audit.log
  pos_file /var/log/splunk-fluentd-kube-audit.pos
  read_from_head true
  path_key source
  <parse>
    @type regexp
    expression /^(?<log>.*)$/

    time_key time
    time_type string
    time_format %Y-%m-%dT%H:%M:%SZ
  </parse>
</source>

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?: Below is the logs section of the values.yaml file I am using.

logs:
  analytics:
    from:
      pod: analytics
      container: analytics
    timestampExtraction:
      regexp: (?<time>\d{4}-\d{2}-\d{2} [0-2]\d:[0-5]\d:[0-5]\d.\d{6})
      format: "%Y-%m-%d %H:%M:%S.%N"
  c4c-mock:
    from:
      pod: c4c-mock
      container: c4c-mock
    timestampExtraction:
      regexp: (?<time>\d{4}-\d{2}-\d{2} [0-2]\d:[0-5]\d:[0-5]\d.\d{6})
      format: "%Y-%m-%d %H:%M:%S.%N"
  commerce-mock:
    from:
      pod: commerce-mock
      container: commerce-mock
    timestampExtraction:
      regexp: (?<time>\d{4}-\d{2}-\d{2} [0-2]\d:[0-5]\d:[0-5]\d.\d{6})
      format: "%Y-%m-%d %H:%M:%S.%N"
  marketing-mock:
    from:
      pod: marketing-mock
      container: marketing-mock
    timestampExtraction:
      regexp: (?<time>\d{4}-\d{2}-\d{2} [0-2]\d:[0-5]\d:[0-5]\d.\d{6})
      format: "%Y-%m-%d %H:%M:%S.%N"

Environment:

matthewmodestino commented 4 years ago

Hi!

I think there may be some confusion about the source of those logs.

Can you confirm the sourcetype of the event you are trying to exclude?

rockb1017 commented 4 years ago

By the way, you should delete the " (to exclude kube-system namespace)" portion; it has been pasted inside the exclude_path string, so that glob never matches. Regarding the annotation, you can add it to your deployment.yaml file.
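
For reference, the corrected exclude_path is a plain array with no inline commentary:

exclude_path ["/var/log/containers/kube-svc-redirect*.log","/var/log/containers/tiller*.log","/var/log/containers/*_kube-system_*.log","/var/log/containers/*splunklogging*.log"]

And a minimal sketch of the annotation route, assuming the exclusion annotation referred to here is splunk.com/exclude (check the chart's README for the exact key before relying on it):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: splunklogging-sidecar   # hypothetical deployment name for the sidecar
spec:
  template:
    metadata:
      annotations:
        splunk.com/exclude: "true"   # assumed annotation key; not confirmed in this thread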

PrabuddhaRaj commented 4 years ago

By the way, you should delete the " (to exclude kube-system namespace)" portion; it has been pasted inside the exclude_path string, so that glob never matches. Regarding the annotation, you can add it to your deployment.yaml file.

Thanks @rockb1017. Actually, I resolved that issue by removing the following includes from configMap.yaml:

@include source.files.conf
@include source.journald.conf
@include monit.conf

But I can try what you have suggested. Secondly, I am getting:

[Screenshots: unwanted log events showing up in Splunk search results]

I want to drop these logs from my application container, marketing-mock; the container emits all of these unwanted log lines. Any suggestion on how to remove them?

PrabuddhaRaj commented 4 years ago

@rockb1017: I was able to resolve this by using the grep plugin and sending only application access logs to Splunk. But now I want to convert the text-format log messages to JSON. I tried the formatter json plugin with the block below:

<match tail.containers.*>
  @type file
  path /var/log/containers/marketing-mock*.log
  <format>
    @type json
  </format>
</match>

But this doesn't work. Is there any way to convert the application logs to JSON in fluentd? I tried the parser json plugin too.
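
For context, a minimal sketch of that grep approach, assuming the access-log events all carry the literal marker [API Access Log] in the log field (the marker is taken from the screenshot excerpt quoted below):

<filter tail.containers.**>
  @type grep
  <regexp>
    # keep only records whose log field matches; everything else is dropped
    key log
    pattern /\[API Access Log\]/
  </regexp>
</filter>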

[Screenshot: application access-log events in Splunk, still formatted as plain strings]

rockb1017 commented 4 years ago

Based on your screenshot, it seems it is already JSON? The event itself is just a string: [INFO] [API Access Log]: [timestamp] "GET http/1.1" "/api.........
What plugin are you trying to use in this block?

<match tail.containers.**>
  @type file
  path /var/log/containers/marketing-mock.log
  <format>
    @type json
  </format>
</match>

matthewmodestino commented 4 years ago

Yeah, I'm not sure I'm following what you are configuring, and I worry we may be struggling with something we already solved.

Are you a current customer? Maybe a Zoom with your account team would help?

You can tell your SE to reach out to me and we can get this sorted, or join the Slack chat: splk.it/slack

PrabuddhaRaj commented 4 years ago

Based on your screenshot, it seems it is already JSON? The event itself is just a string: [INFO] [API Access Log]: [timestamp] "GET http/1.1" "/api......... What plugin are you trying to use in this block?

<match tail.containers.**>
  @type file
  path /var/log/containers/marketing-mock.log
  <format>
    @type json
  </format>
</match>

@rockb1017: Actually, I am doing this in output.conf, but I think you are correct: since there is no key attached to the initial [INFO] API access log, I cannot convert it to JSON the way I want; the parser needs key-value pairs. My use case: my company has various LOBs, and each LOB logs in a different format, so I need a standardized way to convert the logs to JSON in fluentd and then derive meaningful insights from them. The order of fields in a log line may differ, and the key in one log might be applicationID while another uses appid. Some LOBs already send JSON, which I can easily consume in Splunk. What is the best way to implement this? I think I can enrich the logs through tagging, or parse them so that I can add key-value pairs and make them easily consumable in Splunk.
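
One way to sketch that normalization, assuming the heterogeneous keys really are applicationID and appid as described above: a record_transformer filter with enable_ruby that coalesces whichever key a given LOB uses into one standard field.

<filter tail.containers.**>
  @type record_transformer
  enable_ruby true
  <record>
    # take whichever key this LOB happens to emit; both names are taken from the thread
    application_id ${record["applicationID"] || record["appid"]}
  </record>
</filter>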

PrabuddhaRaj commented 4 years ago

Yeah, I'm not sure I'm following what you are configuring, and I worry we may be struggling with something we already solved.

Are you a current customer? Maybe a Zoom with your account team would help?

You can tell your SE to reach out to me and we can get this sorted, or join the Slack chat: splk.it/slack

@matthewmodestino please send an invite to my email, prabuddha.raj@sap.com, and we can proceed.

PrabuddhaRaj commented 4 years ago

Hi @rockb1017, is there a way to fetch just some key-value pairs from the log and pass them on as JSON via the record_transformer plugin? I am trying to fetch applicationID as JSON from the log.

<filter tail.containers.**>
  @type record_transformer
  enable_ruby
  <record>
    # set the sourcetype from splunk.com/sourcetype pod annotation or set it to kube:container:CONTAINER_NAME
    sourcetype ${record.dig("kubernetes", "annotations", "splunk.com/sourcetype") ? "kube:"+record.dig("kubernetes", "annotations", "splunk.com/sourcetype") : "kube:container:"+record.dig("kubernetes","container_name")}
    container_name ${record.dig("kubernetes","container_name")}
    namespace ${record.dig("kubernetes","namespace_name")}
    pod ${record.dig("kubernetes","pod_name")}
    container_id ${record.dig("docker","container_id")}
    pod_uid ${record.dig("kubernetes","pod_id")}
    container_image ${record.dig("kubernetes","container_image")}
    application_id ${record["applicationID:"]}
    trueCaller_ip ${record["TrueCallerIP:"]}
    tenant_id ${record["TenantID:"]}
  </record>
</filter>

rockb1017 commented 4 years ago

You can use regex parsing to extract what you need. Please refer to this Fluentd document: https://docs.fluentd.org/filter/parser. Thank you, and I will close this!
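
To make that concrete, a hedged sketch of a parser filter that pulls key-value pairs out of the log string. The expression below guesses the field layout from the screenshot excerpt quoted earlier ([INFO] [API Access Log]: [timestamp] "GET http/1.1" "/api...) and would need adjusting to the real format:

<filter tail.containers.**>
  @type parser
  key_name log
  # keep the original fields (kubernetes metadata etc.) alongside the extracted ones
  reserve_data true
  <parse>
    @type regexp
    # hypothetical layout: [LEVEL] [API Access Log]: [timestamp] "METHOD proto" "path"
    expression /^\[(?<level>\w+)\] \[API Access Log\]: \[(?<time>[^\]]+)\] "(?<method>\S+) (?<proto>\S+)" "(?<path>[^"]+)"/
  </parse>
</filter>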