fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0

[filter_kubernetes] enhancement: provide mechanism to exclude containers from fluent bit via annotations #737

Open therealdwright opened 6 years ago

therealdwright commented 6 years ago

Problem Statement

In the current implementation, the most common way to get container logs parsed by fluentbit in a Kubernetes cluster is to apply a catch-all input to every container, using a configuration like the one detailed here:

input-kubernetes.conf: |
  [INPUT]
      Name              tail
      Tag               kube.*
      Path              /var/log/containers/*.log
      Parser            docker
      DB                /var/log/flb_kube.db

This represents a bit of a problem with dev containers sometimes polluting our log platform with unwanted logs. We have a select few services we want to include, but the default rule is to exclude.

Describe the solution you'd like

A simple way to include logs, the opposite of how #555 was implemented: you annotate containers with fluentbit.io/include: "true" and the fluentbit daemonset picks up only those logs.

Describe alternatives you've considered

I've updated my Path in the above config to /var/log/containers/*deployment*.log and ensured that every deployment whose logs I want to aggregate has deployment in its name.

Additional context

I have a Kubernetes cluster set up with kops 1.10 and used https://github.com/fluent/fluent-bit-kubernetes-logging to set up fluentbit which then forwards to a fluentd service running the logz.io plugin.

edsiper commented 6 years ago

looks like I misunderstood the original requirement, you can try the following annotation:

fluentbit.io/exclude: "true"

therealdwright commented 6 years ago

Hi @edsiper thanks for replying, but I think I've poorly worded this enhancement sorry.

I want fluent-bit to exclude by default, and to annotate a select few deployments that should be included.

kaspernissen commented 5 years ago

I'm searching for a similar thing. I want to e.g. say that all pods with label: something should be included, otherwise just discard the logs. Maybe something like:

 K8S-Logging.include label=something,label=anotherthing

botzill commented 5 years ago

Hi, is there any solution to this so far? I'm also looking for this. It would be really helpful.

Thx!

slayerjain commented 5 years ago

Hey, Has anyone made any progress here? I'm also trying to figure out a solution to this :)

infa-ddeore commented 4 years ago

This would be a very useful feature: end users would control which logs are sent by adding annotations.

albertocsm commented 4 years ago

+1 for an option to exclude by default and opt in to inclusion

infa-ddeore commented 4 years ago

Sad that this issue has been open since Sep 2018 and there is still no solution.

dmytroleonenko commented 4 years ago

https://github.com/dmytroleonenko/fluent-bit/tree/v1.4.2-include In case anybody is interested. I'm not sure the way I injected the logic would be fine with the upstream dev team. Works for me. If the config has include mode enabled, only pods with

fluentbit.io/include: "true"

annotations are sent

infa-ddeore commented 4 years ago

> https://github.com/dmytroleonenko/fluent-bit/tree/v1.4.2-include In case anybody is interested. I'm not sure the way I injected the logic would be fine with the upstream dev team. Works for me. If the config has include mode enabled, only pods with
>
> fluentbit.io/include: "true"
>
> annotations are sent

How do I enable include mode, and do you have a Docker image already built?

dmytroleonenko commented 4 years ago

> > https://github.com/dmytroleonenko/fluent-bit/tree/v1.4.2-include In case anybody is interested. I'm not sure the way I injected the logic would be fine with the upstream dev team. Works for me. If the config has include mode enabled, only pods with
> >
> > fluentbit.io/include: "true"
> >
> > annotations are sent
>
> How do I enable include mode, and do you have a Docker image already built?

You can enable it the same way as exclude mode: just use the word "include" instead of "exclude" in both the config and the annotation. Include should work equally well in combination with "exclude" if you want to exclude a particular container in the pod, or a specific stream (think of stdout), from a pod's logs. I use https://github.com/aws/aws-for-fluent-bit.git to build an image for the EKS logger. I slightly modified their Dockerfile to get the fluent-bit sources from a zip file (based on my fork's sources) instead of their git clone approach. I think I can build and push an image to Docker Hub. Check it here https://hub.docker.com/r/melco/aws-for-fluent-bit once Docker Hub manages to build it.

edsiper commented 4 years ago

Let me confirm the expectation for the default behavior:

One of K8S-Logging.Exclude or K8S-Logging.Include must be enabled (not both), behaviors:

| K8S-Logging.Exclude | K8S-Logging.Include | Pod Annotation | Process Log? |
| --- | --- | --- | --- |
| On | Off | exclude: "true" | No |
| On | Off | exclude: "false" | Yes |
| Off | Off | any | Yes |
| Off | On | include: "true" | Yes |
| Off | On | include: "false" | No |

comments ?

therealdwright commented 4 years ago

@edsiper I think your approach makes the most sense. We've taken another path but it sounds like there are others who would like this enhancement.

dmytroleonenko commented 4 years ago

> Let me confirm the expectation for the default behavior:
>
> One of K8S-Logging.Exclude or K8S-Logging.Include must be enabled (not both), behaviors:
>
> | K8S-Logging.Exclude | K8S-Logging.Include | Pod Annotation | Process Log? |
> | --- | --- | --- | --- |
> | On | Off | exclude: "true" | No |
> | On | Off | exclude: "false" | Yes |
> | Off | Off | any | Yes |
> | Off | On | include: "true" | Yes |
> | Off | On | include: "false" | No |
>
> comments ?

Is the "false" annotation the default behavior (no annotation == false annotation)? For example, if K8S-Logging.Include is On and none of my pods have any annotations, what would happen?

edsiper commented 4 years ago

@dmytroleonenko the proposal above says: if K8S-Logging.Include is turned on, only the Pods that have an annotation fluentbit.io/include: "true" will be included in the pipeline, otherwise discarded.
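
The proposal discussed above can be sketched as a small decision function. This is only an illustration in Python, not Fluent Bit code: the function name should_process is hypothetical, and treating a missing fluentbit.io/exclude annotation as "process the log" in exclude mode is an assumption that the table does not state explicitly.

```python
from typing import Mapping

EXCLUDE_KEY = "fluentbit.io/exclude"
INCLUDE_KEY = "fluentbit.io/include"


def should_process(exclude_on: bool, include_on: bool,
                   annotations: Mapping[str, str]) -> bool:
    """Return True if a pod's logs would enter the pipeline (hypothetical helper)."""
    if exclude_on and include_on:
        # The proposal allows only one of the two options at a time.
        raise ValueError("enable K8S-Logging.Exclude or K8S-Logging.Include, not both")
    if include_on:
        # Opt-in mode: only an explicit "true" keeps the logs; a missing
        # annotation is treated like "false" and the logs are discarded.
        return annotations.get(INCLUDE_KEY) == "true"
    if exclude_on:
        # Opt-out mode: only an explicit "true" drops the logs
        # (assumption: a missing annotation means "process").
        return annotations.get(EXCLUDE_KEY) != "true"
    # Both options off: every pod is processed regardless of annotations.
    return True
```

With include mode on, should_process(False, True, {}) returns False, which matches the answer above: pods without the annotation are discarded.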

dmytroleonenko commented 3 years ago

Any progress? Thanks

RainingNight commented 3 years ago

Any news?

jackmahoney commented 2 years ago

I had no luck so I investigated my container input. FluentBit mounts all your container log files into /var/log/containers. The [INPUT] section in the config of most standard installations uses a wildcard to match all containers. Modify the input on the agent to include only the containers you want and that will exclude all others.

Standard config wildcard (NOTE the Path field):

  application-log.conf: |
    [INPUT]
        Name                tail
        Tag                 application.*
        Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
        Path                /var/log/containers/*
        Docker_Mode         On
        Docker_Mode_Flush   5
        Docker_Mode_Parser  container_firstline
        Parser              docker
        DB                  /var/fluent-bit/state/flb_container.db
        Mem_Buf_Limit       50MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Rotate_Wait         30
        storage.type        filesystem
        Read_from_Head      ${READ_FROM_HEAD}

Change the Path field to match the containers you want. This can be a comma-separated list of patterns:

  application-log.conf: |
    [INPUT]
        Name                tail
        Tag                 application.*
        Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
        Path                /var/log/containers/my-container.log, /var/log/containers/my-other-container.log
        Docker_Mode         On
        Docker_Mode_Flush   5
        Docker_Mode_Parser  container_firstline
        Parser              docker
        DB                  /var/fluent-bit/state/flb_container.db
        Mem_Buf_Limit       50MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Rotate_Wait         30
        storage.type        filesystem
        Read_from_Head      ${READ_FROM_HEAD}

You could also use the exclude part of the config. I wrote a blog post with more info.
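
To check which files a given Path / Exclude_Path pair would select, the matching can be approximated offline. The sketch below is in Python and uses fnmatch as a stand-in for the glob matching done by the tail input, so treat it as an approximation rather than the plugin's behavior. The file names are made up, following the <pod>_<namespace>_<container>-<id>.log naming seen under /var/log/containers, and select_files is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def split_patterns(value: str) -> List[str]:
    """Split a comma-separated Path/Exclude_Path value into patterns."""
    return [p.strip() for p in value.split(",") if p.strip()]


def select_files(files: Iterable[str], path: str, exclude_path: str = "") -> List[str]:
    """Return the files matched by `path` and not matched by `exclude_path`."""
    include = split_patterns(path)
    exclude = split_patterns(exclude_path)
    return [
        f for f in files
        if any(fnmatchcase(f, p) for p in include)
        and not any(fnmatchcase(f, p) for p in exclude)
    ]


# Made-up file names, for illustration only.
FILES = [
    "/var/log/containers/my-container-7d9f_default_app-0123.log",
    "/var/log/containers/my-other-container-5c6b_default_app-4567.log",
    "/var/log/containers/kube-proxy-x2x4_kube-system_kube-proxy-89ab.log",
    "/var/log/containers/dev-sandbox-1a2b_dev_shell-cdef.log",
]

selected = select_files(
    FILES,
    path="/var/log/containers/my-container*.log, /var/log/containers/my-other-container*.log",
    exclude_path="/var/log/containers/kube-proxy*",
)
```

Here selected holds only the first two entries. The patterns end in a wildcard because real file names carry a pod suffix, namespace, and container ID, so an exact name such as my-container.log matches only if a file with precisely that name exists.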

martinkubrak commented 2 years ago

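One solution is to use a Lua filter to drop records based on labels/annotations. An example that drops all records unless the pod has the process-logs="true" label:

-- Keep a record only when its pod carries the label process-logs="true".
-- Return codes: 0 keeps the record unmodified, -1 drops it.
function drop_disabled_logs(tag, timestamp, record)
  -- Guard against records without Kubernetes metadata or without labels.
  local kubernetes = record["kubernetes"] or {}
  local labels = kubernetes["labels"] or {}
  if labels["process-logs"] == "true" then
    return 0, 0, 0
  end
  return -1, 0, 0
end

husseinraoouf commented 1 year ago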

I think excluding all container log collection by default, and including it only for annotated containers, is an important use case.

Side note: wouldn't it be better to skip reading log files we don't want in the first place, so that if the pod isn't annotated we don't even read its file? As I understand it, the proposed solution would make us tail all files and then filter logs out based on the annotation.

Side note: if every pod could provide its own log config in annotations, that would allow maximum customization, like what the Datadog agent does here.

for example

apiVersion: v1
kind: Pod
metadata:
  name: logger
  namespace: logger-ns
  annotations:
    fluentbit.io/config: |
      [INPUT]
          Name      tail
          Tag       kube.*
          Path      /var/log/pods/logger-ns_logger*/busybox/*.log
          Parser    docker
          DB        /var/log/flb_kube.db
spec:
  containers:
   - name: busybox
     image: busybox
     command: [ "/bin/sh", "-c", "--" ]
     args: [ "while true; do sleep 1; echo `date` example file log; done;" ]

This would allow maximum customization, and it could even be enhanced by not requiring all this info: the log file path would always look like /var/log/pods/<namespace-name>_<pod-name>*/<container-name>/*.log, and we can get that info from the metadata. That would make the annotation closer to what Datadog does:

annotations:
    fluentbit.io/<container-name>.config: |
      [INPUT]
          Parser            docker

BibbyChung commented 1 year ago

> One solution is to use lua filter to drop records based on labels/annotations. An example that drops all records unless pods have process-logs="true" label:
>
> function drop_disabled_logs(tag, timestamp, record)
>   if record["kubernetes"]["labels"]["process-logs"] == "true" then
>     return 0, 0, 0
>   else
>     return -1, 0, 0
>   end
> end

I changed some of the code to make this solution work well. ^^|||

 return nil, nil, nil

gihif commented 1 year ago

Since this feature is not implemented yet, for now I do it like this:

...
    [INPUT]
      Name tail
      Path /var/log/containers/*.log
      multiline.parser docker, cri
      Tag kube.*
      Mem_Buf_Limit 5MB
      Skip_Long_Lines On
      Skip_Empty_Lines On

    [FILTER]
      Name kubernetes
      Match kube.*
      Merge_Log On
      Labels On
      Annotations Off
      Keep_Log Off
      K8S-Logging.Parser On
      K8S-Logging.Exclude On
      Buffer_Size 256KB

    [FILTER]
      Name    grep
      Match   kube.*
      regex   $kubernetes['labels']['logging'] enabled
...

So only pods that have

metadata:
  labels:
    logging: enabled

will have their logs sent to OUTPUT.
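
The effect of the final grep filter can be illustrated outside Fluent Bit. The Python sketch below is an approximation, not the filter's implementation: it assumes the regex rule keeps a record when the pattern is found anywhere in the value at $kubernetes['labels']['logging'], and drops records where that key is missing. keep_record is a hypothetical name.

```python
import re
from typing import Any, Mapping

# Pattern taken from the grep rule above; assumed to be an unanchored search.
PATTERN = re.compile(r"enabled")


def keep_record(record: Mapping[str, Any]) -> bool:
    """Approximate `regex $kubernetes['labels']['logging'] enabled`."""
    kubernetes = record.get("kubernetes") or {}
    labels = kubernetes.get("labels") or {}
    value = labels.get("logging")
    if not isinstance(value, str):
        # Assumption: a record without the label does not match and is dropped.
        return False
    return PATTERN.search(value) is not None
```

If the match is indeed unanchored, a value such as not-enabled would also pass, so a stricter pattern like ^enabled$ may be worth considering.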