fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0

fluentd multiplexed outputer #242

Open kfox1111 opened 7 years ago

kfox1111 commented 7 years ago

For K8s, could we come up with a filter/output plugin that forwards traffic to the Fluentd located in the namespace of the pod whose logs are being collected?

And as a bonus, could the k8s service name be overridden via a pod attribute?

That would allow the k8s cluster admin to set up one Fluent Bit daemonset for everyone, while each project sets up its own filtering rules in its own Fluentd(s) running under its own control.

edsiper commented 7 years ago

If I understand correctly, you want to point each Fluent Bit to a Fluentd named after the namespace, like fluentd-kube-system. If so, I think it is easier to configure this in your Yaml file using the Downward API to obtain your namespace:

https://kubernetes.io/docs/user-guide/downward-api/
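For reference, the Downward API wiring in the pod spec would look roughly like this (a sketch; MY_NAMESPACE is just the variable name used in the config below):

    # Fragment of the Fluent Bit daemonset pod spec: expose the pod's
    # namespace as the MY_NAMESPACE environment variable.
    env:
      - name: MY_NAMESPACE
        valueFrom:
          fieldRef:
            fieldPath: metadata.namespace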

So, having the namespace in an environment variable, you can then use it in Fluent Bit, e.g.:

[OUTPUT]
    Name  es
    Match *
    Host  fluentd-${MY_NAMESPACE}

If that is not what you mean, please let me know.

kfox1111 commented 7 years ago

No, that's not what I meant. I'm thinking of having one Fluent Bit per system,

but being able to send the logs to different in-cluster Fluentds based on the namespace of the pod whose logs are being collected.

For example, if you have two namespaces, A and B:

A can launch a Fluentd, and B can launch a Fluentd.

The admin launches a Fluent Bit daemonset across all nodes.

In this mode, all traffic out of a pod in A gets sent to A's Fluentd, and all traffic out of a pod in B gets sent to B's Fluentd.

edsiper commented 7 years ago

Now I get it.

Since the namespace "already" exists in the file name and in the system, the goal would be to implement a way in filter_kubernetes to "add" this namespace name to the tag, so it can then be routed using a specific match pattern.

I will think a bit more about how to implement it.
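For readers landing here later: this tag-rewriting idea can be sketched with the rewrite_tag filter that shipped in later Fluent Bit releases (v1.1+). The namespace, host names, and ports below are illustrative, not from this thread:

    # Sketch: prefix each record's tag with its pod namespace, then
    # route per-namespace with match patterns.
    [FILTER]
        Name   rewrite_tag
        Match  kube.*
        # $kubernetes['namespace_name'] comes from filter_kubernetes;
        # kube.var.log... becomes ns.<namespace>.kube.var.log...
        Rule   $kubernetes['namespace_name'] ^(.+)$ ns.$1.$TAG false

    [OUTPUT]
        Name   forward
        Match  ns.team-a.*
        Host   fluentd.team-a.svc
        Port   24224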

edsiper commented 6 years ago

@kfox1111 just following up: is your goal to have a Fluentd aggregator per namespace, so that the implementation proposal above makes routing easier?

mvladev commented 6 years ago

@edsiper I think he was trying to have dynamic routing based on the filtered data. Something like this:

# I use ES as an example, but the same principle applies to any output
[OUTPUT]
    Name            es
    Match           kube*
    Host            elasticsearch.${KUBE_NAMESPACE_COMING_FROM_THE_TAG}.svc
    Port            8080
    Logstash_Format On
    Retry_Limit     False

This KUBE_NAMESPACE_COMING_FROM_THE_TAG does not come from environment variables, but from the data/tag, and it is evaluated dynamically.

If the log is from namespace A, then it should go to elasticsearch.A.svc. If the log is from namespace B, then it should go to elasticsearch.B.svc.

mvladev commented 6 years ago

To be more specific if we have an event:

 Tag   = kube.var.log.containers.etcd-monotop_kube-system_etcd-3f1ed117c42aaa559199f97fe5149913349cd9cab35cd64b6aa073e2dd6ac23d.log:
 Event = [1487286363, {"log"=>"2017-02-07 19:56:54.983750 I | etcdmain: etcd Version: 3.0.14\n",
                       "stream"=>"stderr",
                       "time"=>"2017-02-07T19:56:54.984354129Z",
                       "kubernetes" => {"pod_name"=>"etcd-monotop",
                                        "namespace_name"=>"kube-system",
                                        "container_name"=>"etcd",
                                        "docker_id"=>"3f1ed117c42aaa559199f97fe5149913349cd9cab35cd64b6aa073e2dd6ac23d"}
                      }
         ]

We want to use the value from Event => kubernetes => namespace_name in the output's routing host.
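As a stopgap, note that the tail input's tag already embeds the namespace in the container log file name (the _kube-system_ segment in the tag above). Assuming Fluent Bit's Match treats '*' as a general glob, per-namespace routing could be sketched without any new feature (hosts/ports and namespace names here are illustrative):

    # Route on the namespace segment embedded in the tag by the log
    # file name: kube.var.log.containers.<pod>_<namespace>_<container>...
    [OUTPUT]
        Name   forward
        Match  kube.*_team-a_*
        Host   fluentd.team-a.svc
        Port   24224

    [OUTPUT]
        Name   forward
        Match  kube.*_team-b_*
        Host   fluentd.team-b.svc
        Port   24224

This requires one [OUTPUT] stanza per namespace, which is exactly the static configuration the dynamic-host proposal above would avoid.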

kfox1111 commented 6 years ago

Yes, that's what I'm interested in.

The person maintaining the namespaced set of services (tenant user) and the person maintaining the Fluent Bit daemonset/config (k8s admin) could be on completely different teams with different rights. Letting the logs flow to processes maintained at the tenant level would allow users to self-service the processing/storage of their own aggregated logs without involving the k8s admin.

kfox1111 commented 6 years ago

Any update on this?

hwasp commented 5 years ago

I would love to have something like this. Any update?

kfox1111 commented 5 years ago

Any updates?

zeph commented 5 years ago

:+1:

dabr-grapeup commented 5 years ago

Any update or a suggestion how to implement this behaviour with latest version?

edsiper commented 1 month ago

Hi folks, coming back to this, trying to understand if this is still something highly desired. Thanks.