kfox1111 opened 7 years ago
If I understand correctly, you want to point each Fluent Bit to a Fluentd named after the namespace, like fluentd-kube-system. If so, I think it is easier to configure this in your YAML file using the Downward API to obtain your namespace:
https://kubernetes.io/docs/user-guide/downward-api/
So, having the namespace in an environment variable, you can then use it in Fluent Bit, e.g.:
[OUTPUT]
    Name  es
    Match *
    Host  fluentd-${MY_NAMESPACE}
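For reference, the Downward API part of that would look something like this in the Pod spec (the variable name MY_NAMESPACE is just an example):

    env:
      - name: MY_NAMESPACE
        valueFrom:
          fieldRef:
            fieldPath: metadata.namespace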
If that is not what you mean, please let me know.
No, that's not what I meant. I'm thinking: have one Fluent Bit per node,
but be able to send the logs to different in-cluster Fluentds based on the namespace of the pod whose logs are being collected.
Like, if you have two namespaces, A and B:
A can launch a fluentd, and B can launch a fluentd.
The admin launches a fluent-bit DaemonSet across all nodes.
In this mode, all traffic out of a pod in A gets sent to A's fluentd, and traffic from pods in B to B's fluentd.
Now I get it.
Since the namespace already exists in the file name and in the system, the goal would be to implement a way in filter_kubernetes to add the namespace name to the tag, so it can then be routed using a specific Match pattern.
I will think a bit more about how to implement it.
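For illustration only, assuming such a feature prefixed tags with the namespace (a hypothetical tag layout, not an existing option), routing could then be done with static per-namespace outputs:

    [OUTPUT]
        Name  forward
        Match kube.A.*
        Host  fluentd.A.svc

    [OUTPUT]
        Name  forward
        Match kube.B.*
        Host  fluentd.B.svc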
@kfox1111 just doing follow up, is your goal to have a Fluentd aggregator per namespace so the above implementation proposal makes routing easier ?
@edsiper I think he was trying to have dynamic routing based on the filtered data. Something like this:
# I use ES as an example, but the same principle applies to any output
[OUTPUT]
    Name            es
    Match           kube*
    Host            elasticsearch.${KUBE_NAMESPACE_COMING_FROM_THE_TAG}.svc
    Port            8080
    Logstash_Format On
    Retry_Limit     False
The KUBE_NAMESPACE_COMING_FROM_THE_TAG value is not coming from environment variables, but from the data/tag, and it is dynamically evaluated.
If the log is from namespace A
then it should go to elasticsearch.A.svc
If the log is from namespace B
then it should go to elasticsearch.B.svc
To be more specific if we have an event:
Tag = kube.var.log.containers.etcd-monotop_kube-system_etcd-3f1ed117c42aaa559199f97fe5149913349cd9cab35cd64b6aa073e2dd6ac23d.log:
Event = [1487286363, {"log"=>"2017-02-07 19:56:54.983750 I | etcdmain: etcd Version: 3.0.14\n",
"stream"=>"stderr",
"time"=>"2017-02-07T19:56:54.984354129Z",
"kubernetes" => {"pod_name"=>"etcd-monotop",
"namespace_name"=>"kube-system",
"container_name"=>"etcd",
"docker_id"=>"3f1ed117c42aaa559199f97fe5149913349cd9cab35cd64b6aa073e2dd6ac23d"}
}
]
We want to use the value from Event => kubernetes => namespace_name in the output's routing host.
Yes, that's what I'm interested in.
The person maintaining the namespaced set of services (tenant user) and the person maintaining the fluent-bit daemon/config (k8s admin) could be on completely different teams with different rights. Letting the logs flow to processes maintained at the tenant level would allow users to self-service the processing/storage of their own aggregated logs without involving the k8s admin.
Any update on this?
I would love to have something like this. Any update?
Any updates?
:+1:
Any update or a suggestion how to implement this behaviour with latest version?
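Not a definitive answer, but in recent Fluent Bit releases the rewrite_tag filter can embed the namespace from the record into the tag, which then allows static per-namespace outputs. A minimal sketch (the forward output and service name are examples, not part of the original proposal):

    [FILTER]
        Name  kubernetes
        Match kube.*

    [FILTER]
        Name  rewrite_tag
        Match kube.*
        Rule  $kubernetes['namespace_name'] ^(.+)$ ns.$1 false

    [OUTPUT]
        Name  forward
        Match ns.kube-system
        Host  fluentd.kube-system.svc

This still requires one [OUTPUT] block per namespace; the Host value itself cannot be evaluated from record data, which is the gap discussed in this issue.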
Hi folks, coming back to this: trying to understand if this is still something highly desired. Thanks.
For K8s, could we come up with a filter/output plugin that forwards traffic to the fluentd located in the namespace of the pod whose logs are being collected?
And as a bonus, could the k8s service name be overridden via a pod attribute?
That would allow the k8s cluster admin to set up one fluent-bit DaemonSet for everyone, while each project can set up its own filtering rules in its own fluentd(s) running under its own control.