I think hostname is also part of the kubernetes_metadata, so this may be related: https://github.com/banzaicloud/logging-operator/issues/1133
I believe this is already supported with https://kube-logging.github.io/docs/configuration/plugins/filters/enhance_k8s/ @peterbosalliandercom, could you please check?
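For reference, a minimal sketch of what using it in a Flow could look like, assuming the field names from the linked docs; the output name my-output is only a placeholder:

apiVersion: logging.banzaicloud.io/v1beta1
kind: Flow
metadata:
  name: flow-with-enhance-k8s
  namespace: default
spec:
  filters:
    # enhanceK8s enriches records with additional Kubernetes metadata
    - enhanceK8s: {}
  match:
    - select: {}
  localOutputRefs:
    - my-output   # placeholder output name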
Feel free to reopen if this is not a solution to your problem!
Sorry, it seems that the original plugin does not support namespace metadata, so I recommend opening an issue there, and then we can pull it into the fluentd image: https://github.com/SumoLogic/sumologic-kubernetes-fluentd/blob/main/fluent-plugin-enhance-k8s-metadata/lib/fluent/plugin/filter_enhance_k8s_metadata.rb
Even if FluentBit and enhance_k8s supported this, would it not still make sense to have this option to get the information from FluentD? FluentD is generally positioned as heavier but also more feature-rich than FluentBit. So what speaks against a feature that is already supported there?
It is not supported here: https://github.com/SumoLogic/sumologic-kubernetes-fluentd/blob/main/fluent-plugin-enhance-k8s-metadata/lib/fluent/plugin/filter_enhance_k8s_metadata.rb
but is supported here: https://github.com/SumoLogic/sumologic-kubernetes-fluentd/blob/main/fluent-plugin-kubernetes-metadata-filter/lib/fluent/plugin/kubernetes_metadata_common.rb#L53
In general, the issue with depending on Kubernetes metadata that is enhanced in the aggregator layer is that the pod might already be long gone when you try to get this information.
We can take a look and support the mentioned filter if that helps users in certain use cases, but I would strongly recommend adding the labels you need to the pod directly. I understand that this can be difficult to do, but in a multi-tenant cluster it should be beneficial to mutate pods (with the help of a policy engine) so that they get the right labels even if they weren't created with them.
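To illustrate the policy engine approach, here is a rough Kyverno sketch that copies a tenant label from the namespace onto every pod at admission time; the label key tenant and the 'unknown' fallback are placeholders, not part of any existing setup:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: propagate-tenant-label
spec:
  rules:
    - name: copy-tenant-label-to-pods
      match:
        any:
          - resources:
              kinds:
                - Pod
      context:
        # look up the namespace the pod is being admitted into
        - name: ns
          apiCall:
            urlPath: "/api/v1/namespaces/{{request.namespace}}"
      mutate:
        patchStrategicMerge:
          metadata:
            labels:
              # copy the placeholder tenant label from the namespace to the pod
              tenant: "{{ ns.metadata.labels.tenant || 'unknown' }}"

With a policy like this in place, the label ends up in the pod metadata that fluentbit already collects, so no aggregator-side enrichment is needed.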
My previous comment is not entirely accurate, since the plugin holds a cache of namespace objects, so it does not necessarily require a pod to exist. However, the information might still be inconsistent, since the namespace metadata might change between log generation and processing.
Anyway, I checked and we already pull that plugin into the image; we just don't specify a wrapper for it, which is something we can do.
Is the stale metadata issue also the case with the enhanceK8s plugin? I would think so. If not, namespace_labels could theoretically also be added there by reusing the code from kubernetes_metadata (which seems a lot easier than adding it to FluentBit's built-in Kubernetes filter, which was mentioned here: https://github.com/fluent/fluent-bit/issues/6544).
I'm open to adding anything that could help users here, even if I still think that doing it on the aggregator level is suboptimal.
If someone can show a working solution in fluentd lingo using any of the above SumoLogic plugins, I'm happy to help integrate it.
In the meantime, I'm thinking the logging operator could help with a mutating webhook. Users wouldn't need another tool; the operator could add specific labels to pods on the fly if those labels exist on the namespace. That would be the most reliable solution in my opinion.
There is a good chance this is going to be supported in fluentbit soon: https://github.com/fluent/fluent-bit/pull/8279
Namespace labels are already available, see: https://kube-logging.dev/docs/whats-new/#kubernetes-namespace-labels-and-annotations
The use case described above can already be solved by using the latest logging operator version (4.6+) together with fluentbit 3.x.
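As a rough sketch of what enabling this could look like on a FluentbitAgent resource; I'm not certain of the exact option names, so treat the filterKubernetes keys below as assumptions and check the what's-new page linked above for the authoritative spelling:

apiVersion: logging.banzaicloud.io/v1beta1
kind: FluentbitAgent
metadata:
  name: default
spec:
  filterKubernetes:
    # assumed option names for the Fluent Bit 3.x kubernetes filter
    namespace_labels: "On"
    namespace_annotations: "On"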
The use case can also be implemented with the multi-tenant architecture supported by logging operator version 4.5+, using isolated aggregators for the tenants via the LoggingRoute resource.
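For reference, a LoggingRoute sketch along the lines of the multi-tenancy docs; the source logging name ops and the tenant label selector are placeholders:

apiVersion: logging.banzaicloud.io/v1beta1
kind: LoggingRoute
metadata:
  name: tenants
spec:
  # route logs collected by the "ops" logging (placeholder name)
  source: ops
  # to every tenant Logging that carries a tenant label (placeholder selector)
  targets:
    matchExpressions:
      - key: tenant
        operator: Exists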
I'm closing this now.
What is the problem: We are using the logging-operator with fluentbit/fluentd for our log collection. The output flows through fluentd to Logstash to ES. In ES we find that we only get a subset of the Kubernetes metadata in the index. We are using kubernetes.namespace_labels to identify which tenant the logs come from. We use the following label, which is placed by the Capsule operator (https://capsule.clastix.io/), to identify the tenant: capsule.clastix.io/tenant: tenantxxx. In Logstash we want to split indexes per tenant by using this namespace label. The problem is that the logging-operator does not send namespace_labels by default, and this cannot be configured using enhanceK8s.
What would we like: We would like a fluentd filter option (in enhanceK8s?) to add the kubernetes_metadata in the logging-operator itself (overriding the fluentbit metadata). The filter option would look like this:
<filter **>
  @type kubernetes_metadata
</filter>
Background: We are using the BanzaiCloud logging-operator (https://banzaicloud.com/docs/one-eye/logging-operator), so the configuration is fixed and defaults to using fluentbit for Kubernetes metadata. We cannot change that, so there is no way to fix this without breaking the operator functionality (related issue: https://github.com/banzaicloud/logging-operator/issues/704). The only way would be to override everything the operator does (RBAC and the fluentd config), but then the whole point of using the operator is lost and it cannot be maintained properly.