fabric8io / fluent-plugin-kubernetes_metadata_filter

Enrich your fluentd events with Kubernetes metadata
Apache License 2.0

Elasticsearch k8s labels issue #378

Closed danieltaub96 closed 7 months ago

danieltaub96 commented 8 months ago

After upgrading to the latest fluentd chart, I started getting an error on the labels the plugin produces:

2023-12-30 09:13:35 +0000 [warn]: #0 fluent/log.rb:383:warn: dump an error event: error_class=Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch [error type]: mapper_parsing_exception [reason]: 'object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value'"

Is there an option to add a dedot flag to this plugin?

jcantrill commented 7 months ago

ref #370

xdubois commented 7 months ago

Hello @jcantrill,

How would you implement the extra configuration on this popular Docker image?

https://github.com/fluent/fluentd-kubernetes-daemonset/tree/master/docker-image/v1.16/debian-elasticsearch8/conf

That would unblock a lot of people, as many projects still use the label "app" for their application name on top of the standard app.kubernetes.io/name label.

I've been trying some workarounds without success, including the one mentioned in https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/issues/370

Thank you

jcantrill commented 7 months ago

Exactly as identified in the other issue:

  # Rewrite '.' and '/' in Kubernetes label keys to '_' so Elasticsearch
  # does not interpret them as object paths. The _dummy*_ keys exist only
  # to run the embedded Ruby and are removed afterwards.
  <filter **>
    @type record_modifier
    <record>
      _dummy_ ${if m=record.dig("kubernetes","namespace_labels");record["kubernetes"]["namespace_labels"]={}.tap{|n|m.each{|k,v|n[k.gsub(/[.\/]/,'_')]=v}};end}
      _dummy2_ ${if m=record.dig("kubernetes","labels");record["kubernetes"]["labels"]={}.tap{|n|m.each{|k,v|n[k.gsub(/[.\/]/,'_')]=v}};end}
      _dummy3_ ${if m=record.dig("kubernetes","flat_labels");record["kubernetes"]["flat_labels"]=[].tap{|n|m.each_with_index{|s, i|n[i] = s.gsub(/[.\/]/,'_')}};end}
    </record>
    remove_keys _dummy_, _dummy2_, _dummy3_
  </filter>

The problem is a 'first one wins' issue: in one case the type should be an object, and in the other it should be a scalar. Elasticsearch does dynamic mapping, so it does not know how to treat these conflicting cases. Additionally, unless they have fixed it on their side, Elasticsearch treats dot-delimited keys as a JSON path. This is at least the case for v6 and prior. I seem to recall they provided a config or bug fix to handle that case.
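To make the conflict concrete, here is a minimal Ruby sketch. It is not Elasticsearch's actual mapping code: `expand_dotted` and `shape` are hypothetical helpers that only mimic what happens when a dotted key is read as a path. One pod's labels make `app` a plain string, while another pod's `app.kubernetes.io/name` label makes `app` an object.

```ruby
# Hypothetical helper: expand "a.b.c" => value into nested hashes,
# the way a path-style reading of a dotted key would.
def expand_dotted(key, value)
  key.split('.').reverse.reduce(value) { |inner, part| { part => inner } }
end

# Hypothetical helper: the kind of type a dynamic mapper would record.
def shape(value)
  value.is_a?(Hash) ? :object : :scalar
end

# Two pods with the two common labeling styles (example values).
pod_a_labels = { 'app' => 'web' }
pod_b_labels = { 'app.kubernetes.io/name' => 'web' }

expanded_a = pod_a_labels.map { |k, v| expand_dotted(k, v) }.reduce({}, :merge)
expanded_b = pod_b_labels.map { |k, v| expand_dotted(k, v) }.reduce({}, :merge)

shape_a = shape(expanded_a['app']) # "app" is a plain string here
shape_b = shape(expanded_b['app']) # "app" is a nested object here

# Whichever document is indexed first fixes the type of "app";
# the other one is then rejected with a mapper_parsing_exception.
conflict = shape_a != shape_b
puts "app as #{shape_a} vs app as #{shape_b}, conflict: #{conflict}"
```

Which of the two documents gets rejected depends only on indexing order, which is why the error can appear suddenly after an upgrade or a new deployment.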

We solve the issue by replacing these "path" bits with underscores, which has proven to be the safer option. Note this is also what other observability tools have done (e.g. Prometheus, Loki), likely for the same reasons. It's easier to tell someone to "convert your label to underscores when you query" than it is to try to do the escaping.
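For reference, the same key rewrite as standalone Ruby, which may be easier to read than the one-liners in the filter. This is only a sketch: `dedot_key` and `dedot_labels` are hypothetical names, not part of this plugin, and the regular expression is the one used in the filter above.

```ruby
# Hypothetical helper: replace '.' and '/' in a label key with '_'.
def dedot_key(key)
  key.gsub(/[.\/]/, '_')
end

# Hypothetical helper: apply dedot_key to every key of a labels hash.
def dedot_labels(labels)
  labels.each_with_object({}) { |(k, v), out| out[dedot_key(k)] = v }
end

labels = { 'app' => 'web', 'app.kubernetes.io/name' => 'web' }
flat = dedot_labels(labels)

# The same conversion applies on the query side: look up the
# underscored field name instead of escaping dots and slashes.
query_field = "kubernetes.labels.#{dedot_key('app.kubernetes.io/name')}"

puts flat.inspect
puts query_field
```

One caveat of this approach: two distinct keys that differ only in these characters (say `a.b` and `a_b`) would end up as the same field, with the later one overwriting the earlier.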