fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0
5.86k stars 1.58k forks source link

spurious field mapping errors in ES output #1134

Closed cypherfox closed 2 years ago

cypherfox commented 5 years ago

Bug Report

Describe the bug

I see the error messages shown below in the log of my fluent-bit pods. according to kubectl get pods -l app --all-namespaces --show-labels we only use values for the applabel that match /[:alpha:][-:alphanum:]*[:alpha]/

I have no idea where the app object is coming from.

[2019/02/25 09:31:35] [ warn] [out_es] Elasticsearch error
{"took":2,"errors":true,"items":[{"index":{"_index":"kubernetes-2019.02.25","_type":"flb_type","_id":"nt_-I2kBNo3QnVfwzT9R","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}},{"index":{"_index":"kubernetes-2019.02.25","_type":"flb_type","_id":"n9_-I2kBNo3QnVfwzT9R","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}},{"index":{"_index":"kubernetes-2019.02.25","_type":"flb_type","_id":"oN_-I2kBNo3QnVfwzT9R","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}},{"index":{"_index":"kubernetes-2019.02.25","_type":"flb_type","_id":"od_-I2kBNo3QnVfwzT9R","status":400,"error":{"

Expected behavior all fields of the app label to be unstructured strings.

Your Environment

my config map is:

apiVersion: v1
kind: ConfigMap
metadata:
  name: logging-fluent-bit-config
  labels:
    app: fluent-bit
    chart: fluent-bit-1.8.0
    heritage: Tiller
    release: logging
data:
  fluent-bit-service.conf: |-
    [SERVICE]
        Flush        1
        Daemon       Off
        Log_Level    info
        Parsers_File parsers.conf
        Parsers_File parsers_custom.conf
        HTTP_Server  On
        HTTP_Listen  0.0.0.0
        HTTP_Port    2020

  fluent-bit-input.conf: |-
    [INPUT]
        Name             tail
        Path             /var/log/containers/*.log
        Parser           docker
        Tag              kube.*
        Refresh_Interval 5
        Mem_Buf_Limit    5MB
        Skip_Long_Lines  On
        DB               /tail-db/tail-containers-state.db
        DB.Sync          Normal
    [INPUT]
        Name            systemd
        Tag             systemd.*
        Systemd_Filter  _SYSTEMD_UNIT="docker.service"
        Systemd_Filter  _SYSTEMD_UNIT="kubelet.service"
        Systemd_Filter  _SYSTEMD_UNIT="etcd-member.service"
        Systemd_Filter  _SYSTEMD_UNIT="etcd-backup.service"
        Systemd_Filter  _SYSTEMD_UNIT="consul.service"
        Systemd_Filter  _SYSTEMD_UNIT="consul-backup.service"
        Systemd_Filter  _SYSTEMD_UNIT="vault.service"
        Max_Entries     1000
        Read_From_Tail  true
    [INPUT]
        Name            syslog
        Mode            udp
        Port            5140
        Tag             neuvector.*
        Parser          syslog-neuvector
        Chunk_Size      32
        Buffer_Size     64
    [INPUT]
        Name            systemd
        Tag             audit.system.*
        Systemd_Filter  _TRANSPORT=audit
        DB               /tail-db/tail-audit-system-state.db
        DB.Sync          Normal
    [INPUT]
        Name            systemd
        Tag             audit.vault.*
        Systemd_Filter  SYSLOG_IDENTIFIER=vault-audit
        DB               /tail-db/tail-audit-vault-state.db
        DB.Sync          Normal
    [INPUT]
        Name             tail
        Path             /var/log/audit/apiserver/apiserver.log
        Tag              audit.apiserver.*
        Parser           parse-audit-apiserver
        Refresh_Interval 5
        Mem_Buf_Limit    5MB
        Skip_Long_Lines  On
        DB               /tail-db/tail-audit-apiserver-state.db
        DB.Sync          Normal

  fluent-bit-filter.conf: |-
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        Merge_Log           On
    # suppres structured app labels
    [FILTER]
        Name             record_modifier
        Match            kube.*

    # this section is wrongly indented by helm chart

  fluent-bit-output.conf: |-

    [OUTPUT]
        Name  es-kubernetes
        Match kube.*
        Host  logging-elasticsearch-client
        Port  9200
        Logstash_Format On
        Retry_Limit False
        Type  flb_type
        Logstash_Prefix kubernetes
    [OUTPUT]
        Name  es-systemd
        Match systemd.*
        Host  logging-elasticsearch-client
        Port  9200
        Logstash_Format On
        Retry_Limit False
        Type  flb_type
        Logstash_Prefix systemd
    [OUTPUT]
        Name  es-autit-system
        Match audit.system.*
        Host  logging-elasticsearch-client
        Port  9200
        Logstash_Format On
        Retry_Limit False
        Type  flb_type
        Logstash_Prefix audit-system
    [OUTPUT]
        Name  es-audit-vault
        Match audit.vault.*
        Host  logging-elasticsearch-client
        Port  9200
        Logstash_Format On
        Retry_Limit False
        Type  flb_type
        Logstash_Prefix audit-vault
    [OUTPUT]
        Name  es-audit-apiserver
        Match audit.apiserver.*
        Host  logging-elasticsearch-client
        Port  9200
        Logstash_Format On
        Retry_Limit False
        Type  flb_type
        Logstash_Prefix audit-apiserver
    [OUTPUT]
        Name  es-neuvector
        Match neuvector.*
        Host  logging-elasticsearch-client
        Port  9200
        Logstash_Format On
        Retry_Limit False
        Type  flb_type
        Logstash_Prefix neuvector

  fluent-bit.conf: |-
    @INCLUDE fluent-bit-service.conf
    @INCLUDE fluent-bit-input.conf
    @INCLUDE fluent-bit-filter.conf
    @INCLUDE fluent-bit-output.conf

  parsers.conf: |-
    [PARSER]
        Name        syslog-neuvector
        Format      regex
        Regex       /^\<(?<priority>[0-9]+)\>(?<time>[^ Z]*)Z (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?\:? *(?<full_message>(?:(?:notification=(?<notification>[^,]*))|(?:name=(?<name>[^,]*))|(?:level=(?<level>[^,]*))|(?:reported_timestamp=(?<reported_timestamp>[^,]*))|(?:reported_at=(?<reported_at>[^,]*))|(?:cluster_name=(?<cluster_name>[^,]*))|(?:host_id=(?<host_id>[^,]*))|(?:host_name=(?<host_name>[^,]*))|(?:controller_id=(?<controller_id>[^,]*))|(?:controller_name=(?<controller_name>[^,]*))|(?:enforcer_id=(?<enforcer_id>[^,]*))|(?:enforcer_name=(?<enforcer_name>[^,]*))|(?:workload_id=(?<workload_id>[^,]*))|(?:workload_name=(?<workload_name>[^,]*))|(?:workload_domain=(?<workload_domain>[^,]*))|(?:workload_image=(?<workload_image>[^,]*))|(?:category=(?<category>[^,]*))|(?:user=(?<user>[^,]*))|(?:user_roles=(?<user_roles>[^,]*))|(?:user_addr=(?<user_addr>[^,]*))|(?:user_session=(?<user_session>[^,]*))|(?:message=(?<message>[^,]*))|[^,=]+=[^,=]*|,+)*)$/
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S

    [PARSER]
        Name        parse-audit-apiserver
        Format      json
        Time_Key    time
        Time_Format %Y-%m-%dT%H:%M:%S %z
        Decode_Field json log
cypherfox commented 5 years ago

Additionally: there are no log message from the fluent-bit pods themselves. Is this intentional? All other pods seem to work.

pichouk commented 5 years ago

You can find an explanation here. Beats solved it by replacing dots in labels with _.

andreykaipov commented 5 years ago

We solved this by adding a Lua filter after the Kubernetes filter:

[FILTER]
    Name    lua
    Match   kube.*
    script  /fluent-bit/etc/functions.lua
    call    dedot
function dedot(tag, timestamp, record)
    if record["kubernetes"] == nil then
        return 0, 0, 0
    end
    dedot_keys(record["kubernetes"]["annotations"])
    dedot_keys(record["kubernetes"]["labels"])
    return 1, timestamp, record
end

function dedot_keys(map)
    if map == nil then
        return
    end
    for k, v in pairs(map) do
        dedotted = string.gsub(k, "%.", "_")
        if k ~= dedotted then
            map[dedotted] = v
            map[k] = nil
        end
    end
end

It would be cool to have this support natively though.

simonwh commented 5 years ago

@andreykaipov Thanks for providing your script! When I use it, paris(map) fails becausemapis nil indedot_keys. Did you apply this filter before or after thekubernetes` filter? (I did after)

andreykaipov commented 5 years ago

Hi @simonwh - we actually ran into it that yesterday too. It was working okay until a pod with no annotations or labels started running, which is rare enough I didn't think it was possible!

I've edited my original post to add a nil check to dedot_keys to return early if the map is nil so that Fluent Bit doesn't crash on those kinds of records. That got it working for us again.

edsiper commented 5 years ago

@andreykaipov how can I reproduce the crash ?

andreykaipov commented 5 years ago

Hi @edsiper, the crash I was referring to isn't anything wrong with Fluent Bit. It was just my original Lua script not accounting for pods with an empty annotations or labels map.

PANIC: unprotected error in call to Lua API (/fluent-bit/etc/functions.lua:19: bad argument #1 to 'pairs' (table expected, got nil))
gmlexx commented 5 years ago

Hi @andreykaipov I believe that your dedot function works fine, but for some reason my json annotations just partially transformed

{
    "kubernetes": {
        "annotations": {
            "cluster-autoscaler_kubernetes_io_safe-to-evict": "true",
            "kubernetes.io/limit-ranger": "LimitRanger plugin set: cpu request for init container assets"
        },
        "container_name": "saml",
        "docker_id": "1aefe0b3ef8306dfaa411a7c5188c51679be651d68b313fa3635c0a63ce701eb",
        "host": "ip-172-16-40-62.ec2.internal",
        "labels": {
            "app": "login",
            "pod-template-hash": "74c68f9d5",
            "release": "test-login"
        },
        "namespace_name": "default",
        "pod_id": "eb746a32-ecd0-11e9-8d9a-0ed429943b82",
        "pod_name": "test-login-74c68f9d5-lmfnq"
    },
    "log": "172.16.84.128 - - [24/Oct/2019 12:54:47] \"GET /test/ HTTP/1.1\" 401 -\n",
    "stream": "stderr",
    "time": "2019-10-24T12:54:47.10424059Z"
}

I've tested function locally and it works properly. In fluent-bit configuration there are only 2 filters:

[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_Tag_Prefix     kube.var.log.containers.
    Kube_URL            https://kubernetes.default.svc:443
    Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
    Merge_Log           On
    K8S-Logging.Parser  On
    K8S-Logging.Exclude On

[FILTER]
    Name    lua
    Match   kube.*
    script  /fluent-bit/etc/functions.lua
    call    dedot

Do you have any clue, why that may happen?

gmlexx commented 5 years ago

It seems that changing map during iteration is a bad idea

mtparet commented 4 years ago

Run into the same issue, the lua function does not seems to be working, i'm investigating.

gmlexx commented 4 years ago

@mtparet I rewrote it into this

function dedot(tag, timestamp, record)
    if record["kubernetes"] == nil then
        return 0, 0, 0
    end
    dedot_keys(record["kubernetes"]["annotations"])
    dedot_keys(record["kubernetes"]["labels"])
    return 1, timestamp, record
end

function dedot_keys(map)
    if map == nil then
        return
    end
    local new_map = {}
    local changed_keys = {}
    for k, v in pairs(map) do
        local deslashed = string.gsub(k, "%/", "_")
        local dedotted = string.gsub(deslashed, "%.", "_")
        if dedotted ~= k then
            new_map[dedotted] = v
            changed_keys[k] = true
        end
    end
    for k in pairs(changed_keys) do
        map[k] = nil
    end
    for k, v in pairs(new_map) do
        map[k] = v
    end
end
BarthV commented 4 years ago

:+1: please include a native dedot option in fluent-bit (and not only in ES plugin)

den-is commented 4 years ago

+1 just encountered the same issue. Whole logging is broken now. That has started when I've tried to enable "recommended" kubernetes labels. 1 2 Extremely appreciate guys who have developed and shared their scripts!!! Thanks!!!

nullck commented 4 years ago

@mtparet I rewrote it into this

function dedot(tag, timestamp, record)
    if record["kubernetes"] == nil then
        return 0, 0, 0
    end
    dedot_keys(record["kubernetes"]["annotations"])
    dedot_keys(record["kubernetes"]["labels"])
    return 1, timestamp, record
end

function dedot_keys(map)
    if map == nil then
        return
    end
    local new_map = {}
    local changed_keys = {}
    for k, v in pairs(map) do
        local deslashed = string.gsub(k, "%/", "_")
        local dedotted = string.gsub(deslashed, "%.", "_")
        if dedotted ~= k then
            new_map[dedotted] = v
            changed_keys[k] = true
        end
    end
    for k in pairs(changed_keys) do
        map[k] = nil
    end
    for k, v in pairs(new_map) do
        map[k] = v
    end
end

@gmlexx Thanks a lot! worked here.

@edsiper this lua script that @gmlexx / @andreykaipov made is very useful, could we create a kind of template and add in the code examples? https://github.com/fluent/fluent-bit/tree/master/scripts

edsiper commented 4 years ago

for now, we can add that script into our scripts directory, anyone wants to submit the PR?

tirelibirefe commented 4 years ago

is there a workaround for this?

The filter cannot be used as described above as Fluent-bit is installed via Helm. Containers look for scripts inside, of course there is no script.

[FILTER]
    Name    lua
    Match   kube.*
    script  /fluent-bit/etc/functions.lua
    call    dedot

I will be very appreciated if you can advise how to make it run on Kubernetes environment.

BarthV commented 4 years ago

@tirelibirefe

The filter cannot be used as described above ...

k8s questions are a bit out of scope here. And it seems that you just don't know how to do it ;)

You can simply mount a configMap with lua script and fluent-bit config. and load it in rawConfig chart value.

There's few details from my own config here : https://gist.github.com/BarthV/17521ff8cc5c5b3b704c9e2f491c7e60

BarthV commented 4 years ago

But, we're still waiting to see dedot option to be included upstream :'(

tirelibirefe commented 4 years ago

k8s questions are a bit out of scope here. And it seems that you just don't know how to do it ;)

@BarthV I cannot say you're wrong ;) You helped me, I'm very appreciated. Thank you!

mikelog commented 3 years ago

@mtparet I rewrote it into this

function dedot(tag, timestamp, record)
    if record["kubernetes"] == nil then
        return 0, 0, 0
    end
    dedot_keys(record["kubernetes"]["annotations"])
    dedot_keys(record["kubernetes"]["labels"])
    return 1, timestamp, record
end

function dedot_keys(map)
    if map == nil then
        return
    end
    local new_map = {}
    local changed_keys = {}
    for k, v in pairs(map) do
        local deslashed = string.gsub(k, "%/", "_")
        local dedotted = string.gsub(deslashed, "%.", "_")
        if dedotted ~= k then
            new_map[dedotted] = v
            changed_keys[k] = true
        end
    end
    for k in pairs(changed_keys) do
        map[k] = nil
    end
    for k, v in pairs(new_map) do
        map[k] = v
    end
end

thanks, worked!

zerthimon commented 3 years ago

The lua script above doesn't work for me :( Perhaps I'm doing something wrong ?

    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc.cluster.local:443
        Merge_Log           On
        Merge_Log_Key       data
        K8S-Logging.Parser  On
        K8S-Logging.Exclude On

    [FILTER]
        Name    lua
        Match   kube.*
        script  /fluent-bit/lua/dedot.lua
        call    dedot
[2021/02/12 16:48:35] [error] [output:es:es.0] error: Output
{"took":7,"errors":true,"items":[{"index":{"_index":"fluent-bit","_type":"_doc","_id":"vpEkl3cB4ZJelzut7H1Q","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}},{"index":{"_index":"fluent-bit","_type":"_doc","_id":"v5Ekl3cB4ZJelzut7H1Q","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}},{"index":{"_index":"fluent-bit","_type":"_doc","_id":"wJEkl3cB4ZJelzut7H1Q","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}},{"index":{"_index":"fluent-bit","_type":"_doc","_id":"wZEkl3cB4ZJelzut7H1Q","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}},{"index":{"_index":"fluent-bit","_type":"_doc","_id":"wpEkl3cB4ZJelzut7H1Q","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}},{"index":{"_index":"fluent-bit","_type":"_doc","_id":"w5Ekl3cB4ZJelzut7H1Q","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}},{"index":{"_index":"fluent-bit","_type":"_doc","_id":"xJEkl3cB4ZJelzut7H1Q","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}},{"index":{"_index":"fluent-bit","_type":"_doc","_id":"xZEkl3cB4ZJelzut7H1Q","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}},{"index":{"_index":"fluent-bit","_type":"_doc","_id":"xpEkl3cB4ZJelzut7H1Q","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}},{"index":{"_index":"fluent-bit","_type":"_doc","_id":"x5Ekl3cB4ZJelzut7H1Q","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}},{"index":{"_index":"fluent-bit","_type":"_doc","_id":"yJEkl3cB4ZJelzut7H1Q","status":400,"error":{"type":"mapper_parsing_exception","reason":"object mapping for [kubernetes.labels.app] tried to parse field [app] as object, but found a concrete value"}}}]}
[2021/02/12 16:48:35] [ warn] [engine] failed to flush chunk '1-1613148515.119762423.flb', retry in 6 seconds: task_id=9, input=tail.0 > output=es.0
zerthimon commented 3 years ago

FluentBit has an output ES config setting "Replace_Dots", will that solve the problem ? https://docs.fluentbit.io/manual/pipeline/outputs/elasticsearch Replace_Dots When enabled, replace field name dots with underscore, required by Elasticsearch 2.0-2.3.

zerthimon commented 3 years ago

Fluent Bit v1.6.10 with Replace_Dots in ES output solved the problem for me without using Lua.

zystem commented 3 years ago

What to do users which use s3 or Kafka as transport? Can anybody add Replace_Dots as a filter?

github-actions[bot] commented 2 years ago

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days. Maintainers can add the exempt-stale label.

github-actions[bot] commented 2 years ago

This issue was closed because it has been stalled for 5 days with no activity.

squaricdot commented 1 year ago

@mtparet I rewrote it into this

function dedot(tag, timestamp, record)
    if record["kubernetes"] == nil then
        return 0, 0, 0
    end
    dedot_keys(record["kubernetes"]["annotations"])
    dedot_keys(record["kubernetes"]["labels"])
    return 1, timestamp, record
end

function dedot_keys(map)
    if map == nil then
        return
    end
    local new_map = {}
    local changed_keys = {}
    for k, v in pairs(map) do
        local deslashed = string.gsub(k, "%/", "_")
        local dedotted = string.gsub(deslashed, "%.", "_")
        if dedotted ~= k then
            new_map[dedotted] = v
            changed_keys[k] = true
        end
    end
    for k in pairs(changed_keys) do
        map[k] = nil
    end
    for k, v in pairs(new_map) do
        map[k] = v
    end
end

@gmlexx Thanks a lot! worked here.

@edsiper this lua script that @gmlexx / @andreykaipov made is very useful, could we create a kind of template and add in the code examples? https://github.com/fluent/fluent-bit/tree/master/scripts

I used this too. slightly adjusted:

      function dedot(tag, timestamp, record)
          check = 0
          if not (record["k8s_annotations"] == nil) then
              check = 1
              dedot_keys(record["k8s_annotations"])
          end
          if not (record["k8s_labels"] == nil) then
              check = 1
              dedot_keys(record["k8s_labels"])
          end
          if check == 1 then
              return 1, timestamp, record
          else
              return 0, 0, 0
          end
      end

Labels are sometimes not parsed and it's pretty baffling me.

I use k8s instead of kubernetes as prefix:

  "k8s_labels": {
    "app_kubernetes_io/version": "vx.x.x",
    "app_kubernetes_io/instance": "kyverno",
    "app_kubernetes_io/managed-by": "Helm",
    "app.kubernetes.io/name": "kyverno",
    "helm_sh/chart": "kyverno-vx.x.x",
    "pod-template-hash": "86d9d7ddcf",
    "app_kubernetes_io/part-of": "kyverno",
    "app": "kyverno",
    "app_kubernetes_io/component": "kyverno"
  }

in this case just app.kubernetes.io/name is not parsed with underscores.

so far this issue was one of the best finds about this topic yet.

shikharagarwal19 commented 1 year ago

I was also facing the same issue, modified above mentioned lua script with some tweaks, one can find my solution here here

Hope it helps