fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0

Grep filter does not work on json complex structure #3185

Closed vDMG closed 3 years ago

vDMG commented 3 years ago

Bug Report

Issue with Grep Filter on complex JSON structure

I'm trying to filter logs from a file using the grep filter. I want fluent-bit to output only the logs that contain the impersonatedUser key.

To Reproduce

config-file:

[SERVICE]
    Flush           1
    Daemon          off
    Log_Level       debug
    Parsers_File    parsers.conf
    Plugins_File    plugins.conf
[INPUT]
    name   tail
    path   /tmp/kube-audit.log
    parser json

[FILTER]
    name   grep
    match  **
    regex  impersonatedUser .*

[OUTPUT]
    name   stdout
    match  **
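
The parsers.conf referenced by Parsers_File is not included in the report. A minimal sketch of what it would need for the "parser json" line to resolve, assuming a plain JSON parser with no time settings (the reporter's actual file may differ):

parsers.conf (assumed):

[PARSER]
    Name   json
    Format json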

log examples:

{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"26cc75b8-2406-4423-8d67-822e284338d4","stage":"ResponseStarted","requestURI":"/api/v1/secrets?includeObject=Object\u0026resourceVersion=72118354\u0026timeout=30m0s\u0026timeoutSeconds=1800\u0026watch=true","verb":"watch","user":{"username":"system:serviceaccount:cattle-system:cattle","uid":"90652f83-7689-45b2-b1aa-98116139785e","groups":["system:serviceaccounts","system:serviceaccounts:cattle-system","system:authenticated"]},"impersonatedUser":{"username":"u-toto","groups":["googleoauth_group://tata","system:authenticated","system:cattle:authenticated"]},"sourceIPs":["172.16.20.1"],"userAgent":"agent/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"secrets","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Success","message":"Connection closed early","code":200},"requestReceivedTimestamp":"2021-02-25T17:37:22.861878Z","stageTimestamp":"2021-02-25T17:37:22.862892Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"clusterrolebinding-66jsq\" of ClusterRole \"cluster-owner\" to Group \"googleoauth_group://tata\""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"88ce1221-cef2-4e1c-8a1b-9597e58cecef","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/kube-system/configmaps/cattle-agent-controllers?timeout=15m0s","verb":"get","user":{"username":"system:serviceaccount:cattle-system:cattle","uid":"90652f83-7689-45b2-b1aa-98116139785e","groups":["system:serviceaccounts","system:serviceaccounts:cattle-system","system:authenticated"]},"sourceIPs":["172.16.20.1"],"userAgent":"agent/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"configmaps","namespace":"kube-system","name":"cattle-agent-controllers","apiVersion":"v1"},"responseStatus":{"metadata":{},"code":200},"requestReceivedTimestamp":"2021-03-04T00:12:58.417176Z","stageTimestamp":"2021-03-04T00:12:58.420912Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"cattle-admin-binding\" of ClusterRole \"cattle-admin\" to ServiceAccount \"cattle/cattle-system\""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"26cc75b8-2406-4423-8d67-822e284338d4","stage":"ResponseStarted","requestURI":"/api/v1/secrets?includeObject=Object\u0026resourceVersion=72118354\u0026timeout=30m0s\u0026timeoutSeconds=1800\u0026watch=true","verb":"watch","user":{"username":"system:serviceaccount:cattle-system:cattle","uid":"90652f83-7689-45b2-b1aa-98116139785e","groups":["system:serviceaccounts","system:serviceaccounts:cattle-system","system:authenticated"]},"impersonatedUser":{"username":"u-toto","groups":["googleoauth_group://tata","system:authenticated","system:cattle:authenticated"]},"sourceIPs":["172.16.20.1"],"userAgent":"agent/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"secrets","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Success","message":"Connection closed early","code":200},"requestReceivedTimestamp":"2021-02-25T17:37:22.861878Z","stageTimestamp":"2021-02-25T17:37:22.862892Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"clusterrolebinding-66jsq\" of ClusterRole \"cluster-owner\" to Group \"googleoauth_group://tata\""}}
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"88ce1221-cef2-4e1c-8a1b-9597e58cecef","stage":"ResponseComplete","requestURI":"/api/v1/namespaces/kube-system/configmaps/cattle-agent-controllers?timeout=15m0s","verb":"get","user":{"username":"system:serviceaccount:cattle-system:cattle","uid":"90652f83-7689-45b2-b1aa-98116139785e","groups":["system:serviceaccounts","system:serviceaccounts:cattle-system","system:authenticated"]},"sourceIPs":["172.16.20.1"],"userAgent":"agent/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"configmaps","namespace":"kube-system","name":"cattle-agent-controllers","apiVersion":"v1"},"responseStatus":{"metadata":{},"code":200},"requestReceivedTimestamp":"2021-03-04T00:12:58.417176Z","stageTimestamp":"2021-03-04T00:12:58.420912Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"cattle-admin-binding\" of ClusterRole \"cattle-admin\" to ServiceAccount \"cattle/cattle-system\""}}

Expected behavior

Fluent-bit should output logs containing the field impersonatedUser

{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"26cc75b8-2406-4423-8d67-822e284338d4","stage":"ResponseStarted","requestURI":"/api/v1/secrets?includeObject=Object\u0026resourceVersion=72118354\u0026timeout=30m0s\u0026timeoutSeconds=1800\u0026watch=true","verb":"watch","user":{"username":"system:serviceaccount:cattle-system:cattle","uid":"90652f83-7689-45b2-b1aa-98116139785e","groups":["system:serviceaccounts","system:serviceaccounts:cattle-system","system:authenticated"]},"impersonatedUser":{"username":"u-toto","groups":["googleoauth_group://tata","system:authenticated","system:cattle:authenticated"]},"sourceIPs":["172.16.20.1"],"userAgent":"agent/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"secrets","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Success","message":"Connection closed early","code":200},"requestReceivedTimestamp":"2021-02-25T17:37:22.861878Z","stageTimestamp":"2021-02-25T17:37:22.862892Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"clusterrolebinding-66jsq\" of ClusterRole \"cluster-owner\" to Group \"googleoauth_group://tata\""}}  
{"kind":"Event","apiVersion":"audit.k8s.io/v1","level":"Metadata","auditID":"26cc75b8-2406-4423-8d67-822e284338d4","stage":"ResponseStarted","requestURI":"/api/v1/secrets?includeObject=Object\u0026resourceVersion=72118354\u0026timeout=30m0s\u0026timeoutSeconds=1800\u0026watch=true","verb":"watch","user":{"username":"system:serviceaccount:cattle-system:cattle","uid":"90652f83-7689-45b2-b1aa-98116139785e","groups":["system:serviceaccounts","system:serviceaccounts:cattle-system","system:authenticated"]},"impersonatedUser":{"username":"u-toto","groups":["googleoauth_group://tata","system:authenticated","system:cattle:authenticated"]},"sourceIPs":["172.16.20.1"],"userAgent":"agent/v0.0.0 (linux/amd64) kubernetes/$Format","objectRef":{"resource":"secrets","apiVersion":"v1"},"responseStatus":{"metadata":{},"status":"Success","message":"Connection closed early","code":200},"requestReceivedTimestamp":"2021-02-25T17:37:22.861878Z","stageTimestamp":"2021-02-25T17:37:22.862892Z","annotations":{"authorization.k8s.io/decision":"allow","authorization.k8s.io/reason":"RBAC: allowed by ClusterRoleBinding \"clusterrolebinding-66jsq\" of ClusterRole \"cluster-owner\" to Group \"googleoauth_group://tata\""}}

Your Environment

Additional context

I've tried many different configurations and settings to solve this, such as using the record accessor with $impersonatedUser or removing the upper case, but I could not make it work.

agup006 commented 3 years ago

Hmm, I could repro this by using $impersonatedUser and the same regex. I'm wondering if this is failing because there are no fields for $impersonatedUser, only nested JSON. @nokute78 anything come to mind?

When I used the following grep filter, it worked as intended:

[FILTER]
    name grep
    match log
    regex $impersonatedUser['username'] .*

nokute78 commented 3 years ago

@vDMG Could you test @agup006's configuration? I confirmed that fluent-bit output what you expected.

It is:

[SERVICE]
    Flush           1
    Daemon          off
    Log_Level       debug
    Parsers_File    parsers.conf

[INPUT]
    name   tail
    path   a.log
    Read_From_Head on
    Parser json

[FILTER]
    name   grep
    match  *
    regex  $impersonatedUser['username'] .+

[OUTPUT]
    name   stdout
    match  **

@agup006 Hmm, I have no idea... The difference comes from the return value of subkey_to_object, which changes depending on whether the record accessor is used. https://github.com/fluent/fluent-bit/blob/86f215a98250088189118547147ed1f2597a9b93/src/flb_ra_key.c#L344-L347

vDMG commented 3 years ago

Thanks @agup006 and @nokute78, I'll go with:

regex  $impersonatedUser['username'] .+

It works :smile:
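
For reference, the original reproduction config with the working rule from this thread applied might look like the sketch below. It is assembled from the configs posted above, not a tested file: the Plugins_File line is dropped as in nokute78's version, and the match patterns are set to * as in nokute78's filter. The rule targets the string subkey username rather than the nested impersonatedUser map, which is the only form confirmed to work here.

fluent-bit.conf (sketch):

[SERVICE]
    Flush           1
    Daemon          off
    Log_Level       debug
    Parsers_File    parsers.conf

[INPUT]
    name   tail
    path   /tmp/kube-audit.log
    parser json
    # nokute78's test config also set the following to read lines already in the file:
    # Read_From_Head on

[FILTER]
    name   grep
    match  *
    # Match on a string subkey of the nested map, not on the map itself
    regex  $impersonatedUser['username'] .+

[OUTPUT]
    name   stdout
    match  *

With .+ the username has to be non-empty, so records without an impersonatedUser.username value should be dropped, which matches the expected output in the report.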