fluent / fluent-operator

Operate Fluent Bit and Fluentd in the Kubernetes way - Previously known as FluentBit Operator
Apache License 2.0

bug: Cannot load Lua filter config map from the namespace where Filter is deployed #1339

Closed alternaivan closed 2 months ago

alternaivan commented 2 months ago

Describe the issue

Hello,

I am trying to apply the Lua rate limit filter via a Filter CRD and a ConfigMap. However, Fluent Bit throws the following errors and won't reload the config.

[error] [reload] reloaded config is invalid. Reloading is halted
[error] [reload] check properties and additonal vaildations for filter plugins is failed
[error] Failed pre_run callback on filter lua.24
[error] [filter_lua] filter cannot be loaded
[info] Config file changed, reloading...

These are the Filter and ConfigMap resources I'm deploying.

apiVersion: fluentbit.fluent.io/v1alpha2
kind: Filter
metadata:
  name: lua-filter
  namespace: default
  labels:
    fluentbit.fluent.io/component: logging
    fluentbit.fluent.io/enabled: "true"
spec:
  match: "kube.*"
  filters:
  - lua:
      call: "rate_limit"
      script:
        name: "rate-limit"
        key: "rate_limit.lua"
---
apiVersion: v1
data:
  rate_limit.lua: |
    --[[
       This Lua script rate limits logs based on a key. The Throttle filter in fluent-bit does not support rate limiting by key.
       sample configuration:
        [FILTER]
         Name lua
         Match kube.*
         script rate_limit.lua
         call rate_limit
    ]]

    local counter = {}
    local time = 0
    local group_key = "docker_id" -- Used to group logs. Groups are rate limited independently.
    local group_bucket_period_s = 60 -- This is the period of time in seconds over which group_bucket_limit applies.
    local group_bucket_limit = 1000 -- Maximum number of logs allowed per group over the period of group_bucket_period_s.

    -- With the above values, every container running on Kubernetes is limited to 1000 logs per 60 seconds, since containers have a unique kubernetes.docker_id value.

    local function get_current_time(timestamp)
        return math.floor(timestamp / group_bucket_period_s)
    end

    function rate_limit(tag, timestamp, record)
        local t = os.time()
        local current_time = get_current_time(t)
        if current_time ~= time then
            time = current_time
            counter = {} -- reset the counter
        end
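        local kubernetes = record["kubernetes"]
        local counter_key = kubernetes and kubernetes[group_key]
        if counter_key == nil then
            return 0, 0, 0 -- no group key on this record, keep the log unthrottled
        end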
        local logs_count = counter[counter_key]
        if logs_count == nil then
            counter[counter_key] = 1
        else
            counter[counter_key] = logs_count + 1
            if counter[counter_key] > group_bucket_limit then -- check if the number of logs is greater than group_bucket_limit
                return -1, 0, 0 -- drop the log
            end
        end
        return 0, 0, 0 -- keep the log
    end
kind: ConfigMap
metadata:
  name: rate-limit
  namespace: default

Fluent Operator is deployed in a separate logging namespace. After checking the logs and the configuration, it seems that the operator doesn't see the ConfigMap in the default namespace.

A workaround would be to deploy the ConfigMap in the logging namespace.

The proposed solution would be to extend the Filter CRD with a namespace definition and add the logic behind it, so that the operator can read ConfigMaps from the namespace where the Filter is deployed.
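As a rough sketch of that workaround, the same ConfigMap could be created in the namespace where Fluent Bit runs (logging in this report) instead of default. The name and key are kept identical so the Filter's script reference still matches. The script body is abbreviated here and is assumed to be the unchanged rate_limit.lua shown above.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: rate-limit
  # Assumption: Fluent Operator / Fluent Bit run in the "logging" namespace,
  # as described in this report.
  namespace: logging
data:
  rate_limit.lua: |
    -- Placeholder: paste the full rate_limit.lua script from this issue here.
```

The Filter itself can stay in the default namespace. Only the ConfigMap moves.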

To Reproduce

Create the two resources above (Filter and ConfigMap) in the default namespace and check the fluent-bit logs and configuration.

Expected behavior

The Filter configuration is reloaded correctly, the ConfigMap from the Filter's namespace is rendered, and fluent-bit starts operating as expected.

Your Environment

- Fluent Operator version: 3.0.0
- Container Runtime: containerd
- Operating system: Ubuntu 22.04
- Kernel version: 5.15.0-116-generic

How did you install fluent operator?

Via Helm.

Additional context

No response

cw-Guo commented 2 months ago

The ConfigMap's namespace should be the same as the one set in the ClusterFluentBitConfig. See https://github.com/fluent/fluent-operator/blob/master/apis/fluentbit/v1alpha2/clusterfluentbitconfig_types.go#L49-L51
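For illustration only, a minimal sketch of what this might look like, assuming the field at the linked lines is exposed as spec.namespace. The resource name and the selector labels are placeholders borrowed from this issue, not a verified setup.

```yaml
apiVersion: fluentbit.fluent.io/v1alpha2
kind: ClusterFluentBitConfig
metadata:
  name: fluent-bit-config   # placeholder name
spec:
  # Assumption: this is the field defined at the linked lines. ConfigMaps and
  # Secrets referenced by the config would then be looked up in this namespace.
  namespace: default
  filterSelector:
    matchLabels:
      fluentbit.fluent.io/enabled: "true"
```

Note that this setting applies to the whole config, so it moves every ConfigMap and Secret lookup to that namespace rather than only the Lua script.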

alternaivan commented 2 months ago

Hi @cw-Guo,

Thanks for the response! I understand that if you specify the namespace in the ClusterFluentBitConfig, the secrets and configmaps will be loaded from that namespace. However, this is not what we want to achieve.

We want to be able to add a Lua Filter in a specific namespace (e.g. default), with its script defined as a ConfigMap in the same namespace as the Filter. The default ConfigMaps and Secrets, on the other hand, should still be taken from the namespace where fluent-bit is deployed (e.g. logging).

Is it possible to do that?

Thanks, Marjan

cw-Guo commented 2 months ago

Hi @alternaivan, you can try FluentBitConfig, which is basically the namespaced CR solution.

see https://github.com/fluent/fluent-operator/issues/580

alternaivan commented 2 months ago

Hi @cw-Guo,

I checked it out, and it doesn't have a namespace field like ClusterFluentBitConfig does.

Am I missing something?

Thanks!

cw-Guo commented 2 months ago

@alternaivan No, the FluentBitConfig has to be in the same namespace as your ConfigMap (default).

alternaivan commented 2 months ago

@cw-Guo, yes, we have it deployed, and that is how we configure the filter selection. However, this doesn't help with ConfigMaps for Lua scripts. The ConfigMaps are read from the namespace where fluent-bit is running, not from the namespace where the Filter is deployed. I wasn't able to find a way to make this work.

Here is the FluentBitConfig we deploy in the default namespace.

apiVersion: fluentbit.fluent.io/v1alpha2
kind: FluentBitConfig
metadata:
  name: fluent-bit-config
spec:
  filterSelector:
    matchLabels:
      fluentbit.fluent.io/component: logging
      fluentbit.fluent.io/enabled: "true"
  outputSelector:
    matchLabels:
      fluentbit.fluent.io/component: logging
      fluentbit.fluent.io/enabled: "true"
  parserSelector:
    matchLabels:
      fluentbit.fluent.io/component: logging
      fluentbit.fluent.io/enabled: "true"

cw-Guo commented 2 months ago

@alternaivan You are right. In https://github.com/fluent/fluent-operator/blob/master/controllers/fluentbitconfig_controller.go#L183-L186 only cluster filters are processed for Lua scripts. This is a bug.