fabric8io / fluent-plugin-kubernetes_metadata_filter

Enrich your fluentd events with Kubernetes metadata
Apache License 2.0

Improve memory usage #170

Closed: janario closed this issue 3 years ago

janario commented 5 years ago

We had a scenario where a lot of pods (started from a cron job) were stuck in the Pending state, around 9k of them.

I know it was a problem with our internal setup, but because of it we noticed that fluentd started to crash (OOM-killed).

Trying to understand the scenario and removing parts of our fluentd configuration, we noticed that the metadata filter was the problem, and that it was because we had too many pods.

It would be good for the plugin to have lower memory consumption.

richm commented 5 years ago

> We had a scenario where a lot of pods (started from a cron job) were stuck in the Pending state, around 9k of them.
>
> I know it was a problem with our internal setup, but because of it we noticed that fluentd started to crash (OOM-killed).
>
> Trying to understand the scenario and removing parts of our fluentd configuration, we noticed that the metadata filter was the problem, and that it was because we had too many pods.

How do you know it was the fluent-plugin-kubernetes_metadata_filter which was the problem? Was the OOM kill stacktrace in this plugin code?

> It would be good for the plugin to have lower memory consumption.

Have you tried adjusting cache_size and cache_ttl?
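
For reference, a minimal sketch of how those options are set on the filter (the tag pattern and the values here are illustrative, not recommendations):

<filter kubernetes.**>
  @type kubernetes_metadata
  # maximum number of pod/namespace metadata entries kept in the LRU cache
  cache_size 1000
  # seconds before a cached entry expires and the metadata is fetched again
  cache_ttl 3600
</filter>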

janario commented 5 years ago

I started removing pieces from my fluentd.conf one by one.

Basically my config is (a fuller sketch follows this list):

<source> for apps
<source> for kube-system
<source> for one file not managed by Kubernetes

<filter> with kubernetes_metadata for apps and kube-system
<match> to send everything to CloudWatch Logs
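
A hedged sketch of that shape (the paths, tags, and CloudWatch parameters below are illustrative assumptions, not my actual values):

<source>
  @type tail
  path /var/log/containers/*.log
  pos_file /var/log/fluentd-containers.pos
  tag apps.*
  <parse>
    @type json
  </parse>
</source>

# ...similar <source> blocks for kube-system and for the one file not managed by Kubernetes...

<filter apps.** kube-system.**>
  @type kubernetes_metadata
</filter>

<match **>
  @type cloudwatch_logs
  region eu-central-1
  log_group_name kubernetes
  log_stream_name fluentd
  auto_create_stream true
</match>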

Removing the sources, it kept crashing; then I removed just the filter and it stopped crashing.

I didn't try the cache_size options yet.


My oom-killer logs:

Apr 24 11:51:07 ip-10-0-74-72 kernel: filter_kuberne* invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null),  order=0, oom_score_adj=975
Apr 24 11:51:07 ip-10-0-74-72 kernel: filter_kuberne* cpuset=4dd475111b9c266dadd3132047c2baba4f7afe2ec9e7d895d0efe76f9806d0cd mems_allowed=0
Apr 24 11:51:07 ip-10-0-74-72 kernel: CPU: 1 PID: 28750 Comm: filter_kuberne* Not tainted 4.14.97-90.72.amzn2.x86_64 #1
Apr 24 11:51:07 ip-10-0-74-72 kernel: Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
Apr 24 11:51:07 ip-10-0-74-72 kernel: Call Trace:
Apr 24 11:51:07 ip-10-0-74-72 kernel: dump_stack+0x5c/0x82
Apr 24 11:51:07 ip-10-0-74-72 kernel: dump_header+0x94/0x229
Apr 24 11:51:07 ip-10-0-74-72 kernel: oom_kill_process+0x213/0x410
Apr 24 11:51:07 ip-10-0-74-72 kernel: out_of_memory+0x2af/0x4d0
Apr 24 11:51:07 ip-10-0-74-72 kernel: mem_cgroup_out_of_memory+0x49/0x80
Apr 24 11:51:07 ip-10-0-74-72 kernel: mem_cgroup_oom_synchronize+0x2ed/0x330
Apr 24 11:51:07 ip-10-0-74-72 kernel: ? mem_cgroup_css_online+0x30/0x30
Apr 24 11:51:07 ip-10-0-74-72 kernel: pagefault_out_of_memory+0x32/0x77
Apr 24 11:51:07 ip-10-0-74-72 kernel: __do_page_fault+0x4b4/0x4c0
Apr 24 11:51:07 ip-10-0-74-72 kernel: ? page_fault+0x2f/0x50
Apr 24 11:51:07 ip-10-0-74-72 kernel: page_fault+0x45/0x50
Apr 24 11:51:07 ip-10-0-74-72 kernel: RIP: 4000:0xffffffffffffffff
Apr 24 11:51:07 ip-10-0-74-72 kernel: RSP: 1700000:00007f268a5fde78 EFLAGS: 7f2688dd9000
Apr 24 11:51:07 ip-10-0-74-72 kernel: Task in /kubepods/burstable/podc43ee131-6686-11e9-8e21-06eeaf1192dc/4dd475111b9c266dadd3132047c2baba4f7afe2ec9e7d895d0efe76f9806d0cd killed as a result of limit of /kubepods/burstable/podc43ee131-6686-11e9-8e21-06eeaf1192dc
Apr 24 11:51:07 ip-10-0-74-72 kernel: memory: usage 524288kB, limit 524288kB, failcnt 1864
Apr 24 11:51:07 ip-10-0-74-72 kernel: memory+swap: usage 524288kB, limit 9007199254740988kB, failcnt 0
Apr 24 11:51:07 ip-10-0-74-72 kernel: kmem: usage 5152kB, limit 9007199254740988kB, failcnt 0
Apr 24 11:51:07 ip-10-0-74-72 kernel: Memory cgroup stats for /kubepods/burstable/podc43ee131-6686-11e9-8e21-06eeaf1192dc: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Apr 24 11:51:07 ip-10-0-74-72 kernel: Memory cgroup stats for /kubepods/burstable/podc43ee131-6686-11e9-8e21-06eeaf1192dc/3ac28d0df3c378e3218f0e2df1a3993aacb2d229b0b7f540a48f4f87d64eed61: cache:0KB rss:44KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:44KB inactive_file:0KB active_file:0KB unevictable:0KB
Apr 24 11:51:07 ip-10-0-74-72 kernel: Memory cgroup stats for /kubepods/burstable/podc43ee131-6686-11e9-8e21-06eeaf1192dc/4dd475111b9c266dadd3132047c2baba4f7afe2ec9e7d895d0efe76f9806d0cd: cache:0KB rss:519092KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:519092KB inactive_file:0KB active_file:0KB unevictable:0KB
Apr 24 11:51:07 ip-10-0-74-72 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Apr 24 11:51:07 ip-10-0-74-72 kernel: [26696]     0 26696      256        1       4       2        0          -998 pause
Apr 24 11:51:07 ip-10-0-74-72 kernel: [26957]     0 26957     4170      312      11       3        0           975 tini
Apr 24 11:51:07 ip-10-0-74-72 kernel: [26985]     0 26985   154525   111157     296       4        0           975 ruby2.3
Apr 24 11:51:07 ip-10-0-74-72 kernel: [27107]     0 27107     6687      532      18       3        0           975 bash
Apr 24 11:51:07 ip-10-0-74-72 kernel: [27610]     0 27610    64586    22055     130       3        0           975 ruby2.3
Apr 24 11:51:07 ip-10-0-74-72 kernel: Memory cgroup out of memory: Kill process 26985 (ruby2.3) score 1824 or sacrifice child
Apr 24 11:51:07 ip-10-0-74-72 kernel: Killed process 27610 (ruby2.3) total-vm:258344kB, anon-rss:80628kB, file-rss:7592kB, shmem-rss:0kB
Apr 24 11:51:10 ip-10-0-74-72 kernel: oom_reaper: reaped process 27610 (ruby2.3), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
richm commented 5 years ago

I'm closing this issue. Please open a new issue if cache_size and cache_ttl do not solve the oom problem.

janario commented 5 years ago

OK, I tried with cache_size and cache_ttl. Same error.

I created the 5k pods with:

pending-pod.yaml

#  kubectl create ns pending
#  kubectl apply -f pending-pod.yaml
# wait until 'kubectl -n pending get pods | wc -l' reports ~5k
apiVersion: apps/v1
kind: Deployment
metadata:
  name: pending
  namespace: pending
spec:
  replicas: 5000
  selector:
    matchLabels:
      app: pending
  template:
    metadata:
      labels:
        app: pending
    spec:
      containers:
        - name: pending
          image: fluent/fluentd-kubernetes-daemonset:v1.3-debian-cloudwatch-1
          volumeMounts:
            - name: vol
              mountPath: /invalid
              subPath: invalid
      volumes:
        - name: vol
          persistentVolumeClaim:
            claimName: invalid-pvc

And I waited until all of them were created. (They will stay in the Pending state, but what matters here is the quantity of pods.)

Then I created a pod with fluentd and only the kubernetes_metadata filter:

fluentd-pod.yaml

# kubectl create ns fluentd
# kubectl apply -f fluentd-pod.yaml
# it will take some time to start
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd
  namespace: fluentd
data:
  fluent.conf: |
    <match fluent.**>
      @type null
    </match>

    <filter apps.**>
      @type kubernetes_metadata
      cache_size 10
      #cache_ttl 60
    </filter>
---

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: fluentd
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  namespace: fluentd
rules:
  - apiGroups: [""]
    resources: ["namespaces", "pods"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluentd
  namespace: fluentd
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: fluentd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluentd
---
apiVersion: v1
kind: Pod
metadata:
  name: fluentd
  namespace: fluentd
spec:
  serviceAccountName: fluentd
  initContainers:
    - name: copy-fluentd-config
      image: busybox
      command: ['sh', '-c', 'cp /config-volume/* /etc/fluentd']
      volumeMounts:
        - mountPath: /config-volume
          name: config-volume
        - mountPath: /etc/fluentd
          name: config
  containers:
    - name: fluentd
      image: fluent/fluentd-kubernetes-daemonset:v1.3-debian-cloudwatch-1
      imagePullPolicy: Always
      env:
        - name: AWS_REGION
          value: eu-central-1
        - name: LOG_GROUP_NAME
          value: kubernetes
        - name: FLUENT_UID
          value: "0"

        - name: RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR
          value: "0.8"
        - name: K8S_NAMESPACE
          valueFrom:
            fieldRef:
              fieldPath: metadata.namespace
        - name: K8S_POD_NAME
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: K8S_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
      resources:
        limits:
          cpu: 100m
          memory: 300Mi
      volumeMounts:
        - name: config
          mountPath: /fluentd/etc
  terminationGracePeriodSeconds: 30
  volumes:
    - name: config
      emptyDir: {}
    - name: config-volume
      configMap:
        name: fluentd

Because of the quantity of pods, the cluster becomes a little slow; fluentd starts in around 2 minutes and crashes after about 2 minutes of running.

After the restart you can see in the pod's describe output:

   Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Wed, 24 Apr 2019 23:22:38 +0200
      Finished:     Wed, 24 Apr 2019 23:23:51 +0200

and in the logs (in my case /var/log/messages):

Apr 24 21:23:46 ip-172-20-4-124 kernel: filter_kuberne* invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null),  order=0, oom_score_adj=-998
Apr 24 21:23:46 ip-172-20-4-124 kernel: filter_kuberne* cpuset=7d1ef59f779cc7ac6c7de2926e06b081c09765403d827b56f26f43a58364fa4a mems_allowed=0
Apr 24 21:23:46 ip-172-20-4-124 kernel: CPU: 0 PID: 24366 Comm: filter_kuberne* Not tainted 4.14.97-90.72.amzn2.x86_64 #1
Apr 24 21:23:46 ip-172-20-4-124 kernel: Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
Apr 24 21:23:46 ip-172-20-4-124 kernel: Call Trace:
Apr 24 21:23:46 ip-172-20-4-124 kernel: dump_stack+0x5c/0x82
Apr 24 21:23:46 ip-172-20-4-124 kernel: dump_header+0x94/0x229
Apr 24 21:23:46 ip-172-20-4-124 kernel: oom_kill_process+0x213/0x410
Apr 24 21:23:46 ip-172-20-4-124 kernel: out_of_memory+0x2af/0x4d0
Apr 24 21:23:46 ip-172-20-4-124 kernel: mem_cgroup_out_of_memory+0x49/0x80
Apr 24 21:23:46 ip-172-20-4-124 kernel: mem_cgroup_oom_synchronize+0x2ed/0x330
Apr 24 21:23:46 ip-172-20-4-124 kernel: ? mem_cgroup_css_online+0x30/0x30
Apr 24 21:23:46 ip-172-20-4-124 kernel: pagefault_out_of_memory+0x32/0x77
Apr 24 21:23:46 ip-172-20-4-124 kernel: __do_page_fault+0x4b4/0x4c0
Apr 24 21:23:46 ip-172-20-4-124 kernel: ? page_fault+0x2f/0x50
Apr 24 21:23:46 ip-172-20-4-124 kernel: page_fault+0x45/0x50
Apr 24 21:23:46 ip-172-20-4-124 kernel: RIP: af416800:0x7f7e9c380fc0
Apr 24 21:23:46 ip-172-20-4-124 kernel: RSP: 0120:00007f7ea5806000 EFLAGS: 000000a0
Apr 24 21:23:46 ip-172-20-4-124 kernel: Task in /kubepods/pod709a51df-66d6-11e9-8111-027eee44ea72/c8bb0194b765371387720dce02218838b80c9c9320bf9a1ab6812d8ef209e17f killed as a result of limit of /kubepods/pod709a51df-66d6-11e9-8111-027eee44ea72
Apr 24 21:23:46 ip-172-20-4-124 kernel: memory: usage 307200kB, limit 307200kB, failcnt 96
Apr 24 21:23:46 ip-172-20-4-124 kernel: memory+swap: usage 307200kB, limit 9007199254740988kB, failcnt 0
Apr 24 21:23:46 ip-172-20-4-124 kernel: kmem: usage 3800kB, limit 9007199254740988kB, failcnt 0
Apr 24 21:23:46 ip-172-20-4-124 kernel: Memory cgroup stats for /kubepods/pod709a51df-66d6-11e9-8111-027eee44ea72: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Apr 24 21:23:46 ip-172-20-4-124 kernel: Memory cgroup stats for /kubepods/pod709a51df-66d6-11e9-8111-027eee44ea72/7d1ef59f779cc7ac6c7de2926e06b081c09765403d827b56f26f43a58364fa4a: cache:0KB rss:303356KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:303356KB inactive_file:0KB active_file:0KB unevictable:0KB
Apr 24 21:23:46 ip-172-20-4-124 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Apr 24 21:23:46 ip-172-20-4-124 kernel: [24043]     0 24043      256        1       4       2        0          -998 pause
Apr 24 21:23:46 ip-172-20-4-124 kernel: [24285]     0 24285     4170      349      11       3        0          -998 tini
Apr 24 21:23:46 ip-172-20-4-124 kernel: [24308]     0 24308    79101    42881     151       3        0          -998 ruby2.3
Apr 24 21:23:46 ip-172-20-4-124 kernel: [24372]     0 24372    71192    36727     139       4        0          -998 ruby2.3
Apr 24 21:23:46 ip-172-20-4-124 kernel: Memory cgroup out of memory: Kill process 24043 (pause) score 0 or sacrifice child
Apr 24 21:23:46 ip-172-20-4-124 kernel: Killed process 24043 (pause) total-vm:1024kB, anon-rss:4kB, file-rss:0kB, shmem-rss:0kB
Apr 24 21:23:48 ip-172-20-4-124 kernel: oom_reaper: reaped process 24043 (pause), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Apr 24 21:23:49 ip-172-20-4-124 kernel: filter_kuberne* invoked oom-killer: gfp_mask=0x14000c0(GFP_KERNEL), nodemask=(null),  order=0, oom_score_adj=-998
Apr 24 21:23:49 ip-172-20-4-124 kernel: filter_kuberne* cpuset=7d1ef59f779cc7ac6c7de2926e06b081c09765403d827b56f26f43a58364fa4a mems_allowed=0
Apr 24 21:23:49 ip-172-20-4-124 kernel: CPU: 0 PID: 24366 Comm: filter_kuberne* Not tainted 4.14.97-90.72.amzn2.x86_64 #1
Apr 24 21:23:49 ip-172-20-4-124 kernel: Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
Apr 24 21:23:49 ip-172-20-4-124 kernel: Call Trace:
Apr 24 21:23:49 ip-172-20-4-124 kernel: dump_stack+0x5c/0x82
Apr 24 21:23:49 ip-172-20-4-124 kernel: dump_header+0x94/0x229
Apr 24 21:23:49 ip-172-20-4-124 kernel: oom_kill_process+0x213/0x410
Apr 24 21:23:49 ip-172-20-4-124 kernel: out_of_memory+0x2af/0x4d0
Apr 24 21:23:49 ip-172-20-4-124 kernel: mem_cgroup_out_of_memory+0x49/0x80
Apr 24 21:23:49 ip-172-20-4-124 kernel: mem_cgroup_oom_synchronize+0x2ed/0x330
Apr 24 21:23:49 ip-172-20-4-124 kernel: ? mem_cgroup_css_online+0x30/0x30
Apr 24 21:23:49 ip-172-20-4-124 kernel: pagefault_out_of_memory+0x32/0x77
Apr 24 21:23:49 ip-172-20-4-124 kernel: __do_page_fault+0x4b4/0x4c0
Apr 24 21:23:49 ip-172-20-4-124 kernel: ? page_fault+0x2f/0x50
Apr 24 21:23:49 ip-172-20-4-124 kernel: page_fault+0x45/0x50
Apr 24 21:23:49 ip-172-20-4-124 kernel: RIP: af416800:0x7f7e9c38afc0
Apr 24 21:23:49 ip-172-20-4-124 kernel: RSP: 0120:00007f7ea5806000 EFLAGS: 000000a0
Apr 24 21:23:49 ip-172-20-4-124 kernel: Task in /kubepods/pod709a51df-66d6-11e9-8111-027eee44ea72/7d1ef59f779cc7ac6c7de2926e06b081c09765403d827b56f26f43a58364fa4a killed as a result of limit of /kubepods/pod709a51df-66d6-11e9-8111-027eee44ea72
Apr 24 21:23:49 ip-172-20-4-124 kernel: memory: usage 307200kB, limit 307200kB, failcnt 132
Apr 24 21:23:49 ip-172-20-4-124 kernel: memory+swap: usage 307200kB, limit 9007199254740988kB, failcnt 0
Apr 24 21:23:49 ip-172-20-4-124 kernel: kmem: usage 3768kB, limit 9007199254740988kB, failcnt 0
Apr 24 21:23:49 ip-172-20-4-124 kernel: Memory cgroup stats for /kubepods/pod709a51df-66d6-11e9-8111-027eee44ea72: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Apr 24 21:23:49 ip-172-20-4-124 kernel: Memory cgroup stats for /kubepods/pod709a51df-66d6-11e9-8111-027eee44ea72/c8bb0194b765371387720dce02218838b80c9c9320bf9a1ab6812d8ef209e17f: cache:0KB rss:0KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:0KB inactive_file:0KB active_file:0KB unevictable:0KB
Apr 24 21:23:49 ip-172-20-4-124 kernel: Memory cgroup stats for /kubepods/pod709a51df-66d6-11e9-8111-027eee44ea72/7d1ef59f779cc7ac6c7de2926e06b081c09765403d827b56f26f43a58364fa4a: cache:0KB rss:303432KB rss_huge:0KB shmem:0KB mapped_file:0KB dirty:0KB writeback:0KB swap:0KB inactive_anon:0KB active_anon:303432KB inactive_file:0KB active_file:0KB unevictable:0KB
Apr 24 21:23:49 ip-172-20-4-124 kernel: [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Apr 24 21:23:49 ip-172-20-4-124 kernel: [24285]     0 24285     4170      349      11       3        0          -998 tini
Apr 24 21:23:49 ip-172-20-4-124 kernel: [24308]     0 24308    79101    42881     151       3        0          -998 ruby2.3
Apr 24 21:23:49 ip-172-20-4-124 kernel: [24372]     0 24372    71192    36727     139       4        0          -998 ruby2.3
Apr 24 21:23:49 ip-172-20-4-124 kernel: Memory cgroup out of memory: Kill process 24285 (tini) score 0 or sacrifice child
Apr 24 21:23:49 ip-172-20-4-124 kernel: Killed process 24308 (ruby2.3) total-vm:316404kB, anon-rss:163352kB, file-rss:8172kB, shmem-rss:0kB
richm commented 5 years ago

Hmm, then maybe it has nothing to do with the LRU cache; even a setting of 1000 (the default) should not cause an OOM in this case. So I'm not sure what the problem is. Can you afford to increase your fluentd memory? If so, can you eventually increase the fluentd memory to the point where you no longer get an OOM? Also, are you using jemalloc with fluentd?
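
For reference, jemalloc is usually enabled by preloading it into the fluentd process; a hypothetical env entry for the fluentd container (the library path is an assumption and varies by base image):

env:
  - name: LD_PRELOAD
    # adjust to wherever libjemalloc is installed in your image
    value: /usr/lib/libjemalloc.so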

janario commented 5 years ago

My guess is that it happens when it consumes the Kubernetes API to load the pods.

But first of all, my scenario of 5k pods only happened because of a mistake in a dev environment, where a cron started too many pods that ended up in Pending; this won't happen in my real scenario.

I opened the issue as an improvement request, to try to keep the memory usage as low as possible. We have noticed that fluentd in general consumes too much memory, and we are trying to understand which parts of it do and why.

Improvements to fluent-plugin-kubernetes_metadata_filter would be very welcome for the whole stack :-)

jcantrill commented 5 years ago

@janario I think you have to ask yourself what is reasonable for fluentd given what you are seeing. You are restricting the entire process to 300M. This memory needs to account for everything used by the Ruby runtime, fluentd's pipelines, in-memory caches, processing, and so on. Additionally, this plugin adds caching of the labels (and annotations, if configured) from every pod spec. I would imagine that with a simple back-of-the-napkin calculation you could easily justify the metadata cache alone eating all of the 300M. If you want the metadata, then you will need to budget for it in the collector.

Feel free to submit any PRs to improve the caching mechanism's memory usage.

janario commented 5 years ago

I agree that 300Mi is maybe not the best value

But when you consider that the cluster has a lot of pods, not all of them will run on the same worker node, so not all of them will be handled by the fluentd running on that worker node (it is a daemonset)

The problem I see here is that I have to increase fluentd's memory according to my whole cluster size, not according to the pods on its worker node

jcantrill commented 5 years ago

> I agree that 300Mi is maybe not the best value

The cache is LRU. You might consider actually doing the opposite of what @richm suggested and lowering the number of entries since you are trying to limit the memory usage. The consequences here, however, are that you will be placing more load on the API server.
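
For example (the values here are arbitrary, only to illustrate the trade-off between memory and API load):

<filter kubernetes.**>
  @type kubernetes_metadata
  # fewer cached entries: less memory, but more cache misses
  cache_size 100
  # shorter TTL: entries expire sooner, so more requests to the API server
  cache_ttl 300
</filter>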

> But when you consider that the cluster has a lot of pods, not all of them will run on the same worker node, so not all of them will be handled by the fluentd running on that worker node (it is a daemonset)

Fluentd does not manage any of the pods; the container runtime is responsible for managing them. Additionally, the scheduler spreads pods across the cluster, and fluentd, as noted, is a daemonset, which means it only needs to cache metadata for the pods which are scheduled on its node, not for the entire cluster. The LRU cache will evict metadata for pods which fall to the "bottom" of the cache. I believe lowering the max entry count and/or the TTL may help with the issue you are seeing.

One other place which may warrant checking is the stats cache. It keeps some counts, which in the grand scheme of this issue probably isn't much, but I do not recall what measures are taken to evict that cache.

> The problem I see here is that I have to increase fluentd's memory according to my whole cluster size, not according to the pods on its worker node

This is why the platform gives you the option to require (min) memory and restrict (or not) memory. What you are saying is exactly the reason you would not restrict memory; as the workload increases, allow the platform to give fluentd more memory.
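
As a sketch of that option (placeholder values): request a baseline for fluentd but leave memory uncapped, so the node can give it more as the workload grows.

resources:
  requests:
    cpu: 100m
    memory: 300Mi
  # no memory limit: the container is not OOM-killed at a fixed ceiling,
  # though it can still be evicted under node memory pressure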

Let me reiterate that this feature comes at a price: memory. I fully understand it is undesirable to expect an infra component to eat up the precious memory that would otherwise be available to your other workloads. If you want this metadata you either need to:

janario commented 5 years ago

Sorry for the delay

Sorry if I was not clear; what I meant is: fluentd (filter) resources should not have to be increased as the cluster grows, but only as its own worker node's workload grows.

I've opened a PR; maybe it makes this clearer: https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/pull/177


It seems that get_pods(limit: 1) is not limiting; I would like to ask for someone's help with this :) (I'm not that familiar with Ruby)

richm commented 5 years ago

I get it now; https://github.com/fabric8io/fluent-plugin-kubernetes_metadata_filter/pull/177 explains it.

jcantrill commented 3 years ago

Closing; fixed by #189.