fluent / fluent-bit-kubernetes-logging

Fluent Bit Kubernetes Daemonset

Add out_kafka example #16

Closed. solsson closed this 6 years ago.

solsson commented 6 years ago

This is a rebase of #11. In addition:

StevenACoffman commented 6 years ago

@solsson I know this is merged, but I just applied this (changed topic and namespace, otherwise identical) and am getting this:

kangaroo/fluent-bit-rq45j[fluent-bit]: [2018/01/17 03:42:04] [error] [out_kafka] fluent-bit#producer-1: [thrd:bootstrap.kafka:9092/bootstrap]: bootstrap.kafka:9092/bootstrap: Receive failed: Disconnected
kangaroo/fluent-bit-rq45j[fluent-bit]: [2018/01/17 03:42:04] [error] [out_kafka] fluent-bit#producer-1: [thrd:bootstrap.kafka:9092/bootstrap]: bootstrap.kafka:9092/bootstrap: Receive failed: Disconnected
kangaroo/fluent-bit-dncm5[fluent-bit]: [2018/01/17 03:42:12] [error] [out_kafka] fluent-bit#producer-1: [thrd:bootstrap.kafka:9092/bootstrap]: bootstrap.kafka:9092/bootstrap: Receive failed: Disconnected
kangaroo/fluent-bit-dncm5[fluent-bit]: [2018/01/17 03:42:12] [error] [out_kafka] fluent-bit#producer-1: [thrd:bootstrap.kafka:9092/bootstrap]: bootstrap.kafka:9092/bootstrap: Receive failed: Disconnected
kangaroo/fluent-bit-lw88d[fluent-bit]: [2018/01/17 03:42:13] [error] [out_kafka] fluent-bit#producer-1: [thrd:bootstrap.kafka:9092/bootstrap]: bootstrap.kafka:9092/bootstrap: Receive failed: Disconnected
kangaroo/fluent-bit-lw88d[fluent-bit]: [2018/01/17 03:42:13] [error] [out_kafka] fluent-bit#producer-1: [thrd:bootstrap.kafka:9092/bootstrap]: bootstrap.kafka:9092/bootstrap: Receive failed: Disconnected
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:43:49] [ info] [engine] started
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:43:49] [ info] [out_kafka] brokers='bootstrap.kafka:9092' topics='k8s-firehose'
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:43:49] [ info] [filter_kube] https=1 host=kubernetes.default.svc port=443
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:43:49] [ info] [filter_kube] local POD info OK
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:43:49] [ info] [filter_kube] testing connectivity with API server...
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:43:50] [ info] [engine] started
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:43:50] [ info] [out_kafka] brokers='bootstrap.kafka:9092' topics='k8s-firehose'
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:43:50] [ info] [filter_kube] https=1 host=kubernetes.default.svc port=443
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:43:50] [ info] [filter_kube] local POD info OK
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:43:50] [ info] [filter_kube] testing connectivity with API server...
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:43:51] [ info] [filter_kube] API server connectivity OK
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:43:51] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:43:55] [ info] [filter_kube] API server connectivity OK
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:43:55] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:43:56] [error] [sqldb] error=disk I/O error
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:43:56] [error] [sqldb] error=disk I/O error
kangaroo/fluent-bit-g2lxp[fluent-bit]: socket: No buffer space available
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:43:56] [error] [io] could not create socket
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:43:56] [error] [filter_kube] upstream connection error
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:43:57] [error] [sqldb] error=disk I/O error
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:43:57] [error] [plugins/in_tail/tail_file.c:742 errno=2] No such file or directory
kangaroo/fluent-bit-77jx4[fluent-bit]: [2018/01/17 03:47:08] [error] [out_kafka] fluent-bit#producer-1: [thrd:bootstrap.kafka:9092/bootstrap]: bootstrap.kafka:9092/bootstrap: Receive failed: Disconnected
kangaroo/fluent-bit-77jx4[fluent-bit]: [2018/01/17 03:47:08] [error] [out_kafka] fluent-bit#producer-1: [thrd:bootstrap.kafka:9092/bootstrap]: bootstrap.kafka:9092/bootstrap: Receive failed: Disconnected
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:48:58] [ info] [engine] started
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:48:58] [ info] [out_kafka] brokers='bootstrap.kafka:9092' topics='k8s-firehose'
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:48:58] [ info] [filter_kube] https=1 host=kubernetes.default.svc port=443
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:48:58] [ info] [filter_kube] local POD info OK
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:48:58] [ info] [filter_kube] testing connectivity with API server...
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:49:04] [ info] [filter_kube] API server connectivity OK
kangaroo/fluent-bit-npsz8[fluent-bit]: [2018/01/17 03:49:04] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:49:09] [ info] [engine] started
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:49:09] [ info] [out_kafka] brokers='bootstrap.kafka:9092' topics='k8s-firehose'
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:49:09] [ info] [filter_kube] https=1 host=kubernetes.default.svc port=443
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:49:09] [ info] [filter_kube] local POD info OK
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:49:09] [ info] [filter_kube] testing connectivity with API server...
kangaroo/fluent-bit-g2lxp[fluent-bit]: [2018/01/17 03:49:14] [ info] [filter_kube] API server connectivity OK

I'm going to investigate more in the morning.

solsson commented 6 years ago

@StevenACoffman Could it be the memory limits? You probably have significantly higher log volumes than I do. "No buffer space available" could indicate that.
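
For reference, a minimal sketch of where those limits live in the DaemonSet's container spec; the values below are illustrative placeholders, not numbers recommended in this thread:

resources:
  limits:
    memory: 100Mi      # raise this if pods are OOMKilled under higher log volume
  requests:
    cpu: 100m
    memory: 50Mi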

StevenACoffman commented 6 years ago

That was certainly one problem. I removed the limits, but am still seeing some failures, and the pods are never restarted successfully, as in #17

solsson commented 6 years ago

I get only the same kafka disconnects (~5 per hour per pod) as with fluent-bit-kafka-dev:0.5. I'm running the limits from the minikube example in QA, max 10Mi, and got one OOMKilled across 6 pods over 19h. With the old image I suspect memory use was increasing slowly over time, without any increase in log volumes, so I'm trying to spot that with tight limits.

solsson commented 6 years ago

@StevenACoffman What peaks do you see in memory use with your load? I'll create a PR that increases the limits, but not by too much, because I want to see whether buffers grow (https://github.com/fluent/fluent-bit-kubernetes-logging/pull/11#issuecomment-352087434) or there are memory leaks.

StevenACoffman commented 6 years ago

These are currently the highest metrics I have in one randomly selected cluster:

container_memory_cache{container_name="POD",id="/kubepods/burstable/pod2efd65ad-fd2f-11e7-a785-0a1463037f40/124f69c80bd7311e43a9c4a1aaa3f03f500b5700eb6aec2b7ec790f836d0a101",image="gcr.io/google_containers/pause-amd64:3.0",name="k8s_POD_fluent-bit-kwxbs_kangaroo_2efd65ad-fd2f-11e7-a785-0a1463037f40_0",namespace="kangaroo",pod_name="fluent-bit-kwxbs"} 28672
container_memory_rss{container_name="POD",id="/kubepods/burstable/pod2efd65ad-fd2f-11e7-a785-0a1463037f40/124f69c80bd7311e43a9c4a1aaa3f03f500b5700eb6aec2b7ec790f836d0a101",image="gcr.io/google_containers/pause-amd64:3.0",name="k8s_POD_fluent-bit-kwxbs_kangaroo_2efd65ad-fd2f-11e7-a785-0a1463037f40_0",namespace="kangaroo",pod_name="fluent-bit-kwxbs"} 36864
container_memory_usage_bytes{container_name="POD",id="/kubepods/burstable/pod2efd65ad-fd2f-11e7-a785-0a1463037f40/124f69c80bd7311e43a9c4a1aaa3f03f500b5700eb6aec2b7ec790f836d0a101",image="gcr.io/google_containers/pause-amd64:3.0",name="k8s_POD_fluent-bit-kwxbs_kangaroo_2efd65ad-fd2f-11e7-a785-0a1463037f40_0",namespace="kangaroo",pod_name="fluent-bit-kwxbs"} 573440
container_network_receive_bytes_total{container_name="POD",id="/kubepods/burstable/pod2efd65ad-fd2f-11e7-a785-0a1463037f40/124f69c80bd7311e43a9c4a1aaa3f03f500b5700eb6aec2b7ec790f836d0a101",image="gcr.io/google_containers/pause-amd64:3.0",interface="eth0",name="k8s_POD_fluent-bit-kwxbs_kangaroo_2efd65ad-fd2f-11e7-a785-0a1463037f40_0",namespace="kangaroo",pod_name="fluent-bit-kwxbs"} 5.69402814e+08

I've been focussing on getting things to run reliably (both kafka and fluent-bit) before fine-tuning memory limits. Most of my fluent-bit pods fell over pretty quickly in a busy cluster with the limits you have. Are you scraping the new prometheus endpoint in your test clusters? I would expect that to add a bit of overhead.

The fluent-bit memory usage documentation says:

So, if we impose a limit of 10MB for the input plugins and considering the worst-case scenario of the output plugin consuming 20MB extra, as a minimum we need (30MB x 1.2) = 36MB.

Given the Mem_Buf_Limit of 5MB, we'd need roughly 13 MB, not counting the prometheus endpoint. I'm seeing usage higher than that.
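
For anyone reproducing this, a sketch of the conventional pod-template annotations for scraping the built-in HTTP server on port 2020 (the port shown in the logs above); the metrics path is the one documented for Fluent Bit, and whether these annotations take effect depends on the Prometheus relabel configuration in use:

metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "2020"
    prometheus.io/path: "/api/v1/metrics/prometheus"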

edsiper commented 6 years ago

I think the problem is the lack of a "check" in out_kafka: when we deliver data to Kafka, we ingest the records into the Kafka thread, which has its own memory buffer, and at that point we ACK to the Fluent Bit engine. Instead we should wait for those records to be flushed before ACKing; otherwise Fluent Bit keeps ingesting records into out_kafka.

StevenACoffman commented 6 years ago

I looked in the cAdvisor UI for this container, which has been up for 3 days. I'm going to terminate it and look again.

StevenACoffman commented 6 years ago

After 30 minutes, it stayed happily at 416K. The other ones that are using a lot of memory are complaining in their logs:

[2018/01/22 19:10:50] [debug] [filter_kube] could not merge log as requested
[2018/01/22 19:10:50] [debug] [input tail.0] [mem buf] size = 5184238
[2018/01/22 19:10:50] [debug] [input] tail.0 paused (mem buf overlimit)

This is the cluster in which I've been breaking kafka in interesting ways, so perhaps it is related to Eduardo's comment above.
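
The "paused (mem buf overlimit)" message corresponds to the tail input hitting its Mem_Buf_Limit. A sketch of the relevant ConfigMap fragment, assuming the usual tail input for container logs (the file name and surrounding options are illustrative):

data:
  input-kubernetes.conf: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        Parser            docker
        Tag               kube.*
        # the input pauses when this buffer fills ("paused (mem buf overlimit)" above)
        Mem_Buf_Limit     5MB
        Skip_Long_Lines   On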

StevenACoffman commented 6 years ago

What is weird is that many of the fluent-bit pods that have been up for 3+ days in that same cluster stay around 0.5 MB (e.g., 516K, 492K, 416K). When they go over memory limits, they go way over. Maybe the nodes where fluent-bit stays low are not generating as many log files?

solsson commented 6 years ago

Are you scraping the new prometheus endpoint in your test clusters?

Yes we do. I agree this should be accounted for in the recommended limits.

So, if we impose a limit of 10MB for the input plugins and considering the worst-case scenario of the output plugin consuming 20MB extra, as a minimum we need (30MB x 1.2) = 36MB.

Given the Mem_Buf_Limit of 5MB, we'd need roughly 13 MB, not counting the prometheus endpoint. I'm seeing usage higher than that.

Then 20Mi could in theory be sufficient, with the prometheus exports included? Of course it's very annoying to have these pods restart, and it might affect testing negatively. A memory limit of twice the expected use isn't unreasonable, and over time metrics could tell whether it can safely be reduced.

edsiper commented 6 years ago

Hmm, I think the missing part about resource usage is how much memory the output plugins also require; out_kafka in particular may consume a lot due to buffering.

solsson commented 6 years ago

Hmm, I think the missing part about resource usage is how much memory the output plugins also require; out_kafka in particular may consume a lot due to buffering.

It's interesting that @StevenACoffman runs with an experimental kafka config; that could cause producer errors.

I guess anyone who wants to tweak the limits can make a calculation based on the average fluentbit_output_proc_bytes_total rate and how much kafka downtime they want to be able to handle before dropping log messages.
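
As a hedged sketch of that calculation, expressed as Prometheus recording rules (the rule names are made up): the per-pod output rate times the number of seconds of Kafka downtime to ride out gives a rough lower bound for the buffer, and therefore for the memory limit.

groups:
  - name: fluent-bit-sizing
    rules:
      # bytes per second each fluent-bit pod delivers to its outputs
      - record: fluentbit:output_bytes_per_second:rate5m
        expr: rate(fluentbit_output_proc_bytes_total[5m])
      # rough buffer needed to survive 10 minutes (600s) of Kafka being unreachable
      - record: fluentbit:buffer_bytes_needed:10m_outage
        expr: rate(fluentbit_output_proc_bytes_total[5m]) * 600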

I think the problem is the lack of a "check" in out_kafka: when we deliver data to Kafka, we ingest the records into the Kafka thread, which has its own memory buffer, and at that point we ACK to the Fluent Bit engine. Instead we should wait for those records to be flushed before ACKing; otherwise Fluent Bit keeps ingesting records into out_kafka.

@edsiper In other words, since hitting the memory limit means a pod restart, fluent-bit will not re-process from the sources the messages that had already gone into the buffer?

What's the effect with Retry_Limit false (or an actual value) here?
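
For context, Retry_Limit is a generic output option; a sketch of how it could sit in the kafka output section of the ConfigMap (file name illustrative, broker and topic values taken from the logs above):

data:
  output-kafka.conf: |
    [OUTPUT]
        Name           kafka
        Match          *
        Brokers        bootstrap.kafka:9092
        Topics         k8s-firehose
        # false = no limit on retries; an integer caps retries before a chunk is dropped
        Retry_Limit    false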

solsson commented 6 years ago

@edsiper I should set the request.required.acks=1 librdkafka property in the example daemonset's config. It's unlikely that anyone would want it higher for logs, and it's a useful example for those who want to change it to 0. How do I do that?
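
For what it's worth, a sketch of how that could look once out_kafka can forward librdkafka properties (assuming an rdkafka.-prefixed key, which did not exist at the time of this thread):

data:
  output-kafka.conf: |
    [OUTPUT]
        Name                            kafka
        Match                           *
        Brokers                         bootstrap.kafka:9092
        Topics                          k8s-firehose
        # assumed pass-through to librdkafka; 1 = wait for the leader's ack, 0 = fire and forget
        rdkafka.request.required.acks   1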

edsiper commented 6 years ago

@solsson I've created https://github.com/fluent/fluent-bit/issues/495 to track the enhancement.