Could you provide your Fluentd docker log?
<match *.**>
The above setting is very dangerous. This blackhole pattern causes a flood of declined logs: https://github.com/uken/fluent-plugin-elasticsearch#declined-logs-are-resubmitted-forever-why
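One common mitigation (a minimal sketch, not taken from the reporter's config) is to catch Fluentd's own fluent.* events in a separate match block placed before the catch-all, so warnings emitted by a failing Elasticsearch output are never fed back into that same output:

# Handle Fluentd's internal fluent.* events here so they never reach the
# Elasticsearch output below; @type null would silently discard them instead.
<match fluent.**>
  @type stdout
</match>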
Hi @cosmo0920, the Fluentd logs look like the following:
fluentd_1 | 2019-01-09 03:15:52 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/fluent.conf"
fluentd_1 | 2019-01-09 03:15:52 +0000 [info]: 'flush_interval' is configured at out side of <buffer>. 'flush_mode' is set to 'interval' to keep existing behaviour
fluentd_1 | 2019-01-09 03:15:52 +0000 [info]: Detected ES 6.x: ES 7.x will only accept `_doc` in type_name.
fluentd_1 | 2019-01-09 03:15:52 +0000 [warn]: To prevent events traffic jam, you should specify 2 or more 'flush_thread_count'.
fluentd_1 | 2019-01-09 03:15:52 +0000 [info]: using configuration file: <ROOT>
fluentd_1 | <source>
fluentd_1 | @type forward
fluentd_1 | port 24224
fluentd_1 | bind "0.0.0.0"
fluentd_1 | </source>
fluentd_1 | <match *.**>
fluentd_1 | @type copy
fluentd_1 | <store>
fluentd_1 | @type "elasticsearch"
fluentd_1 | host my-es-host
fluentd_1 | port 9200
fluentd_1 | logstash_format true
fluentd_1 | logstash_prefix "fluentd"
fluentd_1 | logstash_dateformat "%Y%m%d"
fluentd_1 | include_tag_key true
fluentd_1 | type_name "access_log"
fluentd_1 | tag_key "@log_name"
fluentd_1 | flush_interval 1s
fluentd_1 | <buffer>
fluentd_1 | flush_interval 1s
fluentd_1 | </buffer>
fluentd_1 | </store>
fluentd_1 | <store>
fluentd_1 | @type "stdout"
fluentd_1 | </store>
fluentd_1 | </match>
fluentd_1 | </ROOT>
fluentd_1 | 2019-01-09 03:15:52 +0000 [info]: starting fluentd-1.3.2 pid=5 ruby="2.5.2"
fluentd_1 | 2019-01-09 03:15:52 +0000 [info]: spawn command to main: cmdline=["/usr/bin/ruby", "-Eascii-8bit:ascii-8bit", "/usr/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "-p", "/fluentd/plugins", "--under-supervisor"]
fluentd_1 | 2019-01-09 03:15:53 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '3.0.1'
fluentd_1 | 2019-01-09 03:15:53 +0000 [info]: gem 'fluentd' version '1.3.2'
fluentd_1 | 2019-01-09 03:15:53 +0000 [info]: adding match pattern="*.**" type="copy"
fluentd_1 | 2019-01-09 03:15:53 +0000 [info]: #0 'flush_interval' is configured at out side of <buffer>. 'flush_mode' is set to 'interval' to keep existing behaviour
fluentd_1 | 2019-01-09 03:15:53 +0000 [info]: #0 Detected ES 6.x: ES 7.x will only accept `_doc` in type_name.
fluentd_1 | 2019-01-09 03:15:53 +0000 [warn]: #0 To prevent events traffic jam, you should specify 2 or more 'flush_thread_count'.
fluentd_1 | 2019-01-09 03:15:53 +0000 [info]: adding source type="forward"
fluentd_1 | 2019-01-09 03:15:53 +0000 [info]: #0 starting fluentd worker pid=13 ppid=5 worker=0
fluentd_1 | 2019-01-09 03:15:53 +0000 [info]: #0 listening port port=24224 bind="0.0.0.0"
fluentd_1 | 2019-01-09 03:15:53 +0000 [info]: #0 fluentd worker is now running worker=0
fluentd_1 | 2019-01-09 03:15:53.601732394 +0000 fluent.info: {"worker":0,"message":"fluentd worker is now running worker=0"}
....
Umm..., could you share the Fluentd error log from 2019-01-10 2:00 to 2019-01-10 11:00?
The shared log is just the boot log; it only says that Fluentd launched normally.
@cosmo0920 I found something like this:
fluentd_1 | 2019-01-10 02:16:45 +0000 [warn]: #0 failed to flush the buffer. retry_time=15 next_retry_seconds=2019-01-10 07:21:51 +0000 chunk="57f0d689aeefe7b1ef1da592fed4d444" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"my-es-host\", :port=>9200, :scheme=>\"http\"}): Connection refused - connect(2) for 172.18.0.2:9200 (Errno::ECONNREFUSED)"
fluentd_1 | 2019-01-10 02:16:45 +0000 [warn]: #0 suppressed same stacktrace
fluentd_1 | 2019-01-10 02:16:45.424613201 +0000 fluent.warn: {"retry_time":15,"next_retry_seconds":"2019-01-10 07:21:51 +0000","chunk":"57f0d689aeefe7b1ef1da592fed4d444","error":"#<Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure: could not push logs to Elasticsearch cluster ({:host=>\"my-es-host\", :port=>9200, :scheme=>\"http\"}): Connection refused - connect(2) for 172.18.0.2:9200 (Errno::ECONNREFUSED)>","message":"failed to flush the buffer. retry_time=15 next_retry_seconds=2019-01-10 07:21:51 +0000 chunk=\"57f0d689aeefe7b1ef1da592fed4d444\" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error=\"could not push logs to Elasticsearch cluster ({:host=>\\\"my-es-host\\\", :port=>9200, :scheme=>\\\"http\\\"}): Connection refused - connect(2) for 172.18.0.2:9200 (Errno::ECONNREFUSED)\""}
It seems that the ES plugin cannot push events due to ECONNREFUSED.
This error comes from the network stack.
Could you check your Docker networking settings or the ES-side logs?
@cosmo0920 My ES is set up on AWS EC2, and the networking should be fine, with no disconnects or DNS issues. I also found some extra logs just above the previous ones.
fluentd_1 | 2019-01-09 21:47:30 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-elasticsearch-3.0.1/lib/fluent/plugin/out_elasticsearch.rb:645:in `rescue in send_bulk'
fluentd_1 | 2019-01-09 21:47:30 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-elasticsearch-3.0.1/lib/fluent/plugin/out_elasticsearch.rb:627:in `send_bulk'
fluentd_1 | 2019-01-09 21:47:30 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-elasticsearch-3.0.1/lib/fluent/plugin/out_elasticsearch.rb:534:in `block in write'
fluentd_1 | 2019-01-09 21:47:30 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-elasticsearch-3.0.1/lib/fluent/plugin/out_elasticsearch.rb:533:in `each'
fluentd_1 | 2019-01-09 21:47:30 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluent-plugin-elasticsearch-3.0.1/lib/fluent/plugin/out_elasticsearch.rb:533:in `write'
fluentd_1 | 2019-01-09 21:47:30 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.3.2/lib/fluent/plugin/output.rb:1123:in `try_flush'
fluentd_1 | 2019-01-09 21:47:30 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.3.2/lib/fluent/plugin/output.rb:1423:in `flush_thread_run'
fluentd_1 | 2019-01-09 21:47:30 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.3.2/lib/fluent/plugin/output.rb:452:in `block (2 levels) in start'
fluentd_1 | 2019-01-09 21:47:30 +0000 [warn]: #0 /usr/lib/ruby/gems/2.5.0/gems/fluentd-1.3.2/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
@cosmo0920 Here are more logs from ES:
elasticsearch_1 | [2019-01-10T04:41:01,689][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[fluentd-20190109/JvyIBQfkQZGjNEXy0you4A]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T04:41:01,689][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[.kibana_1/1rFuKeKfRDel1FPUWShc4w]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T04:41:01,795][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[fluentd-20190109/JvyIBQfkQZGjNEXy0you4A]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T04:41:01,795][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[.kibana_1/1rFuKeKfRDel1FPUWShc4w]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T04:41:01,823][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[fluentd-20190109/JvyIBQfkQZGjNEXy0you4A]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T04:41:01,823][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[.kibana_1/1rFuKeKfRDel1FPUWShc4w]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T04:41:01,833][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[fluentd-20190109/JvyIBQfkQZGjNEXy0you4A]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T04:41:01,833][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[.kibana_1/1rFuKeKfRDel1FPUWShc4w]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T04:41:01,835][INFO ][o.e.c.r.a.AllocationService] [-utwWeF] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[fluentd-20190108][2]] ...]).
elasticsearch_1 | [2019-01-10T04:41:01,843][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[fluentd-20190109/JvyIBQfkQZGjNEXy0you4A]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T04:41:01,847][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[.kibana_1/1rFuKeKfRDel1FPUWShc4w]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T04:41:08,712][INFO ][o.e.c.m.MetaDataMappingService] [-utwWeF] [fluentd-20190110/j4oWJJa8Rla-l48sMgHLog] update_mapping [access_log]
elasticsearch_1 | [2019-01-10T04:41:08,724][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[fluentd-20190109/JvyIBQfkQZGjNEXy0you4A]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T04:41:08,724][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[.kibana_1/1rFuKeKfRDel1FPUWShc4w]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T06:18:09,832][INFO ][o.e.c.m.MetaDataMappingService] [-utwWeF] [fluentd-20190110/j4oWJJa8Rla-l48sMgHLog] update_mapping [access_log]
elasticsearch_1 | [2019-01-10T06:18:09,843][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[fluentd-20190109/JvyIBQfkQZGjNEXy0you4A]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T06:18:09,843][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[.kibana_1/1rFuKeKfRDel1FPUWShc4w]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T06:18:09,859][INFO ][o.e.c.m.MetaDataMappingService] [-utwWeF] [fluentd-20190110/j4oWJJa8Rla-l48sMgHLog] update_mapping [access_log]
elasticsearch_1 | [2019-01-10T06:18:09,867][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[fluentd-20190109/JvyIBQfkQZGjNEXy0you4A]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
elasticsearch_1 | [2019-01-10T06:18:09,868][WARN ][o.e.g.DanglingIndicesState] [-utwWeF] [[.kibana_1/1rFuKeKfRDel1FPUWShc4w]] can not be imported as a dangling index, as index with same name already exists in cluster metadata
Actually, I have two nodes/hosts with the same configuration that collect logs from my application server. Do you think that could be a concern for this issue?
If so, is there any way in the Fluentd configuration to distinguish which node the logs were collected from, e.g. the hostname or host IP as metadata?
Do you think that could be a concern for this issue?
You should check your Docker networking; a bare-metal environment might not hit this networking issue. Here is another case caused by Docker networking: https://github.com/uken/fluent-plugin-elasticsearch/issues/416
That issue also only occurred within Docker, not in a bare-metal environment.
If so, is there any way in the Fluentd configuration to distinguish which node the logs were collected from, e.g. the hostname or host IP as metadata?
in_forward has an option that adds the hostname:
https://docs.fluentd.org/v1.0/articles/in_forward#source_hostname_key
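A minimal sketch of that option applied to the forward source from this thread (the record key name source_host is just illustrative):

<source>
  @type forward
  port 24224
  bind 0.0.0.0
  # Adds the sending host's name to each record under the given key.
  source_hostname_key source_host
</source>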
@cosmo0920 Thank you for your advice. But I have to run Fluentd in Docker, and it looks like the issue is still there. The services in my Docker setup are always running well, so it's probably not a Docker networking issue.
I met a similar issue, but I have Fluentd deployed as a DaemonSet under the kube-system namespace.
And I can confirm ES is running well all the time, since Fluentd is only one of my logging sources, and the other sources work well and show their logs correctly in ES.
@emmayang Same issue on my kube platform.
Hmmm..., could you try the typhoeus backend instead of excon? typhoeus can handle keep-alive by default.
https://github.com/uken/fluent-plugin-elasticsearch#http_backend
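Switching the backend would look roughly like this (a sketch with only the relevant options shown; it assumes the typhoeus gem is installed in the Fluentd image):

<match *.**>
  @type elasticsearch
  host my-es-host
  port 9200
  # Use the typhoeus HTTP client instead of the default excon backend;
  # typhoeus keeps connections alive by default.
  http_backend typhoeus
</match>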
I'm also seeing this same issue when running fluentd with ES plugin in Kubernetes. I tried both backends and typhoeus didn't work at all, while the default backend would work on initial connection (fresh deploy) and then stop sending data almost immediately.
EDIT: I believe my issues were not caused by the ES plugin but by performance tuning that I needed to do on Fluentd.
I have similar problems. I also get a huge number of warnings like the one below:
"failed to flush the buffer. retry_time=0 next_retry_seconds=2019-03-19 01:30:36 +0000 chunk="584686c3d47849db61228ea7e6f29bb5" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"es-cn-v0h10rbfl000kfon8..com\", :port=>9200, :scheme=>\"http\", :user=>\"elastic\", :password=>\"obfuscated\"}): connect_write timeout reached""
When this error happens, the only fix is to restart the Fluentd container, but then a log gap appears.
Same problem here. I'm using fluentd-kubernetes-daemonset and have already opened an issue: https://github.com/fluent/fluentd-kubernetes-daemonset/issues/280
After deployment the plugin works fine and ships all logs to ES, but after a few hours it stops with the following error:
2019-03-19 08:24:32 +0000 : #0 [out_es] failed to flush the buffer. retry_time=2810 next_retry_seconds=2019-03-19 08:25:05 +0000 chunk="5846b2b0d6d06c398eee3540256d465d" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"elastic.xyz.com\", :port=>443, :scheme=>\"https\", :user=>\"elastic\", :password=>\"obfuscated\", :path=>\"\"}): connect_write timeout reached"
The only solution is to restart the pod, but that isn't acceptable.
Can setting reload_connections to false help with this issue?
I launched a docker-compose environment with the settings from https://github.com/fluent/fluentd/issues/2334#issue-422196534 but I couldn't reproduce the issue in my local environment.
To reproduce this issue, do we need to handle a massive number of events?
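For reference, a minimal sketch of that setting in the output block from this thread (all other options omitted):

<match *.**>
  @type elasticsearch
  host my-es-host
  port 9200
  # Keep using the configured host instead of periodically replacing it with
  # node addresses discovered from the cluster, which may be unreachable from Docker.
  reload_connections false
</match>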
Can setting reload_connections to false help with this issue? I launched a docker-compose environment with the fluent/fluentd#2334 (comment) settings but I couldn't reproduce the issue in my local environment. To reproduce this issue, do we need to handle a massive number of events?
@cosmo0920, I'm afraid so... In my case, once the hits reach 100,000+, the issue happens.
In Fluentd, here's the error info:
2019-03-20 02:07:53 +0000 [warn]: temporarily failed to flush the buffer. next_retry=2019-03-20 02:07:54 +0000 error_class="Elasticsearch::Transport::Transport::Error" error="Cannot get new connection from pool." plugin_id="object:3f880ef7f118"
2019-03-20 02:07:53 +0000 [warn]: /var/lib/gems/2.3.0/gems/elasticsearch-transport-1.0.18/lib/elasticsearch/transport/transport/base.rb:249:in `perform_request'
2019-03-20 02:07:53 +0000 [warn]: /var/lib/gems/2.3.0/gems/elasticsearch-transport-1.0.18/lib/elasticsearch/transport/transport/http/faraday.rb:20:in `perform_request'
2019-03-20 02:07:53 +0000 [warn]: /var/lib/gems/2.3.0/gems/elasticsearch-transport-1.0.18/lib/elasticsearch/transport/client.rb:128:in `perform_request'
2019-03-20 02:07:53 +0000 [warn]: /var/lib/gems/2.3.0/gems/elasticsearch-api-1.0.18/lib/elasticsearch/api/actions/bulk.rb:90:in `bulk'
2019-03-20 02:07:53 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluent-plugin-elasticsearch-1.9.2/lib/fluent/plugin/out_elasticsearch.rb:353:in `send_bulk'
2019-03-20 02:07:53 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluent-plugin-elasticsearch-1.9.2/lib/fluent/plugin/out_elasticsearch.rb:339:in `write_objects'
2019-03-20 02:07:53 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.43/lib/fluent/output.rb:490:in `write'
2019-03-20 02:07:53 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.43/lib/fluent/buffer.rb:354:in `write_chunk'
2019-03-20 02:07:53 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.43/lib/fluent/buffer.rb:333:in `pop'
2019-03-20 02:07:53 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.43/lib/fluent/output.rb:342:in `try_flush'
2019-03-20 02:07:53 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-0.12.43/lib/fluent/output.rb:149:in `run'
I'll try 'reconnect_on_error true' and give feedback.
Can setting reload_connections to false help with this issue? I launched a docker-compose environment with the fluent/fluentd#2334 (comment) settings but I couldn't reproduce the issue in my local environment. To reproduce this issue, do we need to handle a massive number of events?
Maybe this is the solution for me. After setting reload_connections to false, it has now been working for about 18h without trouble. I will monitor it for the next few hours/days.
@bidiudiu @ChSch3000 Thank you for your issue confirmations and clarifications!
fluentd-kubernetes-daemonset provides the following environment variable:
FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS (default: true)
This should be specified as:
FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS=false
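Inside the daemonset image that environment variable is interpolated into the plugin configuration, roughly like the following sketch (illustrative only; the surrounding options approximate the image's template):

<match **>
  @type elasticsearch
  host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
  port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
  # Defaults to true when the variable is unset; set the variable to "false" in the pod spec.
  reload_connections "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS'] || 'true'}"
</match>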
I've added an FAQ entry for this situation: https://github.com/uken/fluent-plugin-elasticsearch/pull/564
Is any information still missing to solve this issue?
Thanks @cosmo0920. I added the settings below and it works fine:
reconnect_on_error true
reload_on_failure true
reload_connections false
reconnect_on_error true reload_on_failure true reload_connections false
OK. Thanks for confirming, @bidiudiu! I'll add more description of this issue to the FAQ.
Can we change the default value of those settings for fluentd-kubernetes-daemonset? I think everyone who uses fluentd-kubernetes-daemonset will easily run into this issue.
@dogzzdogzz If you are using Helm to install, e.g. `helm upgrade --install logging-fluentd -f your-values.yml kiwigrid/fluentd-elasticsearch --namespace your-namespace`, you can just modify the Fluentd config in your-values.yml.
Part of my snippet looks like this:
output.conf: |
# Enriches records with Kubernetes metadata
<filter kubernetes.**>
@type kubernetes_metadata
</filter>
<match **>
@id elasticsearch
@type elasticsearch
@log_level info
include_tag_key true
type_name _doc
host "#{ENV['OUTPUT_HOST']}"
port "#{ENV['OUTPUT_PORT']}"
scheme "#{ENV['OUTPUT_SCHEME']}"
ssl_version "#{ENV['OUTPUT_SSL_VERSION']}"
logstash_format true
logstash_prefix "#{ENV['LOGSTASH_PREFIX']}"
reload_connections false
reconnect_on_error true
reload_on_failure true
slow_flush_log_threshold 25.0
<buffer>
@type file
path /var/log/fluentd-buffers/kubernetes.system.buffer
flush_mode interval
flush_interval 5s
flush_thread_count 4
chunk_full_threshold 0.9
# retry_forever
retry_type exponential_backoff
retry_timeout 1m
retry_max_interval 30
chunk_limit_size "#{ENV['OUTPUT_BUFFER_CHUNK_LIMIT']}"
queue_limit_length "#{ENV['OUTPUT_BUFFER_QUEUE_LIMIT']}"
overflow_action drop_oldest_chunk
</buffer>
</match>
@dogzzdogzz The latest fluentd-kubernetes-daemonset includes the above settings by default.
Tried using the exact same config as https://github.com/uken/fluent-plugin-elasticsearch/issues/525#issuecomment-490724317 but the issue still persists. Fluentd stops shipping logs to Elasticsearch after some time.
@cosmo0920 The same issue persists; Fluentd becomes unable to send logs after a while. From my observation, Fluentd runs absolutely fine as long as there is no restart; when the pod gets restarted, the problem occurs.
2020-08-05 09:58:12 +0000 [warn]: [sample-service] failed to flush the buffer. retry_time=2 next_retry_seconds=2020-08-05 09:58:14 +0000 chunk="5ac1e67bde2f323981d71058390e5ebe" error_class=Fluent::Plugin::ElasticsearchOutput::RecoverableRequestFailure error="could not push logs to Elasticsearch cluster ({:host=>\"192.168.0.15\", :port=>9500, :scheme=>\"http\", :user=>\"fluentd\", :password=>\"obfuscated\"}, {:host=>\"192.168.0.16\", :port=>9500, :scheme=>\"http\", :user=>\"fluentd\", :password=>\"obfuscated\"}): read timeout reached"
**Resolution:**
The only solution I found is to forcefully restart the Fluentd pod; the new container then sends logs immediately.
You should add the simple sniffer loading code and specify the loaded simple sniffer class: https://github.com/uken/fluent-plugin-elasticsearch#sniffer-class-name The default sniffer class causes this issue.
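A minimal sketch of that setup (the path to the sniffer file is illustrative and depends on where the gem is installed):

# Start Fluentd with the simple sniffer file preloaded, e.g.:
#   fluentd -r .../fluent-plugin-elasticsearch/lib/fluent/plugin/elasticsearch_simple_sniffer.rb -c fluent.conf
<match *.**>
  @type elasticsearch
  host my-es-host
  port 9200
  # Use the simple sniffer so the client keeps the configured endpoint
  # instead of re-resolving node addresses from the cluster state.
  sniffer_class_name Fluent::Plugin::ElasticsearchSimpleSniffer
  reload_connections false
</match>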
You should add the simple sniffer loading code and specify the loaded simple sniffer class: https://github.com/uken/fluent-plugin-elasticsearch#sniffer-class-name The default sniffer class causes this issue.
Did this work to solve the "failed to flush the buffer" error? If so, could you post the configuration? I have tried running Fluentd with the sniffer class, but I still get the same error.
Thanks,
Yes, me too. I've loaded the sniffer class and it's still giving me that error. I'm using version 4.0.5, and I get the error as soon as the Fluentd pods restart; there's no grace period where it succeeds at sending logs. Initially it was working though, and the scheme is set to https; I double-checked and it was actually sending successfully on restart.
Same issue here. Did anyone find a concrete solution? I tried these, but no luck:
reconnect_on_error true
reload_on_failure true
reload_connections false
Also the sniffer_class solution doesn't work for me at all and throws an error.
So I found the solution 4 days ago and I've been testing it ever since. After the change I made, my Fluentd hasn't stopped or crashed while sending logs to Elasticsearch.
My solution was to change the buffer path in the way I saw in the Fluentd documentation:
path /opt/bitnami/fluentd/logs/buffers/logs.*.buffer
instead of
path /opt/bitnami/fluentd/logs/buffers/logs.buffer
This worked for me.
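For context, the change is only in the file buffer path; a sketch of the surrounding buffer section (assuming a file buffer, other options omitted):

<buffer>
  @type file
  # The '*' placeholder marks where the file buffer inserts per-chunk identifiers,
  # making the chunk file layout explicit instead of relying on a single fixed path.
  path /opt/bitnami/fluentd/logs/buffers/logs.*.buffer
</buffer>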
So I found the solution 4 days ago and I've been testing it ever since. After the change I made, my Fluentd hasn't stopped or crashed while sending logs to Elasticsearch.
My solution was to change the buffer path in the way I saw in the Fluentd documentation:
path /opt/bitnami/fluentd/logs/buffers/logs.*.buffer
instead of
path /opt/bitnami/fluentd/logs/buffers/logs.buffer
This worked for me.
@mokhos, could you please let us know the versions of fluentd / fluent-plugin-elasticsearch you were using to test this configuration?
So I found the solution 4 days ago and I've been testing it ever since. After the change I made, my Fluentd hasn't stopped or crashed while sending logs to Elasticsearch. My solution was to change the buffer path in the way I saw in the Fluentd documentation:
path /opt/bitnami/fluentd/logs/buffers/logs.*.buffer
instead of
path /opt/bitnami/fluentd/logs/buffers/logs.buffer
This worked for me.

@mokhos, could you please let us know the versions of fluentd / fluent-plugin-elasticsearch you were using to test this configuration?
I used the versions below:
2022-03-30 11:56:59 +0000 [info]: gem 'fluentd' version '1.14.5'
2022-03-30 11:56:59 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '5.1.5'
Hi @cosmo0920, I am also facing the same issue. It would be helpful if you could share your solution with me.
If I restart my td-agent service, logs arrive in Elasticsearch for a while, but after 3-6 minutes they stop automatically, and no error shows up in the td-agent logs.
Here is my configuration:
<match "mytopicname">
@type elasticsearch
hosts "my_IP_address_here"
ca_file "my_path_here"
client_cert "my_path_here"
client_key " my_path_here"
ssl_verify true
user "my_username"
password "my_password"
logstash_format true
logstash_prefix "my_index_name"
logstash_date_format my_date_format
time_key_format "my time format"
type_name fluentd
log_es_400_reason true
include_timestamp true
reconnect_on_error true
reload_on_failure true
reload_connections false
<buffer>
@type file
path "my path here"
chunk_limit_size 10m
</buffer>
</match>
I also tried:
<match "mytopicname">
@type elasticsearch
hosts "my_IP_address_here"
ca_file "my_path_here"
client_cert "my_path_here"
client_key " my_path_here"
ssl_verify true
user "my_username"
password "my_password"
logstash_format true
logstash_prefix "my_index_name"
logstash_date_format my_date_format
time_key_format "my time format"
type_name fluentd
log_es_400_reason true
include_timestamp true
reconnect_on_error true
reload_on_failure true
reload_connections false
slow_flush_log_threshold 25.0
<buffer>
@type file
path "syslog.*.buffer"
chunk_limit_size 50m
flush_mode interval
flush_interval 5s
flush_thread_count 4
overflow_action drop_oldest_chunk
retry_timeout 1m
retry_max_interval 30
chunk_full_threshold 0.9
</buffer>
</match>
Please help!
Note: The above configuration is not copy-pasted, so ignore the indentation.
Thanks @cosmo0920. I added the settings below and it works fine:
reconnect_on_error true reload_on_failure true reload_connections false
It works for me
Problem
I used Fluentd with your plugin to collect logs from Docker containers and send them to ES. It works at the very beginning, but later ES becomes unable to receive the logs from Fluentd, even though ES itself is always running fine. I also find there is no index for the new day (e.g. fluentd-20190110; only the old index fluentd-20190109 exists) in ES. However, if I restart my Docker containers with Fluentd, it starts sending logs to ES again.
...
Steps to replicate
The fluentd config
Expected Behavior or What you need to ask
Fluentd should keep sending logs to ES.
Using Fluentd and ES plugin versions
fluentd --version or td-agent --version
v1.3.2-1.0
fluent-gem list, td-agent-gem list or your Gemfile.lock