fluent / fluentd

Fluentd: Unified Logging Layer (project under CNCF)
https://www.fluentd.org
Apache License 2.0

Fluentd process stops consuming messages from Kafka #1470

Closed sharadgaur closed 7 years ago

sharadgaur commented 7 years ago

The Fluentd process stops consuming messages from Kafka after 15-30 minutes of runtime. I also noticed that when I try to stop td-agent gracefully, I get a timeout error.

~>service td-agent stop
Stopping td-agent: Timeout error occurred trying to stop td[FAILED].td-agent

The status check still shows td-agent as running:
~>service td-agent status
td-agent is running [ OK ]

############
I am using td-agent version 0.14.11
~>td-agent --version
td-agent 0.14.11
############

######################## Here is the Environment information:

~> td-agent-gem env
RubyGems Environment:

######################## Here is the list of gems :

~> td-agent-gem list --local

LOCAL GEMS

addressable (2.5.0)
aws-sdk (2.6.46)
aws-sdk-core (2.6.46)
aws-sdk-resources (2.6.46)
aws-sigv4 (1.0.0)
bigdecimal (default: 1.3.0)
bundler (1.13.3)
bzip2-ffi (1.0.0)
cool.io (1.4.5)
did_you_mean (1.1.0)
elasticsearch (5.0.0, 1.0.18)
elasticsearch-api (5.0.0, 1.0.18)
elasticsearch-transport (5.0.0, 1.0.18)
excon (0.54.0)
faraday (0.11.0)
ffi (1.9.17)
fluent-logger (0.6.2)
fluent-mixin-config-placeholders (0.4.0)
fluent-mixin-plaintextformatter (0.2.6)
fluent-mixin-rewrite-tag-name (0.1.0)
fluent-plugin-dstat (0.3.3)
fluent-plugin-elasticsearch (1.9.2)
fluent-plugin-forest (0.3.3)
fluent-plugin-jvmwatcher (0.1.5)
fluent-plugin-kafka (0.5.3)
fluent-plugin-rewrite-tag-filter (1.5.5)
fluent-plugin-s3 (1.0.0.rc1)
fluent-plugin-td (0.10.29)
fluent-plugin-td-monitoring (0.2.2)
fluent-plugin-top (0.1.1)
fluent-plugin-webhdfs (1.1.0, 0.4.2)
fluentd (0.14.11)
hirb (0.7.3)
http_parser.rb (0.6.0)
httpclient (2.8.2.4)
io-console (default: 0.4.6)
ipaddress (0.8.3)
jmespath (1.3.1)
json (default: 2.0.2)
ltsv (0.1.0)
mini_portile2 (2.1.0)
minitest (5.10.1)
mixlib-cli (1.7.0)
mixlib-config (2.2.4)
mixlib-log (1.7.1)
mixlib-shellout (2.2.7)
msgpack (1.0.2)
multi_json (1.12.1)
multipart-post (2.0.0)
net-telnet (0.1.1)
nokogiri (1.7.0.1)
ohai (6.20.0)
oj (2.18.0)
openssl (default: 2.0.2)
parallel (1.8.0)
power_assert (0.4.1)
psych (default: 2.2.2)
public_suffix (2.0.5)
rake (12.0.0)
rdoc (default: 5.0.0)
ruby-kafka (0.3.16)
ruby-progressbar (1.8.1)
rubyzip (1.1.7)
serverengine (2.0.4)
sigdump (0.2.4)
strptime (0.1.9)
systemu (2.5.2)
td (0.15.2)
td-client (0.8.85)
td-logger (0.3.26)
test-unit (3.2.3)
thread_safe (0.3.5)
tzinfo (1.2.2)
tzinfo-data (1.2016.10)
uuidtools (2.1.5)
webhdfs (0.8.0)
xmlrpc (0.2.1)
yajl-ruby (1.3.0)
zip-zip (0.3)
############

############
OS info:
LSB Version: :base-4.0-amd64:base-4.0-noarch:core-4.0-amd64:core-4.0-noarch
Distributor ID: OracleServer
Description: Oracle Linux Server release 6.8
Release: 6.8
Codename: n/a
############

############ Here is the td-agent.conf

@type kafka_group
consumer_group fluentd_v1
brokers somehost:9092,somehost2:9092
topics topic_name_cl
format json
start_from_beginning false
offset_commit_interval 5s

@type rewrite_tag_filter
@log_level debug
rewriterule1 info (.*) hdfs.$1
rewriterule2 Info (.*) hdfs.$1

<match hdfs.>
@type webhdfs
namenode somehost3:50070
standby_namenode somehost5:50070
path /data/${tag[1]}/%{uuid}.json.index
username ocingestion
retry_known_errors yes
retry_interval 60
<buffer time,tag>
@type file
path /fluentd/data.
.log
flush_interval 5s
timekey 5m
timekey_wait 10m

@type json

<match **>
@type stdout

############
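For anyone reading the config, here is a rough Python sketch of how the two rewriterule lines and the `${tag[1]}` placeholder in the webhdfs path fit together. It assumes the 1.x rule syntax of fluent-plugin-rewrite-tag-filter (`rewriterule<N> <key> <regex> <new_tag>`) and that `${tag[1]}` expands to the second dot-separated part of the tag. The function names and the sample record are hypothetical, and this is an illustration only, not the plugin's code.

```python
import re

# Rules mirrored from the rewrite_tag_filter settings in the config:
# (record key, regex, new tag template).
RULES = [
    ("info", r"(.*)", "hdfs.$1"),
    ("Info", r"(.*)", "hdfs.$1"),
]


def rewrite_tag(record, rules=RULES):
    """Return the rewritten tag for a record, or None if no rule matches."""
    for key, pattern, template in rules:
        value = record.get(key)
        if value is None:
            continue
        match = re.search(pattern, str(value))
        if match:
            # Substitute $1, $2, ... with the corresponding capture groups.
            return re.sub(
                r"\$(\d+)",
                lambda m: match.group(int(m.group(1))),
                template,
            )
    return None


def webhdfs_dir(tag):
    """Expand the /data/${tag[1]} prefix of the webhdfs path for a tag."""
    parts = tag.split(".")
    return "/data/" + parts[1]


if __name__ == "__main__":
    # Hypothetical record: the "info" field carries the value used in the tag.
    tag = rewrite_tag({"info": "orders", "payload": 1})
    print(tag)               # hdfs.orders
    print(webhdfs_dir(tag))  # /data/orders
```

Under these assumptions, a record with neither an `info` nor an `Info` key gets no new tag from this filter, so it would never reach the `hdfs.` match block.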

I am producing 60K+ JSON messages every 30 seconds to Kafka.
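To put that rate in context, here is a small back-of-the-envelope calculation in Python. It takes "60K+" as a lower bound of 60,000 messages and uses the 5s values of flush_interval and offset_commit_interval from the config; the variable and function names are made up for illustration, and the result is only a rough estimate that assumes the consumer keeps up with the producer.

```python
MESSAGES_PER_WINDOW = 60_000        # "60K+" from the report, treated as a lower bound
WINDOW_SECONDS = 30                 # reporting window from the report
FLUSH_INTERVAL_SECONDS = 5          # flush_interval 5s in the buffer settings
OFFSET_COMMIT_INTERVAL_SECONDS = 5  # offset_commit_interval 5s in the Kafka input


def rate_per_second(messages, seconds):
    """Average number of messages per second over the window."""
    return messages / seconds


rate = rate_per_second(MESSAGES_PER_WINDOW, WINDOW_SECONDS)
per_flush = rate * FLUSH_INTERVAL_SECONDS
per_commit = rate * OFFSET_COMMIT_INTERVAL_SECONDS

print(rate)        # 2000.0 messages per second
print(per_flush)   # 10000.0 messages arriving per flush interval
print(per_commit)  # 10000.0 messages consumed per offset commit interval
```

So the setup has to sustain at least about 2,000 messages per second end to end, or roughly 10,000 messages between consecutive flushes.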

Here are the logs: td-agent.log.gz

sharadgaur commented 7 years ago

@repeatedly

repeatedly commented 7 years ago

This seems to be a plugin issue. Please file an issue on the plugin repository.

sharadgaur commented 7 years ago

https://github.com/fluent/fluent-plugin-kafka/issues/113

Thank you