fluent-plugins-nursery / fluent-plugin-concat

Fluentd filter plugin to concatenate multiline logs split across multiple events.
MIT License

I need a final recipe from [warn]: dump an error event: error_class=Fluent::Plugin::ConcatFilter::TimeoutError error="Timeout flush: kubernetes.var.log #83

Open · DmitriyProkhorov opened this issue 4 years ago

DmitriyProkhorov commented 4 years ago

#### Problem

I am setting up an additional fluentd filter that uses the concat plugin. After adding the new filter, I got a lot of errors. I can see that concat cannot process many of the messages, and I have begun to lose logs:

2019-11-25 17:25:58 +0000 [warn]: dump an error event: error_class=Fluent::Plugin::ConcatFilter::TimeoutError error="Timeout flush: kubernetes.var.log.containers.core-deployment-prod-8459fd75c7-x4vq2_core-prod_core-prod-279427d134fe033554565456345354564895667830d6.log:" location=nil tag="kubernetes.var.log.containers.core-deployment-prod-8459fd75c7-x4vq2_core-prod_core-prod-279427d134fe033554565456345354564895667830d6.log" time=2019-11-25 17:25:58.009520360 +0000 record={"log"=>"2019-11-25 17:25:47 [WRN] QuestionSalePointService: BatchCreateOrUpdateAsync: finish Memory usage:379.089324951172 <s:>\n", "stream"=>"stdout"}

I found workarounds on the net, but they do not help me:

- https://github.com/fluent-plugins-nursery/fluent-plugin-concat/issues/37
- https://github.com/fluent/fluentd/issues/2587
- https://github.com/fluent-plugins-nursery/fluent-plugin-concat/issues/4
- https://stackoverflow.com/questions/37159521/flush-timeouterror-in-fluentd

Steps to replicate

The part of the config that is responsible for this filter is:

<filter kubernetes.var.log.containers.core-deployment-**>
  @type concat
  key log
  stream_identity_key tag
  multiline_start_regexp /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<level>[^\]\\]+)\] (?<message>.*)/
  flush_interval 10s
</filter>
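As a sanity check, the multiline_start_regexp from the config above does match the application's log format. The following is a standalone Ruby sketch (not part of the plugin); the sample line is taken from the error record quoted earlier in this issue:

```ruby
# Standalone sanity check: does the filter's multiline_start_regexp
# match the first line of the application's log format?
START_RE = /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<level>[^\]\\]+)\] (?<message>.*)/

sample = "2019-11-25 17:25:47 [WRN] QuestionSalePointService: " \
         "BatchCreateOrUpdateAsync: finish Memory usage:379.089324951172 <s:>"

m = START_RE.match(sample)
puts m ? "start line matched, level=#{m[:level]}" : "no match"
# prints "start line matched, level=WRN"
```

Since the start pattern matches, the timeouts are not caused by the regexp failing on start lines, which points at how the end of each multiline block is detected (see the discussion below).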

Expected Behavior

After adding an additional filter to the original fluentd config (https://github.com/kubernetes/kubernetes/blob/master/cluster/addons/fluentd-elasticsearch/fluentd-es-configmap.yaml), I start to lose logs with the error shown above.

Your environment

K8S 1.13.5

activesupport (5.2.3) addressable (2.6.0) bigdecimal (1.2.8) concurrent-ruby (1.1.5) cool.io (1.5.4) did_you_mean (1.0.0) dig_rb (1.0.1) domain_name (0.5.20190701) elasticsearch (7.3.0) elasticsearch-api (7.3.0) elasticsearch-transport (7.3.0) excon (0.65.0) faraday (0.15.4) ffi (1.11.1) fluent-plugin-concat (2.4.0) fluent-plugin-detect-exceptions (0.0.12) fluent-plugin-elasticsearch (3.5.4) fluent-plugin-kubernetes_metadata_filter (2.2.0) fluent-plugin-multi-format-parser (1.0.0) fluent-plugin-prometheus (1.4.0) fluent-plugin-systemd (1.0.2) fluentd (1.6.3) http (0.9.8) http-cookie (1.0.3) http-form_data (1.0.3) http_parser.rb (0.6.0) i18n (1.6.0) io-console (0.4.5) json (1.8.3) kubeclient (1.1.4) lru_redux (1.1.0) mime-types (3.2.2) mime-types-data (3.2019.0331) minitest (5.11.3, 5.9.0) msgpack (1.3.0) multi_json (1.13.1) multipart-post (2.1.1) net-telnet (0.1.1) netrc (0.11.0) oj (3.8.1) power_assert (0.2.7) prometheus-client (0.9.0) psych (2.1.0) public_suffix (3.1.1) quantile (0.2.1) rake (10.5.0) rdoc (4.2.1) recursive-open-struct (1.0.0) rest-client (2.0.2) serverengine (2.1.1) sigdump (0.2.4) strptime (0.2.3) systemd-journal (1.3.3) test-unit (3.1.7) thread_safe (0.3.6) tzinfo (1.2.5) tzinfo-data (1.2019.2) unf (0.1.4) unf_ext (0.0.7.6) yajl-ruby (1.4.1)

letmepew commented 4 years ago

I am also stuck on the same issue. The multiline log parser is not working in K8S.

okkez commented 4 years ago

This is because you use only multiline_start_regexp. In that case, the plugin waits for the next line matching multiline_start_regexp before it can flush the buffered lines, so the last event in a stream is only flushed by the timeout. You can use multiline_end_regexp or continuous_line_regexp to handle multiline logs completely, provided you know the exact shape of your multiline logs. Otherwise, if you don't know how to describe them with multiline_end_regexp or continuous_line_regexp, you can use the timeout_label configuration to handle Fluent::Plugin::ConcatFilter::TimeoutError.
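Applied to the config from this issue, a timeout_label setup might look like the following sketch (based on the pattern in the plugin README; the label name @NORMAL and the stdout output are placeholders, not taken from this thread — in practice the label should forward to the same output as the main pipeline):

<filter kubernetes.var.log.containers.core-deployment-**>
  @type concat
  key log
  stream_identity_key tag
  multiline_start_regexp /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<level>[^\]\\]+)\] (?<message>.*)/
  flush_interval 10s
  timeout_label @NORMAL
</filter>

<label @NORMAL>
  <match kubernetes.var.log.containers.core-deployment-**>
    # Records flushed by the timeout are routed here and delivered,
    # instead of being dumped as error events.
    @type stdout
  </match>
</label>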

Prakashreddy134 commented 4 years ago

Hi @okkez, I am using multiline_end_regexp and I still see the error:

<filter tail.containers.var.log.containers.test.log>
  @type concat
  key log
  timeout_label @SPLUNK
  stream_identity_key stream
  multiline_start_regexp /^\d{4}-\d{2}-\d{2}/
  multiline_end_regexp /\n$/
  flush_interval 5s
  separator ""
  use_first_timestamp true
</filter>

How can we fix this issue? I'm using Splunk Connect for Kubernetes.
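One possible cause (an assumption, not confirmed in this thread): multiline_end_regexp /\n$/ only matches when the log field still ends with a newline. If the log collector strips the trailing newline before concat sees the record, the end pattern never matches and every stream falls back to the timeout. A quick standalone Ruby check of the pattern itself:

```ruby
# /\n$/ matches only when the string actually retains its trailing newline.
END_RE = /\n$/

with_newline    = "2020-01-01 some log line\n"
without_newline = "2020-01-01 some log line"

puts END_RE.match?(with_newline)     # prints "true"
puts END_RE.match?(without_newline)  # prints "false"
```

If that is what is happening, the fix would be an end pattern that matches the actual tail of the records as concat receives them, rather than assuming a trailing newline.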

halr9000 commented 4 years ago

@Prakashreddy134 if you haven't already, I suggest logging an issue over on https://github.com/splunk/splunk-connect-for-kubernetes