fluent-plugins-nursery / fluent-plugin-cloudwatch-logs

CloudWatch Logs Plugin for Fluentd

error_class=Yajl::EncodeError error="'Infinity' is an invalid number" #249

Open ajardan opened 1 year ago

ajardan commented 1 year ago

Problem

After running for a while, td-agent logs the error below. After several retries, it stops delivering logs.

The following error is logged:

2022-08-23 08:48:20 +0000 [warn]: #0 failed to flush the buffer. retry_times=0 next_retry_time=2022-08-23 08:48:21 +0000 chunk="5e6e4a0022a7ff23c5b06427ec61c131" error_class=Yajl::EncodeError error="'Infinity' is an invalid number"
  2022-08-23 08:48:20 +0000 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/yajl-ruby-1.4.1/lib/yajl.rb:80:in `encode'
  2022-08-23 08:48:20 +0000 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/yajl-ruby-1.4.1/lib/yajl.rb:80:in `encode'
  2022-08-23 08:48:20 +0000 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/yajl-ruby-1.4.1/lib/yajl.rb:23:in `dump'
  2022-08-23 08:48:20 +0000 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-cloudwatch-logs-0.14.3/lib/fluent/plugin/out_cloudwatch_logs.rb:108:in `block in configure'
  2022-08-23 08:48:20 +0000 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-cloudwatch-logs-0.14.3/lib/fluent/plugin/out_cloudwatch_logs.rb:299:in `block (2 levels) in write'
  2022-08-23 08:48:20 +0000 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-cloudwatch-logs-0.14.3/lib/fluent/plugin/out_cloudwatch_logs.rb:285:in `each'
  2022-08-23 08:48:20 +0000 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-cloudwatch-logs-0.14.3/lib/fluent/plugin/out_cloudwatch_logs.rb:285:in `block in write'
  2022-08-23 08:48:20 +0000 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-cloudwatch-logs-0.14.3/lib/fluent/plugin/out_cloudwatch_logs.rb:235:in `each'
  2022-08-23 08:48:20 +0000 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluent-plugin-cloudwatch-logs-0.14.3/lib/fluent/plugin/out_cloudwatch_logs.rb:235:in `write'
  2022-08-23 08:48:20 +0000 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.15.0/lib/fluent/plugin/output.rb:1180:in `try_flush'
  2022-08-23 08:48:20 +0000 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.15.0/lib/fluent/plugin/output.rb:1501:in `flush_thread_run'
  2022-08-23 08:48:20 +0000 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.15.0/lib/fluent/plugin/output.rb:501:in `block (2 levels) in start'
  2022-08-23 08:48:20 +0000 [warn]: #0 /opt/td-agent/embedded/lib/ruby/gems/2.4.0/gems/fluentd-1.15.0/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'

Steps to replicate

Configure an output that sends to CloudWatch Logs, then send an event whose log field contains invalid JSON with a value set to Infinity (JSON has no literal for Infinity).

Example event

{"log":"{\"details\":{\"instana.trace.id\":Infinity},\"full_message\":\"Failed to find access token\\n\",\"host\":\"d1653032c3d4\",\"level\":6,\"level_txt\":\"info\",\"logger_name\":\"org.springframework.security.oauth2.provider.token.store.JdbcTokenStore\",\"message\":\"Failed to find access token\",\"thread_name\":\"https-jsse-nio-8443-exec-321\",\"timestamp\":1661346253.396,\"version\":\"1.0\"}\n","stream":"stdout","time":"2022-08-24T13:04:13.396516342Z"}
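The failure can be sketched outside Fluentd with Ruby's standard library. This is only an approximation: it uses the stdlib `json` module as a stand-in for yajl-ruby (the error class and message differ), it assumes the inner log string was parsed by a lenient parser somewhere in the pipeline, and `sanitize_non_finite` is a hypothetical helper, not part of the plugin. It shows one possible workaround: replacing non-finite floats with strings before encoding.

```ruby
require 'json'

# Hypothetical helper (not part of fluent-plugin-cloudwatch-logs):
# recursively replace non-finite floats (Infinity, -Infinity, NaN)
# with their string form so the record can be encoded as JSON.
def sanitize_non_finite(value)
  case value
  when Hash
    value.each_with_object({}) { |(k, v), out| out[k] = sanitize_non_finite(v) }
  when Array
    value.map { |v| sanitize_non_finite(v) }
  when Float
    value.finite? ? value : value.to_s
  else
    value
  end
end

# Shortened version of the inner "log" payload from the example event.
inner = '{"details":{"instana.trace.id":Infinity},"level":6}'

# Assumption: a lenient parser accepted the Infinity literal.
# allow_nan: true makes the stdlib parser behave that way.
record = JSON.parse(inner, allow_nan: true)

begin
  # Strict encoding rejects the non-finite float, similar in spirit
  # to the Yajl::EncodeError in the stack trace above.
  JSON.generate(record)
rescue JSON::GeneratorError => e
  warn "encode failed: #{e.message}"
end

# After sanitizing, the record encodes without error.
puts JSON.generate(sanitize_non_finite(record))
```

Turning the value into a string keeps the information visible in CloudWatch Logs; replacing it with `nil` or dropping the key would be equally valid choices, depending on what downstream consumers expect.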

Expected Behavior or What you need to ask

Logs should continue to be delivered.

Using Fluentd and CloudWatchLogs plugin versions

addressable (2.7.0)
async (1.24.2)
async-http (0.50.8)
async-io (1.27.7)
async-pool (0.2.0)
aws-eventstream (1.1.0)
aws-partitions (1.621.0, 1.399.0)
aws-sdk-cloudwatchlogs (1.53.0)
aws-sdk-core (3.134.0, 3.109.3)
aws-sdk-kms (1.39.0)
aws-sdk-s3 (1.85.0)
aws-sdk-sqs (1.34.0)
aws-sigv4 (1.2.2)
bigdecimal (default: 1.3.2)
bundler (1.16.6)
concurrent-ruby (1.1.7)
console (1.8.2)
cool.io (1.6.0)
did_you_mean (1.1.0)
digest-crc (0.6.1)
elasticsearch (6.8.2)
elasticsearch-api (6.8.2)
elasticsearch-transport (6.8.2)
excon (0.78.0)
faraday (1.1.0)
ffi (1.13.1)
fileutils (1.4.1)
fluent-config-regexp-type (1.0.0)
fluent-diagtool (1.0.1)
fluent-logger (0.9.0)
fluent-plugin-cloudwatch-logs (0.14.3)
fluent-plugin-concat (2.5.0)
fluent-plugin-elasticsearch (4.2.2)
fluent-plugin-flowcounter-simple (0.1.0)
fluent-plugin-gelf-hs (1.0.8)
fluent-plugin-kafka (0.15.2)
fluent-plugin-prometheus (1.8.5)
fluent-plugin-prometheus_pushgateway (0.0.2)
fluent-plugin-record-modifier (2.1.0)
fluent-plugin-rewrite-tag-filter (2.3.0)
fluent-plugin-s3 (1.4.0)
fluent-plugin-sd-dns (0.1.0)
fluent-plugin-systemd (1.0.2)
fluent-plugin-td (1.1.0)
fluent-plugin-td-monitoring (0.2.4)
fluent-plugin-webhdfs (1.3.1)
fluentd (1.11.5)
gelf (3.1.0)
hirb (0.7.3)
http_parser.rb (0.6.0)
httpclient (2.8.2.4)
io-console (default: 0.4.6)
ipaddress (0.8.3)
jmespath (1.6.1, 1.4.0)
json (default: 2.0.4)
ltsv (0.1.2)
mini_portile2 (2.4.0)
minitest (5.10.1)
mixlib-cli (1.7.0)
mixlib-config (2.2.4)
mixlib-log (1.7.1)
mixlib-shellout (2.2.7)
msgpack (1.3.3)
multi_json (1.15.0)
multipart-post (2.1.1)
net-telnet (0.1.1)
nio4r (2.5.4)
nokogiri (1.10.10)
ohai (6.20.0)
oj (3.8.1)
openssl (default: 2.0.9)
parallel (1.19.2)
power_assert (0.4.1)
prometheus-client (0.9.0)
protocol-hpack (1.4.2)
protocol-http (0.15.1)
protocol-http1 (0.10.3)
protocol-http2 (0.11.6)
psych (default: 2.2.2)
public_suffix (4.0.6)
quantile (0.2.1)
rake (13.0.1, 12.0.0)
rdkafka (0.8.0)
rdoc (default: 5.0.1)
ruby-kafka (1.3.0)
ruby-progressbar (1.10.1)
ruby2_keywords (0.0.2)
rubyzip (1.3.0)
serverengine (2.2.2)
sigdump (0.2.4)
strptime (0.2.5)
systemd-journal (1.3.3)
systemu (2.5.2)
td (0.16.9)
td-client (1.0.7)
td-logger (0.3.27)
test-unit (3.2.3)
timers (4.3.2)
tzinfo (2.0.3)
tzinfo-data (1.2020.4)
webhdfs (0.9.0)
xmlrpc (0.2.1)
yajl-ruby (1.4.1)
zip-zip (0.3)

ajardan commented 1 year ago

I previously opened an issue at https://github.com/fluent/fluentd/issues/3870#issue-1347601296, but it seems to be more related to this plugin.