uken / fluent-plugin-elasticsearch


A question about consolidated timestamps from different sources #980

Open bozho opened 2 years ago

bozho commented 2 years ago


Problem

Hi,

We are running a docker swarm and I'm working on implementing log shipping to ES+Kibana. Fluentd also runs as a service in the swarm. Individual docker services will be redirecting their logs to fluentd, each with its own fluentd tag.

I will probably end up using logstash_format, but as an exercise, I'm trying to consolidate all timestamps from different logs into a single timestamp field to be used by Kibana as a time field on an index pattern.

I'm currently focusing on nginx logs. I've configured access logs to be in JSON format:

 log_format main escape=json '{'
    '"remote_addr": "$remote_addr",'
    '"remote_user": "$remote_user",'
    '"timestamp": "$time_iso8601",'
    '"request": "$request",'
    '"request_time": $request_time,'
    '"status": $status,'
    '"body_bytes_sent": $body_bytes_sent,'
    '"http_referer": "$http_referer",'
    '"http_user_agent": "$http_user_agent",'
    '"ssl_protocol": "$ssl_protocol",'
    '"ssl_cipher": "$ssl_cipher"'
  '}';

The nginx error log format is not configurable, so we need to parse those entries with a regexp.

Steps to replicate

This is our current configuration:

<system>
  log_level debug
</system>

<source>
  @type forward
  port  24224
  bind  0.0.0.0
</source>

<match nginx>
  @type copy
  <store>
    @type  relabel
    @label @nginx
  </store>
</match>

<match **>
  @type stdout
</match>

<label @nginx>
  # Split nginx access and error log entries.
  <match nginx>
    @type rewrite_tag_filter
    <rule>
      key     log
      pattern /^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}.*$/i
      tag     ${tag}.error
    </rule>
    <rule>
      key     log
      pattern .*
      tag     ${tag}.access
    </rule>
  </match>

  <filter nginx.access>
    @type        parser
    key_name     log
    reserve_data true
    <parse>
      @type    json
    </parse>
  </filter>

  <filter nginx.error>
    @type        parser
    key_name     log
    reserve_data true
    <parse>
      @type       regexp
      expression  /^(?<logtime>\d{4}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2}) \[(?<log_level>[^\s]+)\] (?<message>.*)$/
      time_key    logtime
      time_type   string
      time_format "%Y/%m/%d %H:%M:%S"
      keep_time_key true
    </parse>
  </filter>

  <match nginx.access>
    @type copy
    <store>
      @type  relabel
      @label @output
    </store>
  </match>

  <match nginx.error>
    @type copy
    <store>
      @type  relabel
      @label @output
    </store>
    <store>
      @type  relabel
      @label @output_stdout
    </store>
  </match>
</label>

<label @output>
  <match **>
    @type              elasticsearch
    host               <host>
    port               <port>
    user               <user>
    password           <password>
    index_name         fluentd.%Y%m%d
    logstash_format    false
    suppress_type_name true
    include_timestamp  false
    include_tag_key    true

    <buffer tag, time>
      @type        memory
      timekey      1m
      timekey_wait 5s
      flush_interval 5s
    </buffer>
  </match>
</label>

<label @output_stdout>
  <match **>
    @type stdout
  </match>
</label>

Expected Behavior or What you need to ask

The above configuration parses the log timestamps in both logs fine, and they are recognised as date types. The nginx $time_iso8601 variable is formatted like this: 2022-06-14T07:30:53+00:00. The error log timestamp is formatted like this: 2022/06/14 07:41:19.

Then I realised I end up with two different log time fields: timestamp (parsed from nginx access log entries) and logtime (parsed from nginx error log entries). Kibana requires a single timestamp field to be selected for time-based index patterns.

Trivial, I thought: I'll just rename the regexp group logtime to timestamp when parsing the nginx error log. The problem is that Elasticsearch then rejects nginx error log entries with:

dump an error event: error_class=Fluent::Plugin::ElasticsearchErrorHandler::ElasticsearchError error="400 - Rejected by Elasticsearch [error type]: mapper_parsing_exception [reason]: 'failed to parse [timestamp]'" location=nil tag="nginx.error" time=2022-06-14 07:41:20.000000000 +0000 record={....}

I can see the timestamp field in the offending record, formatted as 2022/06/14 07:41:20. If I use any other field name (like logtime), the parsed data is the same, ES accepts the entries, and the field shows up as a date field in Kibana.

Is timestamp a reserved/special field name in ES? nginx.access entries with a timestamp field are accepted by ES, with the format 2022-06-14T07:30:53+00:00.
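
For reference, the way I plan to check what ES has actually inferred for the field is the field-mapping API. This is just a sketch: the index name follows our index_name pattern fluentd.%Y%m%d, and host/port/credentials are placeholders:

# Ask ES what mapping (and date format, if any) it created for "timestamp"
# in one of the daily indices.
curl -s -u <user>:<password> \
  "http://<host>:<port>/fluentd.20220614/_mapping/field/timestamp?pretty"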

Is the timestamp issue even a plugin issue, or a fluentd issue?
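
In case it's relevant: the fluentd-side workaround I'm considering is rewriting the error-log time into the same ISO 8601 shape as the access log before it reaches ES, with a record_transformer filter placed after the existing nginx.error parser filter. This is an untested sketch and assumes the error-log times are UTC (matching the +00:00 offset in the access log):

<filter nginx.error>
  @type record_transformer
  enable_ruby true
  remove_keys logtime
  <record>
    # Rewrite "2022/06/14 07:41:19" into "2022-06-14T07:41:19+00:00" so both
    # log types end up with one consistently formatted "timestamp" field.
    timestamp ${record["logtime"].sub(" ", "T").gsub("/", "-") + "+00:00"}
  </record>
</filter>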

Thank you!

Using Fluentd and ES plugin versions

Our image is based on the official fluent/fluentd:v1.14-1 Docker image (we replace the ES libs with v7.7 and install this plugin).

$ fluentd --version
fluentd 1.14.6
$ fluent-gem list
*** LOCAL GEMS ***

async (1.30.1)
async-http (0.54.0)
async-io (1.33.0)
async-pool (0.3.9)
benchmark (default: 0.1.0)
bigdecimal (1.4.4)
bundler (2.3.11)
cgi (default: 0.1.0.1)
concurrent-ruby (1.1.10)
console (1.15.0)
cool.io (1.7.1)
csv (default: 3.1.2)
date (default: 3.0.3)
dbm (default: 1.1.0)
delegate (default: 0.1.0)
did_you_mean (default: 1.4.0)
elasticsearch (7.7.0)
elasticsearch-api (7.7.0)
elasticsearch-transport (7.7.0)
etc (default: 1.1.0)
excon (0.92.3)
ext_monitor (0.1.2)
faraday (1.10.0)
faraday-em_http (1.0.0)
faraday-em_synchrony (1.0.0)
faraday-excon (1.1.0)
faraday-httpclient (1.0.1)
faraday-multipart (1.0.4)
faraday-net_http (1.0.1)
faraday-net_http_persistent (1.2.0)
faraday-patron (1.0.0)
faraday-rack (1.0.0)
faraday-retry (1.0.3)
fcntl (default: 1.0.0)
fiber-local (1.0.0)
fileutils (default: 1.4.1)
fluent-config-regexp-type (1.0.0)
fluent-plugin-elasticsearch (5.2.2)
fluent-plugin-rewrite-tag-filter (2.4.0)
fluentd (1.14.6)
forwardable (default: 1.3.1)
getoptlong (default: 0.1.0)
http_parser.rb (0.8.0)
ipaddr (default: 1.2.2)
irb (default: 1.2.6)
json (2.4.1)
logger (default: 1.4.2)
matrix (default: 0.2.0)
msgpack (1.5.1)
multi_json (1.15.0)
multipart-post (2.2.0)
mutex_m (default: 0.1.0)
net-pop (default: 0.1.0)
net-smtp (default: 0.1.0)
nio4r (2.5.8)
observer (default: 0.1.0)
oj (3.10.18)
open3 (default: 0.1.0)
openssl (default: 2.1.3)
ostruct (default: 0.2.0)
prime (default: 0.1.1)
protocol-hpack (1.4.2)
protocol-http (0.21.0)
protocol-http1 (0.13.2)
protocol-http2 (0.14.2)
pstore (default: 0.1.0)
psych (default: 3.1.0)
racc (default: 1.4.16)
readline (default: 0.0.2)
readline-ext (default: 0.1.0)
reline (default: 0.1.5)
rexml (default: 3.2.3.1)
rss (default: 0.2.8)
ruby2_keywords (0.0.5)
serverengine (2.2.5)
sigdump (0.2.4)
singleton (default: 0.1.0)
stringio (default: 0.1.0)
strptime (0.2.5)
strscan (default: 1.0.3)
timeout (default: 0.1.0)
timers (4.3.3)
tracer (default: 0.1.0)
tzinfo (2.0.4)
tzinfo-data (1.2022.1)
uri (default: 0.10.0)
webrick (default: 1.6.1)
yajl-ruby (1.4.2)
yaml (default: 0.1.0)
zlib (default: 1.1.0)

ES version: 5.4.1

bozho commented 2 years ago

Huh, this appears to be an Elasticsearch issue. When I change both timestamp field names to logtime (i.e. for both nginx access and error logs), I get the same ES parsing error for the logtime field.

It's as if ES automatically picks up the date string format from the initial entries (nginx access log entries are much more frequent) and then uses that format to parse all subsequent date strings. Once an nginx error log entry arrives with a differently formatted date/time in its logtime field, ES complains.
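
If that's what's happening, the workaround I'm looking at is declaring the field mapping up front with both date formats, instead of relying on dynamic date detection. A rough sketch using a legacy index template, assuming ES 7.x (the client libs in our image are 7.7) and our fluentd.* index name pattern; host/credentials are placeholders:

# Map "timestamp" as a date that accepts both the access-log ISO 8601 format
# and the error-log "yyyy/MM/dd HH:mm:ss" format.
curl -s -u <user>:<password> -X PUT \
  "http://<host>:<port>/_template/fluentd-timestamp?pretty" \
  -H 'Content-Type: application/json' -d '
{
  "index_patterns": ["fluentd.*"],
  "mappings": {
    "properties": {
      "timestamp": {
        "type": "date",
        "format": "strict_date_optional_time||yyyy/MM/dd HH:mm:ss"
      }
    }
  }
}'

This would only affect daily indices created after the template is in place; existing indices keep whatever mapping they already have.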