uken / fluent-plugin-elasticsearch


"message" element and Field Parsing #381

Closed. TamerDev closed this issue 6 years ago.

TamerDev commented 6 years ago

Hi, I'm using Elasticsearch 6.2.2 on Windows with Fluentd td-agent v3.1.1 and the Fluentd Elasticsearch plugin, version 1.0.2.

I have a .NET program that writes messages to the Fluentd td-agent using NLog (NLog to Fluentd to Elasticsearch). The td-agent writes the incoming forward messages to two outputs: a local file and Elasticsearch.

I have an issue with the message that gets written by the td-agent: it seems that the td-agent "injects" an element that breaks the JSON parsing once the message reaches Elasticsearch.

First, the message the .NET program writes via NLog. For debugging purposes, this is what it looks like before it reaches the td-agent: {"ID":2940,"Action":"RunSomething","Date":"2018-03-13T17:53:58"}

The td-agent reads the message, and here is what it writes to the text file output: 2018-03-13T17:53:58-04:00 my.logs {"message":"{\"ID\":2940,\"Action\":\"RunSomething\",\"Date\":\"2018-03-13T17:53:58\"}"}

As you can see, it added a timestamp and "my.logs" (the tag I'm using), and then it enclosed my original JSON in a "message" parent node that I do not need and did not ask for.

When checking the JSON that got inserted into Elasticsearch, I get this from Kibana:

{
  "_index": "simple",
  "_type": "entry",
  "_id": "qKlZIWIBtckwupyAvCEL",
  "_version": 1,
  "_score": 2.330756,
  "_source": {
    "message": "{\"ID\":5074,\"Action\":\"RunSomething\",\"Date\":\"2018-03-13T17:52:29\"}"
  },
  "highlight": {
    "message": [
      "{\"ID\":5074,\"Action\":\"@kibana-highlighted-field@RunSomething@/kibana-highlighted-field@\",\"Date\":\"2018-03-13T17:52:29\"}"
    ]
  }
}

So there are two options here that I need help with:

1. How can I parse and recognize the "Date" field as it shows now in Elasticsearch (while keeping the "message" parent as it shows above)?
2. Or, how do I eliminate the "message" parent node so that I have access to the main elements of my JSON (ID, Action, Date)?

I read about dynamic mapping and setting it to strict, but I do not want to disable dynamic mapping because new fields could show up in the future; I need to keep it dynamic.

Any thoughts?

Here is the full td-agent config:

Source

<source>
  @type forward  # the most efficient input over TCP; it cannot parse incoming messages
  port 24224
  bind 0.0.0.0   # accept from all addresses
</source>

Output

<match my.logs>
  @type copy
  <store>
    @type file
    path C:\opt\td-agent\etc\td-agent\manuallog.txt
  </store>
  <store>
    @type elasticsearch
    logstash_format false
    host localhost
    port 9200
    index_name simple
    type_name entry
    <buffer tag,time>
      flush_mode interval
      retry_type exponential_backoff
      flush_interval 1
      timekey 1s       # chunks per hours ("3600" also available)
      timekey_wait 1s
    </buffer>
  </store>
</match>

cosmo0920 commented 6 years ago

1. How can I parse and recognize the "Date" field as it shows now in Elasticsearch (while keeping the "message" parent as it shows above)?

Maybe the record-modifier plugin can parse it.
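
For reference, here is a minimal sketch of one way to handle this with Fluentd's built-in parser filter instead (an assumption on my part, not the record-modifier approach mentioned above), using td-agent 3 / Fluentd v1 syntax. Placed before the <match my.logs> block, it parses the JSON string held in "message" and, with reserve_data true, keeps the original "message" field while adding ID, Action, and Date as top-level fields that Elasticsearch can map:

<filter my.logs>
  @type parser
  key_name message     # parse the JSON string stored in the "message" field
  reserve_data true    # keep the original "message" field alongside the parsed fields
  <parse>
    @type json         # the payload is plain JSON
  </parse>
</filter>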

2. Or, how do I eliminate the "message" parent node so that I have access to the main elements of my JSON (ID, Action, Date)?

"message" node is hard-coded in NLog: https://github.com/fluent/NLog.Targets.Fluentd/blob/master/src/NLog.Targets.Fluentd/Fluentd.cs#L289

AFAIK, there is no plugin that can pull values out of a nested record.
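
That said, since "message" here holds a JSON string rather than a truly nested record, a variation of the sketch above may cover option 2 as well (again an assumption, based on the built-in parser filter rather than a dedicated plugin): setting remove_key_name_field true drops the "message" wrapper once its contents have been parsed into top-level fields:

<filter my.logs>
  @type parser
  key_name message
  remove_key_name_field true   # drop the "message" wrapper after a successful parse
  <parse>
    @type json
  </parse>
</filter>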

cosmo0920 commented 6 years ago

And this is not an ES plugin issue. If you want more concrete answers, please post your question to the Fluentd mailing list or the Fluentd community Slack.