SumoLogic / fluentd-output-sumologic

Fluentd output plugin to deliver logs or metrics to Sumo Logic.
https://rubygems.org/gems/fluent-plugin-sumologic_output
Apache License 2.0

_sumo_metadata field not stripped off before posting to SumoLogic #33

Closed CH-ShivikaJindal closed 6 years ago

CH-ShivikaJindal commented 6 years ago

Info

  1. plugin version: fluent-plugin-sumologic_output (1.3.1)
  2. td-agent version: td-agent 1.2.2

Problem Summary: `_sumo_metadata` is not stripped from the log at all, so the fluentd-output-sumologic plugin does not work as expected.

Background: Our setup sends logs from a syslog-ng server to a td-agent server, and we use a hosted collector to send the logs to Sumo Logic. The log messages look exactly the same on both ends: each contains the `_sumo_metadata` key and its corresponding values.

Expectation

/test should appear in _sourceCategory.

**Configuration**

```
@type tcp
tag tcp.events
@type json
port 24225
bind 127.0.0.1
@type http
port 8888
bind 127.0.0.1
@type copy
@type file
path
timekey 1s
timekey_use_utc true
timekey_wait 1s
@type sumologic
endpoint https://collectors.us2.sumologic.com/receiver/v1/http/***
log_format text
open_timeout 10
flush_interval 1s
```

**Excerpt from Syslog-ng server log**

```
{"journal":{_HOSTNAME":"ip -172-31-22-252","_GID":"0","_EXE":"/usr/bin/dockerd","_COMM":"dockerd","_CMDLINE":"/usr/bin/dockerd --raw-logs","TEST":"false","SYSLOG_IDENTIFIER":"confident_leavitt/test","PRIORITY": "6","MESSAGE":"Hello World!","LOCATION":"west","CONTAINER_TAG":"confident_leavitt/test","CONTAINER_NAME":"confident_leavitt"},"_sumo_metadata":{"source":"journal","host":"ip-172-31-22-252","category":"confident_leavitt/test"},"TAGS":".source.s_src","SOURCEIP":"127.0.0.1","SOURCE":"s_src","PROGRAM":"confident_leavitt/test","PRIORITY":"in fo","PID":"28XXX","MESSAGE":"Hello World!","HOST_FROM":"ip-172-31-22-252","HOST":"ip-172-31-22-252","DATE":"Sep 21 00:52:51"}
```

**Excerpt from td-agent server log**

```
Sep 21 00:52:51 172.31.22.252 confident_leavitt/test[28XXX]: {"journal":{_HOSTNAME":"ip -172-31-22-252","_GID":"0","_EXE":"/usr/bin/dockerd","_COMM":"dockerd","_CMDLINE":"/usr/bin/dockerd --raw-logs","TEST":"false","SYSLOG_IDENTIFIER":"confident_leavitt/test","PRIORITY": "6","MESSAGE":"Hello World!","LOCATION":"west","CONTAINER_TAG":"confident_leavitt/test","CONTAINER_NAME":"confident_leavitt"},"_sumo_metadata":{"source":"journal","host":"ip-172-31-22-252","category":"confident_leavitt/test"},"TAGS":".source.s_src","SOURCEIP":"127.0.0.1","SOURCE":"s_src","PROGRAM":"confident_leavitt/test","PRIORITY":"in fo","PID":"28XXX","MESSAGE":"Hello World!","HOST_FROM":"ip-172-31-22-252","HOST":"ip-172-31-22-252","DATE":"Sep 21 00:52:51"}
```

**Steps to reproduce:**

1. `docker run --log-driver=journald --log-opt tag="{{.Name}}/test" --log-opt labels=location --log-opt env=TEST --env "TEST=false" --label location=west ubuntu echo "Hello World!"`

**More information:**

Sumo Logic receives the log in the same format as shown above, yet `_sourceCategory` only lists Http Input. I am using `_collector=` in the Sumo search field.

frankreno commented 6 years ago

@CH-ShivikaJindal: In order for this feature to work, the log message must be in JSON format.

> Our setup involves sending logs from syslog-ng server to td-agent server.

The syslog-ng server log is in JSON format (although the JSON looks malformed, since `_HOSTNAME` is missing its leading quote). The td-agent server log is not: it begins with a date and some additional data, so the line as a whole is not JSON. For `_sumo_metadata` to be stripped, the record must enter Fluentd in JSON format; only then can the field be removed and the metadata properly set.

I tested the following using the unit tests and manually and it worked as expected:

    {
        "journal": {
            "_HOSTNAME": "ip-172-31-22-252",
            "_GID": "0",
            "_EXE": "/usr/bin/dockerd",
            "_COMM": "dockerd",
            "_CMDLINE": "/usr/bin/dockerd --raw-logs",
            "TEST": "false",
            "SYSLOG_IDENTIFIER": "confident_leavitt/test",
            "PRIORITY": "6",
            "MESSAGE": "Hello World!",
            "LOCATION": "west",
            "CONTAINER_TAG": "confident_leavitt/test",
            "CONTAINER_NAME": "confident_leavitt"
        },
        "_sumo_metadata": {
            "source": "journal",
            "host": "ip-172-31-22-252",
            "category": "confident_leavitt/test"
        },
        "TAGS": ".source.s_src",
        "SOURCEIP": "127.0.0.1",
        "SOURCE": "s_src",
        "PROGRAM": "confident_leavitt/test",
        "PRIORITY": "info",
        "PID": "28XXX",
        "MESSAGE": "Hello World!",
        "HOST_FROM": "ip-172-31-22-252",
        "HOST": "ip-172-31-22-252",
        "DATE": "Sep 21 00:52:51"
    }
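
The behaviour described above can be sketched roughly as follows. This is a minimal Python illustration, not the plugin's actual (Ruby) implementation: the function name `split_sumo_metadata` is hypothetical, and the mapping of `source`, `host`, and `category` to the `X-Sumo-Name`, `X-Sumo-Host`, and `X-Sumo-Category` headers is an assumption about how the metadata ends up on the HTTP request.

```python
import json

# Assumed mapping from `_sumo_metadata` keys to Sumo Logic HTTP source headers.
HEADER_FOR_KEY = {
    "source": "X-Sumo-Name",
    "host": "X-Sumo-Host",
    "category": "X-Sumo-Category",
}


def split_sumo_metadata(line):
    """Return (headers, body) for one log line.

    Lines that are not a JSON object, or that carry no `_sumo_metadata`
    object, are returned unchanged with no headers.
    """
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return {}, line
    if not isinstance(record, dict):
        return {}, line
    metadata = record.pop("_sumo_metadata", None)
    if not isinstance(metadata, dict):
        return {}, line
    headers = {
        HEADER_FOR_KEY[key]: value
        for key, value in metadata.items()
        if key in HEADER_FOR_KEY
    }
    # The body is re-serialized without the `_sumo_metadata` field.
    return headers, json.dumps(record)
```

With a well-formed JSON line the field is removed and turned into headers; with a line that starts with a syslog-style prefix, parsing fails and the line passes through untouched, `_sumo_metadata` included, which matches what the issue reports.
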
frankreno commented 6 years ago

You would need to ensure that the log format is valid JSON for the `_sumo_metadata` feature to work.
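
One quick way to check a captured line before it reaches Fluentd is to run it through a JSON parser and look at where parsing first fails. The helper below is a small sketch using only Python's standard `json` module; the name `explain_json_error` is made up for this example.

```python
import json


def explain_json_error(line):
    """Return None if `line` is valid JSON, else a description of the first error."""
    try:
        json.loads(line)
    except json.JSONDecodeError as err:
        # err.pos is the character offset at which the parser gave up.
        snippet = line[err.pos:err.pos + 20]
        return f"{err.msg} at character {err.pos}: {snippet!r}"
    return None


if __name__ == "__main__":
    samples = [
        '{"journal":{"_HOSTNAME":"ip-172-31-22-252"}}',
        '{"journal":{_HOSTNAME":"ip-172-31-22-252"}}',
        'Sep 21 00:52:51 172.31.22.252 confident_leavitt/test[28XXX]: {"journal":{}}',
    ]
    for sample in samples:
        print(explain_json_error(sample) or "valid JSON")
```

Applied to the excerpts in this issue, a check like this would flag both problems mentioned in the thread: the missing quote before `_HOSTNAME`, and the date and host prefix at the start of the td-agent line.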