fluent-plugins-nursery / fluent-plugin-remote_syslog

Fluentd plugin for output to remote syslog service (e.g. Papertrail)
https://github.com/dlackty/fluent-plugin-remote_syslog
MIT License

Message time sent through syslog_protocol is Time.now instead of original log timestamp #48

Open ptrovatelli opened 2 years ago

ptrovatelli commented 2 years ago

Hello, we're using Fluentd (td-agent) to collect logs from Linux servers (/var/log/secure) and send them to a remote destination using https://github.com/reproio/remote_syslog_sender and https://github.com/eric/syslog_protocol

We would like the syslog message sent to the destination to carry the original log timestamp. However, the original log timestamp appears to be overwritten with Time.now, i.e. the time at which the packet is sent.

We're using TCP and the RFC 3164 syslog format.

This is an extract of our td-agent configuration:

    <source>
      @type tail
      path /var/log/secure
      pos_file /var/log/td-agent/buffer/secure.pos
      tag xxx.sys.yyy.secure
      format /^(?<time>[^ ]*\s*[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/
      enable_watch_timer false
    </source>
    <match **.sys.**secure>
      @type remote_syslog
      @id soc
      <buffer>
        @type file
      </buffer>
      host xxxxx
      port 514
      protocol tcp
      packet_size 20480
      severity debug
    </match>

Example log file:

    Mar 24 11:48:40 myhostname sshd[25533]: reprocess config line 126(...)
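To check that the `format` expression really captures the original timestamp from this line, the regular expression can be run on its own in plain Ruby. This is only a sketch: the constant name `LINE_FORMAT` and the `strptime` pattern are assumptions made for illustration, and Fluentd's own time parsing may differ in detail (for example in how it chooses the year, which the log line does not carry).

```ruby
require "time"

# Same expression as the `format` parameter in the <source> block above.
LINE_FORMAT = /^(?<time>[^ ]*\s*[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/

sample = "Mar 24 11:48:40 myhostname sshd[25533]: reprocess config line 126(...)"
m = LINE_FORMAT.match(sample)

# The named captures become the record fields; `time` holds the log timestamp.
puts m[:time]     # raw timestamp text from the log line
puts m[:host]
puts m[:ident]
puts m[:pid]
puts m[:message]

# Assumed strptime pattern; the year defaults to the current year.
event_time = Time.strptime(m[:time], "%b %d %H:%M:%S")
puts event_time.strftime("%H:%M:%S")
```

The captured `time` value is the one we would expect to end up in the outgoing syslog header.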

We captured the network packet produced by the plugin. The syslog timestamp equals the time the packet was sent (11:49:47, truncated to the second) rather than the original log timestamp (11:48:40).

wireshark_screenshot_1

wireshark_screenshot_2

What we see:

We would like to have the original log timestamp here, as parsed into the "time" field by the td-agent configuration.
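To make the expected wire format concrete, here is a minimal plain-Ruby sketch of an RFC 3164 line that uses the event time when one is supplied and falls back to Time.now only otherwise. `rfc3164_line` is a hypothetical helper, not an API of remote_syslog_sender or syslog_protocol. The facility value (1, user) and the year 2022 are assumptions made for the example.

```ruby
# Hypothetical helper for illustration only; not part of any of the gems above.
def rfc3164_line(content:, host:, tag:, pid: nil, time: nil, facility: 1, severity: 7)
  time ||= Time.now                     # fall back to send time only when no event time is given
  pri = facility * 8 + severity         # PRI = facility * 8 + severity
  day = time.day.to_s.rjust(2)          # RFC 3164 pads single-digit days with a space
  stamp = "#{time.strftime('%b')} #{day} #{time.strftime('%H:%M:%S')}"
  process = pid ? "#{tag}[#{pid}]" : tag
  "<#{pri}>#{stamp} #{host} #{process}: #{content}"
end

# Event time taken from the sample log line (the year is an assumption).
event_time = Time.new(2022, 3, 24, 11, 48, 40)

line = rfc3164_line(
  content: "reprocess config line 126(...)",
  host: "myhostname",
  tag: "sshd",
  pid: 25533,
  time: event_time
)
puts line
```

With the event time passed through, the header carries 11:48:40 regardless of when the packet is actually sent.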

I believe that https://github.com/eric/syslog_protocol supports it: here it takes the timestamp from the message and falls back to Time.now only if no time is found or the PRI is invalid:

https://github.com/eric/syslog_protocol/blob/master/lib/syslog_protocol/parser.rb#L9

    if pri and (pri = pri.to_i).is_a? Integer and (0..191).include?(pri)
      packet.pri = pri
    else
      # If there isn't a valid PRI, treat the entire message as content
      packet.pri = 13
      packet.time = Time.now
      packet.hostname = origin || 'unknown'
      packet.content = original_msg
      return packet
    end
    time = parse_time(msg)
    if time
      packet.time = Time.parse(time)
    else
      packet.time = Time.now
    end

Thanks!

ptrovatelli commented 2 years ago

@joker1007 what's your take on this?

ptrovatelli commented 2 years ago

@cosmo0920 what do you think?

We are willing to make the change in the code. Would you merge it then?

A change will be required in https://github.com/reproio/remote_syslog_sender too. We can do both.

Thanks