bpaquet / node-logstash

Simple logstash implementation in nodejs: file log collection, sent with zeromq

Syslog pattern wrong? #108

Closed: megastef closed this issue 9 years ago

megastef commented 9 years ago

Hi,

I made the following setup (consider 'Logsene' a hosted Elasticsearch cluster):

export LOGSENE_OUTPUT="output://elasticsearch://logsene-receiver.sematext.com:443?index_name=${LOGSENE_TOKEN}&ssl=true&bulk_size=100&bulk_time=100"
export SYSLOG_TCP_INPUT="input://tcp://0.0.0.0:514?type=syslog"
export SYSLOG_FILTERS="filter://regex://syslog_no_prio"
echo node-logstash-agent $SYSLOG_TCP_INPUT $SYSLOG_FILTERS $LOGSENE_OUTPUT &
node-logstash-agent --http_max_sockets 1  $SYSLOG_TCP_INPUT $SYSLOG_FILTERS $LOGSENE_OUTPUT 

When I test it with pipes like

tail -10 /var/log/syslog | nc localhost 10000

all 10 lines are inserted into Elasticsearch as one message. Is this a bug, or do you see an error in my config?

In addition, the command does not return, and I have to interrupt it with CTRL+C on the console.

Let's debug it using stdout:

cat test.log | node-logstash-agent input://stdin:// $SYSLOG_FILTERS output://stdout://

This gives the following output:

...
[Sat, 04 Jul 2015 19:55:43 GMT] INFO Initializing regex filter, regex : /^(\S+\s+\S+\s+\d+:\d+:\d+) (\S+) ([^:\[]+)\[?(\d*)\]?:\s+(.*)$/, fields timestamp,host,syslog_program,syslog_pid,message, date format MMM DD HH:mm:ss Z, flags: 
[Sat, 04 Jul 2015 19:55:43 GMT] INFO Initializing input Stdin
[Sat, 04 Jul 2015 19:55:43 GMT] INFO Config loaded.
[STDOUT] {
  "source": "stdin",
  "message": "Jul 4 18:41:32 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version='3.5.6'\nJul 4 18:41:33 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version='3.5.7'\nJul 4 18:41:33 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version='3.5.7'",
  "host": "7baf91ef8411",
  "@timestamp": "2015-07-04T19:55:43.337Z",
  "@version": "1"
}

Good to see it has nothing to do with our backend ... it looks like the regex is wrong. Let's jump to the node.js console.

> line
'Jul 4 18:41:32 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version=\'3.5.6\'\nJul 4 18:41:33 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version=\'3.5.7\'\n'
> line.match (/^(\S+\s+\S+\s+\d+:\d+:\d+) (\S+) ([^:\[]+)\[?(\d*)\]?:\s+(.*)$/)
null
> line.match (/^(\S+\s+\S+\s+\d+:\d+:\d+) (\S+) ([^:\[]+)\[?(\d*)\]?:\s+(.*)/)
[ 'Jul 4 18:41:32 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version=\'3.5.6\'',
  'Jul 4 18:41:32',
  'ip-10-13-184-69',
  'syslog-ng',
  '10158',
  'syslog-ng starting up; version=\'3.5.6\'',
  index: 0,
  input: 'Jul 4 18:41:32 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version=\'3.5.6\'\nJul 4 18:41:33 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version=\'3.5.7\'\n' ]

Conclusion: I think the pattern should not end with "$", right? BTW, it's the same when I remove the "\n" at the end of the line.

> line2.match (/^(\S+\s+\S+\s+\d+:\d+:\d+) (\S+) ([^:\[]+)\[?(\d*)\]?:\s+(.*)$/)
null
> line2.match (/^(\S+\s+\S+\s+\d+:\d+:\d+) (\S+) ([^:\[]+)\[?(\d*)\]?:\s+(.*)/)
[ 'Jul 4 18:41:32 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version=\'3.5.6\'',
  'Jul 4 18:41:32',
  'ip-10-13-184-69',
  'syslog-ng',
  '10158',
  'syslog-ng starting up; version=\'3.5.6\'',
  index: 0,
  input: 'Jul 4 18:41:32 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version=\'3.5.6\'\nJul 4 18:41:33 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version=\'3.5.7\'' ]
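
For reference, the null results above can be reproduced outside the agent. In JavaScript, "." never matches "\n", and without the "m" flag "$" only matches at the very end of the input, so the pattern cannot match a string that holds several log lines. The sketch below assumes the regex printed in the startup log and uses a hypothetical helper, parseLines, that splits the input on newlines before matching. It only illustrates the regex behaviour; it is not how node-logstash itself splits its input.

```javascript
// Regex copied from the agent's startup log above (the syslog_no_prio pattern).
const SYSLOG_NO_PRIO =
  /^(\S+\s+\S+\s+\d+:\d+:\d+) (\S+) ([^:\[]+)\[?(\d*)\]?:\s+(.*)$/;

// Two log lines arriving as ONE string, as in the stdin test above.
const chunk =
  "Jul 4 18:41:32 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version='3.5.6'\n" +
  "Jul 4 18:41:33 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version='3.5.7'\n";

// '.' stops at the first '\n' and '$' needs the end of the whole input,
// so matching the complete chunk fails.
const wholeChunk = chunk.match(SYSLOG_NO_PRIO);

// Hypothetical helper (not part of node-logstash): split first, then match
// each line on its own. Lines that do not match keep only their raw text.
function parseLines(text) {
  return text
    .split('\n')
    .filter((line) => line.length > 0)
    .map((line) => {
      const m = line.match(SYSLOG_NO_PRIO);
      if (!m) {
        return { message: line };
      }
      return {
        timestamp: m[1],
        host: m[2],
        syslog_program: m[3],
        syslog_pid: m[4],
        message: m[5],
      };
    });
}

const events = parseLines(chunk);
console.log(wholeChunk);
console.log(events);
```

Dropping the "$" makes the match succeed, but only for the first line; the remaining lines are silently ignored, which is why splitting per line is the safer approach in this sketch.
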
bpaquet commented 9 years ago

Hi,

The /var/log/syslog file is not a syslog stream. The syslog protocol is documented here: https://en.wikipedia.org/wiki/Syslog.

You can parse your syslog file using a regex, but not the syslog one: for example, the priority does not use the same format.

The syslog filter in node-logstash is designed to be used with a UDP input, to receive 'real' syslog packets. You can configure your syslog daemon (for example rsyslog) to send logs to a node-logstash input.

Bertrand
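
To illustrate the priority difference mentioned above: a syslog packet on the wire (RFC 3164 style) starts with a "<PRI>" prefix, where PRI = facility * 8 + severity, while lines written to /var/log/syslog typically start directly with the timestamp. The sketch below uses a hypothetical helper, decodePriority, to show that prefix; it is an assumption-level illustration of the wire format and not the pattern node-logstash actually uses.

```javascript
// Hypothetical helper (not part of node-logstash): decode the "<PRI>" prefix
// of a syslog packet. Returns null when the text has no such prefix, which is
// the case for lines read from a log file such as /var/log/syslog.
function decodePriority(packet) {
  const m = /^<(\d{1,3})>/.exec(packet);
  if (!m) {
    return null;
  }
  const pri = Number(m[1]);
  return {
    facility: Math.floor(pri / 8), // PRI = facility * 8 + severity
    severity: pri % 8,
    rest: packet.slice(m[0].length), // the packet without the prefix
  };
}

// Example packet in the RFC 3164 format, as a syslog daemon would send it.
const fromNetwork = decodePriority(
  "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8"
);

// Line as it appears in the log file from this issue: no priority prefix.
const fromFile = decodePriority(
  "Jul 4 18:41:32 ip-10-13-184-69 syslog-ng[10158]: syslog-ng starting up; version='3.5.6'"
);

console.log(fromNetwork);
console.log(fromFile);
```
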

megastef commented 9 years ago

Thx for the clarification.