Graylog2 / graylog2-server

Free and open log management
https://www.graylog.org

Incorrect parsing of RFC5424 syslog input #4689

Open pac-work opened 6 years ago

pac-work commented 6 years ago

Examples of lines which are not parsed correctly:

full_message
<34>1 2018-03-25T22:14:15.003Z mymachine.example.com su - ID47 - 'su root' failed for lonvick on /dev/pts/8
---
full_message
<131>1 2018-03-26T08:19:40.760962Z host_pac amstest_libamsloggingtools 11348 Log2 - (SyslogTCPLogSink.cpp:35) Value 16

The first example line is taken directly from the RFC 5424 examples; only the date has been modified and the BOM removed (which is allowed by the RFC 5424 grammar).

Expected Behavior

The fields msg_id (or similar) and message should be parsed out properly. For the first example message, I would expect msg_id to be ID47 and message to be 'su root' failed for lonvick on /dev/pts/8.

Current Behavior

For the first example message, I get only the message field:

message
ID47 - 'su root' failed for lonvick on /dev/pts/8

It seems that the input parser ignores the fact that the - in the original line is STRUCTURED-DATA = NILVALUE in the grammar mentioned above, not part of the message. In this example, there should be no - in the message at all. The msg_id field currently seems to be ignored completely by Graylog.

Similarly in the second example, Graylog reports:

message
Log2 - (SyslogTCPLogSink.cpp:35) Value 16

But the expected result would be a msg_id of Log2 and a message of (SyslogTCPLogSink.cpp:35) Value 16.
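To make the expected split concrete, here is a minimal Python sketch of the RFC 5424 header layout (PRI, VERSION, TIMESTAMP, HOSTNAME, APP-NAME, PROCID, MSGID, STRUCTURED-DATA, then the optional MSG). It is not Graylog's parser; the function name `parse_rfc5424` and the output field names are my own, and the structured-data pattern is a simplification that only recognises `-` or a run of bracketed elements.

```python
import re

# Simplified RFC 5424 line pattern (an illustration, not Graylog's code).
RFC5424_LINE = re.compile(
    r"<(?P<pri>\d{1,3})>(?P<version>[1-9]\d{0,2})"
    r" (?P<timestamp>\S+)"
    r" (?P<hostname>\S+)"
    r" (?P<app_name>\S+)"
    r" (?P<proc_id>\S+)"
    r" (?P<msg_id>\S+)"
    # STRUCTURED-DATA is either NILVALUE ("-") or one or more [..] elements.
    r" (?P<structured_data>-|(?:\[(?:[^\]\\]|\\.)*\])+)"
    r"(?: (?P<message>.*))?",
    re.DOTALL,
)

# Header fields in which a lone "-" means "no value" (NILVALUE).
NILVALUE_FIELDS = (
    "timestamp", "hostname", "app_name", "proc_id", "msg_id", "structured_data",
)


def parse_rfc5424(line):
    """Split one RFC 5424 line into a dict of header fields and the message."""
    match = RFC5424_LINE.fullmatch(line)
    if match is None:
        raise ValueError("not an RFC 5424 line: %r" % line)
    fields = match.groupdict()
    # PRI encodes facility * 8 + severity.
    fields["facility"], fields["severity"] = divmod(int(fields.pop("pri")), 8)
    for name in NILVALUE_FIELDS:
        if fields[name] == "-":
            fields[name] = None
    fields["message"] = fields["message"] or ""
    return fields


first = parse_rfc5424(
    "<34>1 2018-03-25T22:14:15.003Z mymachine.example.com su - ID47 - "
    "'su root' failed for lonvick on /dev/pts/8"
)
print(first["msg_id"], "|", first["message"])
```

With this layout, the `-` after ID47 is consumed as STRUCTURED-DATA, so neither it nor ID47 ends up in the message.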

Steps to Reproduce (for bugs)

Just send the example messages above to the Graylog server.
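One way to do this from a script is sketched below. It assumes a Syslog TCP input that splits messages on newlines; the host name and port in the usage comment are placeholders for whatever your input listens on, and the helper names `frame_messages` and `send_syslog_tcp` are made up for this example.

```python
import socket

EXAMPLE_MESSAGES = [
    "<34>1 2018-03-25T22:14:15.003Z mymachine.example.com su - ID47 - "
    "'su root' failed for lonvick on /dev/pts/8",
    "<131>1 2018-03-26T08:19:40.760962Z host_pac amstest_libamsloggingtools "
    "11348 Log2 - (SyslogTCPLogSink.cpp:35) Value 16",
]


def frame_messages(messages):
    """Encode messages as UTF-8, each terminated by a newline (assumed framing)."""
    return b"".join(message.encode("utf-8") + b"\n" for message in messages)


def send_syslog_tcp(host, port, messages, timeout=5.0):
    """Open one TCP connection and send all messages over it."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(frame_messages(messages))


# Usage (placeholder host and port, adjust to your own input):
# send_syslog_tcp("graylog.example.org", 1514, EXAMPLE_MESSAGES)
```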

Context

The documentation states that:

Graylog is able to accept and parse RFC 5424...

Unfortunately, it cannot parse even the example line from that RFC.

joschi commented 6 years ago

For reference: https://github.com/Graylog2/graylog2-server/blob/2c418975989da2fff589e35cdf8c00ff41983e52/graylog2-server/src/test/java/org/graylog2/inputs/codecs/SyslogCodecTest.java#L265-L323

pac-work commented 6 years ago

@joschi Thanks for pointing out this source code. As you can see, the tests are incorrect, and they are even inconsistent with each other. In the first test, ID47 is expected as part of the message (which is incorrect), whereas in the third and fourth tests ID47 is ignored completely (also incorrect) and is not part of the message (correct).

The correct solution is not to ignore ID47, but to parse it as msg_id or something similar.