StephenWakely / syslog-loose

A loose parser for Syslog messages
MIT License

RFC3164 should not provide structured data parsing #37

Open itkovian opened 2 months ago

itkovian commented 2 months ago

According to the RFC, the syslog line comprises the following "fields":

<PRI>TIMESTAMP HOSTNAME TAG: MESSAGE

Afaik, the RFC makes no mention of structured data, yet the rfc3164 parser optionally supports it. As a result, log lines that adhere to rfc3164 but start the MESSAGE with a [<text>] cannot be parsed correctly.
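
To illustrate, here is a minimal sketch of splitting a line strictly by the layout above. The names `Rfc3164` and `split_rfc3164` and the sample log line are made up for this example and are not part of syslog-loose. It assumes the fixed-width "Mmm dd hh:mm:ss" timestamp and a TAG terminated by a colon.

```rust
/// Fields of an RFC 3164 line, following the layout quoted above.
#[derive(Debug, PartialEq)]
struct Rfc3164<'a> {
    pri: u8,
    timestamp: &'a str,
    hostname: &'a str,
    tag: &'a str,
    message: &'a str,
}

/// Hypothetical strict splitter (not the syslog-loose API).
/// Returns None when the line does not follow the layout.
fn split_rfc3164(line: &str) -> Option<Rfc3164<'_>> {
    let rest = line.strip_prefix('<')?;
    let (pri, rest) = rest.split_once('>')?;
    let pri = pri.parse::<u8>().ok()?;
    // Assumption: the timestamp is the fixed-width "Mmm dd hh:mm:ss" (15 bytes).
    let timestamp = rest.get(..15)?;
    let rest = rest.get(15..)?.strip_prefix(' ')?;
    let (hostname, rest) = rest.split_once(' ')?;
    // Assumption: the TAG ends at the first colon.
    let (tag, message) = rest.split_once(':')?;
    Some(Rfc3164 {
        pri,
        timestamp,
        hostname,
        tag,
        message: message.trim_start(),
    })
}

fn main() {
    let line = "<34>Oct 11 22:14:15 mymachine app: [error] disk is full";
    let parsed = split_rfc3164(line).expect("line follows the layout");
    // Under a strict reading, the bracketed text is ordinary MESSAGE content.
    assert_eq!(parsed.message, "[error] disk is full");
    println!("{:?}", parsed);
}
```

With this reading, `[error]` stays at the start of MESSAGE instead of being consumed as structured data.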

StephenWakely commented 2 weeks ago

This parser is not designed to conform exactly to the specs - hence the name loose. The problem is that not everything conforms exactly to 3164 or 5424. Looking at the tests here, it looks like rsyslog produces 3164 messages but also includes structured data. This parser was written with the aim of catering for what is out there rather than being exact.

Of course, when you try to keep everyone happy, problems such as this arise.

I think the best thing here is to be a bit stricter in the structured data parsing.

Currently, if the structured data is invalid, the 3164 parser treats it as part of the message. However, structured data with just the id and no key-value pairs is valid structured data. We should be a bit stricter and treat this as invalid for 3164.
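
A rough sketch of what the stricter rule could look like, written as standalone Rust rather than against the crate's actual parser. The function names `leading_element` and `strict_3164_split` are hypothetical. The element parsing is deliberately naive: it does not handle `]` or spaces inside quoted values.

```rust
/// Naively split a leading "[...]" element from the input.
/// Simplification: the element ends at the first ']'.
fn leading_element(input: &str) -> Option<(&str, &str)> {
    let body = input.strip_prefix('[')?;
    body.split_once(']')
}

/// Hypothetical stricter rule for 3164 input: a leading element counts as
/// structured data only if it has an id AND at least one key=value pair.
/// Otherwise the whole input is returned as the message.
fn strict_3164_split(input: &str) -> (Option<&str>, &str) {
    if let Some((element, rest)) = leading_element(input) {
        let mut parts = element.split_whitespace();
        let has_id = parts.next().is_some();
        // Everything after the id must look like key=value to count as a pair.
        let pair_count = parts.filter(|p| p.contains('=')).count();
        if has_id && pair_count >= 1 {
            return (Some(element), rest.trim_start());
        }
    }
    (None, input)
}

fn main() {
    // Id only, no pairs: treated as part of the message.
    let (sd, msg) = strict_3164_split("[error] disk is full");
    assert_eq!(sd, None);
    assert_eq!(msg, "[error] disk is full");

    // Id plus a pair: still accepted as structured data.
    let (sd, msg) = strict_3164_split("[exampleSDID@32473 iut=\"3\"] started");
    assert_eq!(sd, Some("exampleSDID@32473 iut=\"3\""));
    assert_eq!(msg, "started");
}
```

This keeps structured data in 3164 messages working when it carries key-value pairs, while a bare `[text]` at the start of the message is left alone.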