wizecore / graylog2-output-syslog

Customizable, production ready syslog and ArcSight output plugin for Graylog
Apache License 2.0

UTF-8 support - rfc5424 #25

Closed by adamsh25 4 years ago

adamsh25 commented 6 years ago

Hi,

I have issues with UTF-8 support. Per RFC 5424, syslog messages encoded in UTF-8 must carry the BOM prefix: "If a syslog application encodes MSG in UTF-8, the string MUST start with the Unicode byte order mask (BOM), which for UTF-8 is ABNF %xEF.BB.BF. The syslog application MUST encode in the "shortest form" and MAY use any valid UTF-8 sequence."

https://tools.ietf.org/html/rfc5424

E.g. German letters won't be supported, because the message data is encoded as ASCII and not as UTF-8. Inspecting a packet sent through this plugin's output stream in Wireshark shows a message that does not have the BOM prefix.

Thank you, Adam.
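To make the requirement quoted above concrete, here is a minimal sketch of prefixing a UTF-8 encoded MSG with the BOM bytes EF BB BF. The class `Utf8BomMsg` and the helper `encodeMsg` are hypothetical names chosen for illustration; this is not the plugin's actual code, and it builds only the MSG part, not a full RFC 5424 frame.

```java
import java.nio.charset.StandardCharsets;

class Utf8BomMsg {
    // UTF-8 byte order mark as given in RFC 5424: %xEF.BB.BF
    static final byte[] BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Hypothetical helper: encodes the MSG part as UTF-8 and,
    // if requested, puts the BOM in front of it.
    static byte[] encodeMsg(String msg, boolean withBom) {
        byte[] body = msg.getBytes(StandardCharsets.UTF_8);
        if (!withBom) {
            return body;
        }
        byte[] out = new byte[BOM.length + body.length];
        System.arraycopy(BOM, 0, out, 0, BOM.length);
        System.arraycopy(body, 0, out, BOM.length, body.length);
        return out;
    }

    public static void main(String[] args) {
        // "Gr\u00FC\u00DFe" is the German word "Gruesse" written with u-umlaut and sharp s.
        byte[] bytes = encodeMsg("Gr\u00FC\u00DFe", true);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02X ", b));
        }
        System.out.println(hex.toString().trim());
    }
}
```

The `withBom` flag mirrors the kind of on/off setting discussed in this thread; leaving it off keeps the bytes identical to plain UTF-8 encoding.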

huksley commented 5 years ago

Sorry for being late with this... Yes, as I can see in the RFC, UTF-8 messages require a BOM marker to be added.

What are your thoughts on how we should proceed with this?

adamsh25 commented 5 years ago

Thank you for your response. I think the second option is the best: just add a combo box that lets us choose the encoding, or a check box for the BOM header. A default of false is OK.

huksley commented 5 years ago

Hmm, I am having difficulty validating changes for this. rsyslog shows the message as is, without stripping any characters when no BOM is present.

Can you please advise on how to set up a testing environment for this? Wireshark?
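Since the receiver may display the message the same way with or without the BOM, one possible check is to look at the raw bytes of a captured packet. The sketch below assumes the payload is available as a hex string (for example, copied out of a packet capture tool). `BomCheck`, `fromHex` and `indexOfBom` are hypothetical names, and the search is deliberately naive: it reports the first EF BB BF sequence anywhere in the payload rather than parsing the RFC 5424 header to find where MSG starts.

```java
class BomCheck {
    // Converts a hex string such as "3C 31 34 3E" or "3C31343E" to bytes.
    static byte[] fromHex(String hex) {
        String clean = hex.replaceAll("\\s+", "");
        if (clean.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[clean.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(clean.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // Returns the offset of the first UTF-8 BOM (EF BB BF) in the payload,
    // or -1 if none is found. Assumption: the first occurrence is the MSG
    // prefix; the same bytes could in principle also occur inside the text.
    static int indexOfBom(byte[] payload) {
        for (int i = 0; i + 2 < payload.length; i++) {
            if ((payload[i] & 0xFF) == 0xEF
                    && (payload[i + 1] & 0xFF) == 0xBB
                    && (payload[i + 2] & 0xFF) == 0xBF) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Pass the captured payload as a hex string on the command line.
        byte[] payload = fromHex(String.join("", args));
        int at = indexOfBom(payload);
        System.out.println(at >= 0 ? "BOM found at offset " + at : "no BOM found");
    }
}
```

A result of -1 for a message that contains non-ASCII characters would suggest the BOM is not being sent.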

huksley commented 4 years ago

@adamsh25 I implemented BOM support in branch utf8. Could you check whether it is working in your environment? Thank you!

https://github.com/wizecore/graylog2-output-syslog/pull/37

huksley commented 4 years ago

Merged #37 and fixed this