Closed edefaria closed 8 years ago
Can you double check that framing = "nul" is present in your [input] section?
yes, the framing is "nul"
As a temporary workaround, 4fe397ff92bc395c11e5f74b7f2435b7be3d4dd8 downgrades serde_json to the 0.7 branch.
Thank you very much for the quick fix!
Actually, I'm not sure this is a regression. The behavior of serde_json 0.8 appears to be correct, and actually fixes a lack of proper validation that was present in the 0.7 series.
According to the JSON specification, characters in the Unicode range U+0000 to U+001F, which includes \n, are control characters. These must be escaped.
This is consistent with JavaScript parsers that refuse straight newlines in JSON values: JSON.parse('{"t":"x\ny"}') throws a SyntaxError: Unexpected token in JSON at position 7 exception.
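The difference between a raw and an escaped newline can be checked directly. The following is a minimal sketch; tryParse is a hypothetical helper introduced here for illustration and is not part of flowgger or serde_json.

```javascript
// Hypothetical helper: returns the parsed value, or null if the text is not valid JSON.
function tryParse(text) {
  try {
    return JSON.parse(text);
  } catch (err) {
    if (err instanceof SyntaxError) return null;
    throw err;
  }
}

// Raw line feed (U+000A) inside the string: the JavaScript literal '\n'
// becomes a real newline character before JSON.parse sees it.
const rawNewline = '{"t":"x\ny"}';

// Escaped form: the JSON text contains a backslash followed by the letter n.
const escapedNewline = '{"t":"x\\ny"}';

console.log(tryParse(rawNewline));     // null: the parser rejects the raw control character
console.log(tryParse(escapedNewline)); // an object whose t value contains a real line feed
```

Only the second form is valid JSON, which matches the stricter validation described above for serde_json 0.8.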
Yes, I know this does not pass most JSON validators with '\n'. But in the GELF specification (http://docs.graylog.org/en/2.0/pages/gelf.html), the character '\n' is allowed in the example payload.
I think there is a misunderstanding of the GELF specification, more specifically of the example they provide.
{ "full_message": "Backtrace here\nmore stuff" }
is the final content of a single JSON string. No replacements will be applied to the content within quotes. \n is and will remain parsed as the character \ followed by the character n.
This is not equivalent to
{ "full_message": "Backtrace here
more stuff" }
which is a different string (and not a valid JSON string).
JSON strings should be considered as raw strings, not as higher-level strings as present in several programming languages, that will eventually be converted to raw strings.
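One way to read the distinction above is to look at the characters actually present in the JSON text, as opposed to the value obtained after decoding. This is a sketch under that reading; the variable names are illustrative only.

```javascript
// The JSON text as it travels on the wire. String.raw keeps the backslash
// as-is, so JavaScript itself does not turn \n into a line feed.
const wire = String.raw`{ "full_message": "Backtrace here\nmore stuff" }`;

// On the wire there is no 0x0a character, only a backslash followed by 'n'.
const hasRawLineFeed = wire.includes("\n");  // false
const hasBackslashN = wire.includes("\\n");  // true

// Only after JSON decoding does the value contain an actual line feed.
const decoded = JSON.parse(wire).full_message;
const decodedHasLineFeed = decoded.includes("\n"); // true

console.log({ hasRawLineFeed, hasBackslashN, decodedHasLineFeed });
```

The transmitted text stays free of control characters, while the decoded message still ends up multiline.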
As a result, the GELF specification does not allow straight 0x0a characters within strings, and clients depending on this behavior are broken.
A note was added to the Wiki in order to clarify this.
Producing conformant GELF messages should never be an issue with client libraries using JSON encoders.
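As an illustration of why a JSON encoder avoids the problem, here is a minimal sketch of a client building a NUL-terminated frame. encodeGelfFrame is a hypothetical helper, not taken from any GELF library, and the field values are borrowed from the GELF example payload.

```javascript
// Hypothetical helper: serialize a GELF-like object and append the NUL
// terminator expected by an input configured with nul framing.
function encodeGelfFrame(message) {
  return JSON.stringify(message) + "\0";
}

const frame = encodeGelfFrame({
  version: "1.1",
  host: "example.org",
  short_message: "A short GELF message",
  full_message: "Backtrace here\n\nmore stuff", // real line feeds in memory
});

// The encoder escapes the line feeds, so the frame carries no raw 0x0a character.
console.log(frame.includes("\n"));     // false
console.log(frame.includes("\\n\\n")); // true
```

Because the escaping is done by the encoder, the client never has to handle newlines in message fields by hand.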
What GELF client libraries producing invalid JSON strings did you use?
I tested Logstash and other GELF libraries; all of them parse JSON directly. In the simple test I gave previously, the mistake was in my payload: I must escape the "\n" in the echo command for it to be valid JSON.
I have some clients whose GELF libraries I do not know (mostly Apache logs) that generate raw newline characters in their JSON values. This broke about 10-20% of their logs sent to flowgger when I upgraded to 0.2.0. This behaviour was new, so I was worried at first. See issues like https://github.com/Graylog2/graylog2-server/issues/2048
As described in Graylog's issue, GELF must be valid JSON to be decoded.
Valid test of GELF/TCP+TLS with the echo command:
echo -e '{"version":"1.1", "host": "example.org", "short_message": "A short GELF message that helps you identify what is going on", "full_message": "Backtrace here\\n\\nmore stuff", "timestamp": 1470749969, "level": 1, "_user_id": 9001, "_some_info": "foo", "some_metric_num": 42.0}\0' | openssl s_client -quiet -no_ign_eof -connect localhost:12202
Example of a multiline message that worked with the GELF input on flowgger 0.1.X:
echo -e '{"version":"1.1", "host": "example.org", "short_message": "A short GELF message that helps you identify what is going on", "full_message": "Backtrace here\n\nmore stuff", "timestamp": 1470749969, "level": 1, "_user_id": 9001, "_some_info": "foo", "some_metric_num": 42.0}\0' | openssl s_client -quiet -no_ign_eof -connect localhost:12202
Now with flowgger 0.2.0, I have the following error: