fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0

out_gelf: Fluent Bit is refusing to send what Graylog accepts as a GELF message #2256

Closed alitoufighi closed 3 years ago

alitoufighi commented 4 years ago

Bug Report

My story with log levels never ends.

Describe the bug

When the log level is an integer, Fluent Bit validates its value against the GELF payload specification, which defines the level as a number between 0 and 7.

https://github.com/fluent/fluent-bit/blob/46e479eb2d1d5f6fe57b1ad7bd395abb5e7bdb34/src/flb_pack_gelf.c#L559-L563

But the fact is that this field is optional and Graylog doesn't do anything special with it (Apparently, at least since Graylog 2.x).

For example, here I successfully sent a GELF message with level 50 to Graylog (without the Fluent Bit GELF output plugin): [image]

Logging libraries commonly do not follow syslog severity levels, so many applications have difficulty converting their levels to what GELF (specifically, the Fluent Bit plugin) requires.
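To illustrate the kind of conversion an application would otherwise need, here is a minimal sketch. It assumes a logging library that uses the 10/20/30/40/50/60 numeric scale (trace, debug, info, warn, error, fatal) found in some JSON loggers. The function name `app_level_to_syslog` and the chosen mapping are hypothetical and not part of Fluent Bit.

```c
/* Hypothetical mapping from an application-defined numeric scale
 * (assumed: 10=trace, 20=debug, 30=info, 40=warn, 50=error, 60=fatal)
 * to syslog severities 0..7. Not part of Fluent Bit. */
static int app_level_to_syslog(int app_level)
{
    if (app_level >= 60) {
        return 2;   /* fatal -> crit */
    }
    if (app_level >= 50) {
        return 3;   /* error -> err */
    }
    if (app_level >= 40) {
        return 4;   /* warn  -> warning */
    }
    if (app_level >= 30) {
        return 6;   /* info  -> info */
    }
    return 7;       /* debug, trace and anything lower -> debug */
}
```

Every application with its own scale would need a table like this before its records pass the 0..7 check, and the mapping is lossy: trace and debug both collapse to 7.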

But I believe Fluent Bit mustn't be a "pain in the ass" for the person responsible for managing logs. So I think a better idea is to allow this field to be any arbitrary integer, forward it anyway, and let the destination server decide whether to accept it.

To Reproduce

Create a log with a level key that is not between 0 and 7:

{"log":"foobar", "level":10}

and GELF output plugin will complain:

[flb_msgpack_to_gelf] level is 10, but should be in 0..7 or a syslog keyword
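The behaviour reproduced above amounts to a range check on the integer level. The following is a simplified sketch of such a check, not the actual code in `flb_pack_gelf.c`. The helper name `check_level_strict` and its return convention are assumptions made for illustration; only the message text is taken from the output above.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper, not the real Fluent Bit function.
 * Returns 0 when the level is within the syslog range 0..7,
 * or -1 (record rejected) when it is not. */
static int check_level_strict(int64_t level)
{
    if (level < 0 || level > 7) {
        fprintf(stderr,
                "[flb_msgpack_to_gelf] level is %lld, "
                "but should be in 0..7 or a syslog keyword\n",
                (long long) level);
        return -1;
    }
    return 0;
}
```

With this check, a record carrying `"level":10` is dropped before it ever reaches the server, even though the rest of the record is a valid GELF message.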

Expected behavior

I expect this log to be sent to the server, and if the server rejects it, that is not Fluent Bit's problem. This field is optional, and I think that makes its value constraints optional as well.

Your Environment

alitoufighi commented 4 years ago

@manuelluis and @edsiper I'd be happy to hear your feedback.

alitoufighi commented 4 years ago

Using this configuration:

[INPUT]
        Name   dummy
        Tag    kube.dummy
        Dummy {"level": 30, "time": "2020-06-13T15:02:12.637309", "host": "inja", "short_message":"{\"key1\":\"val1\"}"}

[OUTPUT]
        Name                    gelf
        Match                   *
        Host                    my-graylog-instance
        Port                    12201
        Mode                    tcp
        Gelf_Short_Message_Key  short_message

If we simply remove the return NULL from this scope (the branch taken when level is an integer): https://github.com/fluent/fluent-bit/blob/46e479eb2d1d5f6fe57b1ad7bd395abb5e7bdb34/src/flb_pack_gelf.c#L549-L555 then this works: [image]
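As a rough illustration of that change, the sketch below serializes an integer level into the output without any range validation and leaves acceptance to the destination. It is a simplified stand-in, not the actual patch. The helper name `append_level_relaxed` and the plain character buffer are assumptions made to keep the example self-contained.

```c
#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper, not the real Fluent Bit code.
 * Writes "level":<n> into buf with no check on the value of n.
 * Returns the number of characters written, or -1 if buf is too small. */
static int append_level_relaxed(char *buf, size_t size, int64_t level)
{
    int n = snprintf(buf, size, "\"level\":%lld", (long long) level);

    if (n < 0 || (size_t) n >= size) {
        return -1;  /* encoding error or output truncated */
    }
    return n;
}
```

Called with level 30, as in the dummy record above, the helper emits the field unchanged. The only failure mode left is running out of buffer space.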

And if we use a string log level by changing the dummy message above to this:

{"level": "ERROR", "time": "2020-06-13T15:02:12.637309", "host": "inja", "short_message":"{\"key1\":\"val1\"}"}

And remove the return NULL instruction from here: https://github.com/fluent/fluent-bit/blob/46e479eb2d1d5f6fe57b1ad7bd395abb5e7bdb34/src/flb_pack_gelf.c#L580-L584

The only complaint then comes from the Elasticsearch side, which requires level to be of type long:

error=<{"type":"mapper_parsing_exception","reason":"failed to parse field [level] of type [long] in document with id 'cb767b26-ae14-11ea-936f-00505680ecd6'","caused_by":{"type":"illegal_argument_exception","reason":"For input string: \"ERROR\""}}>

In this case too, I believe that if someone prefers to send log levels as strings, they can simply change the data type of that field to string in ES, and this shouldn't be a concern for Fluent Bit.
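The string case described above can be sketched in the same spirit: translate the level when it is a recognised syslog keyword, otherwise forward the original string instead of dropping the record. This is an assumption-laden sketch, not the Fluent Bit implementation. The type `level_out`, the function `resolve_string_level`, the keyword list (the standard syslog severity names) and the case-insensitive comparison are all chosen here for illustration.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

enum level_kind { LEVEL_NUMERIC, LEVEL_STRING };

/* Hypothetical result type: either a syslog number or the untouched text. */
struct level_out {
    enum level_kind kind;
    int number;         /* valid when kind == LEVEL_NUMERIC */
    const char *text;   /* valid when kind == LEVEL_STRING  */
};

/* Hypothetical helper, not the real Fluent Bit code.
 * Known keywords become numbers; anything else is passed through. */
static struct level_out resolve_string_level(const char *s)
{
    static const struct { const char *name; int level; } table[] = {
        {"emerg", 0}, {"alert", 1}, {"crit", 2},   {"err", 3},
        {"warning", 4}, {"notice", 5}, {"info", 6}, {"debug", 7},
    };
    struct level_out out = { LEVEL_STRING, -1, s };
    size_t i;

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
        if (strcasecmp(s, table[i].name) == 0) {
            out.kind = LEVEL_NUMERIC;
            out.number = table[i].level;
            out.text = NULL;
            break;
        }
    }
    return out;  /* unknown keywords are forwarded as-is, not rejected */
}
```

Under these assumptions "err" resolves to 3, while "ERROR" is not in the table and is forwarded as a string, which is what leads to the Elasticsearch mapping error quoted above when the index expects a long.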

alitoufighi commented 3 years ago

Closing in favor of merging https://github.com/fluent/fluent-bit/pull/2257