severb / graypy

Python logging handler for Graylog that sends messages in GELF (Graylog Extended Log Format).
https://www.graylog.org/
BSD 3-Clause "New" or "Revised" License
258 stars 90 forks source link

Graypy allows chunk size greater than 128 #116

Closed gabrieljardim closed 4 years ago

gabrieljardim commented 5 years ago

According to the specification "A message MUST NOT consist of more than 128 chunks.". https://docs.graylog.org/en/3.0/pages/gelf.html There's a bug where graypy allows a chunk size between 128 and 256. The error is perceived when another API tries to decode a large gelf message encode by graypy.

On handler.py the validation should be on this line:

self.pieces = \ struct.pack('B', int(math.ceil(len(message) * 1.0 / size))) self.id = struct.pack('Q', random.randint(0, 0xFFFFFFFFFFFFFFFF))

I could send an PR for this but I'd have to choose an exception.

Regards!

nklapste commented 5 years ago

Good find! Picking a exception or error handling pattern for this is a tough decision.

One hand I would like graypy to be 100% compliant to Graylog's GELF specification. Throwing a exception when graypy can't chunk a message when it is too large would respect this. However, I feel this could be a error that surprises some developers.

On the other hand, I would like to prevent the chance of graypy exploding a script due to a "non-critical" logging error/fault. Having a log drop is a bad state to be in, but, having a script halt due to log dropping is also bad.

Maybe graypy should drop the log, but, do so as explicitly as possible without halting primary execution?

Maybe there are alternatives to dealing with extremely large logs?

What is your opinion @gabrieljardim?

gabrieljardim commented 5 years ago

I understand your point. Throwing an exception inside a log library may not be the best option. I tried to think a little bit about the matter and came up with some possible solutions:

Truncate payload to 128 chunks so user could at least see part of the log. Replace payload for an error message explaining that log was too long for gelf format. Truncate payload to 128 chunks and replace, for example, the last 150 bytes with an error message explaining that the log was truncated because the chunk count exceeded 128 defined by gelf.

What do you think @nklapste?

nklapste commented 5 years ago

I like your last option, however, I would add the error statement noting truncation at the start of the extremely long log, just for brevity in identification. Or the log truncated error statement could be added into a extra GELF field (e.g. _error_truncated).

Your thoughts @gabrieljardim?

Maybe for future use a hook should be added to the graypy handlers' (e.g. on_chunk_overflow) that can be overridden by subclasses for enabling custom log handling in the case of a extremely long log.

gabrieljardim commented 5 years ago

For visibility purpose I really like your first idea, with error statement being the first thing in the log. I also like the idea of having a hook that could enable overriding the default behavior in the future.