tremor-rs / tremor-runtime

Main Tremor Project Rust Codebase
https://www.tremor.rs
Apache License 2.0

Enhancements for the gelf-chunking postprocessor #820

Open anupdhml opened 3 years ago

anupdhml commented 3 years ago

Describe the problem you are trying to solve

The gelf-chunking postprocessor uses a numeric message id that is incremented on each new message. Because each tremor instance keeps its own counter, ids from different instances can collide. To better guarantee uniqueness when multiple tremor instances are sending chunked GELF messages to the same receiver, we should choose a different scheme for the id.

Also, if the total message size is less than the individual chunk size, we can choose to send the message unchunked -- the gelf protocol supports that. For small-enough messages, this will eliminate the chunking overhead both on the client and server side.

Describe the solution you'd like

For the message id, the GELF docs suggest generating it from a combination of hostname and timestamp.
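A minimal sketch of what such an id could look like, assuming the 8-byte message id that GELF chunk headers carry. The function names `message_id` and `now_millis` are hypothetical and not part of tremor, and hashing with `DefaultHasher` is just one way to fold the two inputs into 8 bytes, not what the GELF docs or the official clients prescribe:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::time::{SystemTime, UNIX_EPOCH};

/// Hypothetical helper: folds a hostname and a millisecond timestamp
/// into the 8 bytes used as the GELF chunk message id.
fn message_id(hostname: &str, timestamp_millis: u128) -> [u8; 8] {
    let mut hasher = DefaultHasher::new();
    hostname.hash(&mut hasher);
    timestamp_millis.hash(&mut hasher);
    // finish() yields a u64, which is exactly 8 bytes.
    hasher.finish().to_be_bytes()
}

/// Hypothetical helper: milliseconds since the Unix epoch,
/// falling back to 0 if the system clock is set before the epoch.
fn now_millis() -> u128 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_millis())
        .unwrap_or(0)
}
```

A caller would use `message_id(&hostname, now_millis())` once per message and reuse the result for every chunk of that message. Two messages sent by the same host within the same millisecond would still get the same id here, so mixing in the existing per-message counter as a third input may be worth considering.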

If the length of the data that the postprocessor gets is less than the chunk size, send the data as is (i.e. do not apply the chunking logic).
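A rough sketch of that pass-through, not the postprocessor's actual code. It assumes `chunk_size` is the maximum datagram size including the 12-byte GELF chunk header (2 magic bytes, 8-byte message id, sequence number, sequence count), and the function name `encode_gelf` and its error type are made up for illustration:

```rust
const GELF_MAGIC: [u8; 2] = [0x1e, 0x0f];
const HEADER_LEN: usize = 12; // magic (2) + message id (8) + seq number (1) + seq count (1)
const MAX_CHUNKS: usize = 128; // GELF allows at most 128 chunks per message

/// Hypothetical encoder: returns the datagrams to send for one message.
fn encode_gelf(
    data: &[u8],
    message_id: [u8; 8],
    chunk_size: usize,
) -> Result<Vec<Vec<u8>>, String> {
    // Small message: send it as is, without any chunk header.
    if data.len() < chunk_size {
        return Ok(vec![data.to_vec()]);
    }
    if chunk_size <= HEADER_LEN {
        return Err(format!("chunk size {chunk_size} leaves no room for payload"));
    }
    // Split the payload so that header + payload fits into chunk_size.
    let payload_len = chunk_size - HEADER_LEN;
    let parts: Vec<&[u8]> = data.chunks(payload_len).collect();
    if parts.len() > MAX_CHUNKS {
        return Err(format!("message needs {} chunks, max is {MAX_CHUNKS}", parts.len()));
    }
    let count = parts.len() as u8;
    Ok(parts
        .iter()
        .enumerate()
        .map(|(seq, payload)| {
            let mut datagram = Vec::with_capacity(HEADER_LEN + payload.len());
            datagram.extend_from_slice(&GELF_MAGIC);
            datagram.extend_from_slice(&message_id);
            datagram.push(seq as u8); // sequence number, starting at 0
            datagram.push(count); // total number of chunks
            datagram.extend_from_slice(payload);
            datagram
        })
        .collect())
}
```

With a chunk size of 20, a 4-byte message comes back as a single unmodified datagram, while a 30-byte message is split into four chunks carrying 8, 8, 8 and 6 payload bytes.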

Notes

Relevant implementations from the official java client:

https://github.com/Graylog2/gelfclient/blob/gelfclient-1.5.1/src/main/java/org/graylog2/gelfclient/encoder/GelfMessageChunkEncoder.java#L82-L91

https://github.com/Graylog2/gelfclient/blob/gelfclient-1.5.1/src/main/java/org/graylog2/gelfclient/encoder/GelfMessageChunkEncoder.java#L131-L134

PHP client implementation for the same: https://github.com/bzikarsky/gelf-php/blob/1.7.0/src/Gelf/Transport/UdpTransport.php#L88-L96

rameels commented 1 year ago

Hi @darach could I please be assigned this issue? Decided to pick it up as a way of learning Rust. 😄

mfelsche commented 1 year ago

@rameels You are hereby assigned! Thanks for your initiative. Feel free to drop any questions you might have here or on our discord: https://chat.tremor.rs/