phaistos-networks / TANK

A very high performance distributed log service
Apache License 2.0

Consume responses - compression #54

Open markpapadakis opened 7 years ago

markpapadakis commented 7 years ago

We currently don't compress consume responses (i.e. message sets streamed from the server to the client), because we rely on sendfile(). Depending on the number of bytes we need to stream, though, it may be worth doing (read, compress, stream) instead: we would pay for the kernel-to-user-space copy and for the compression itself, but sending far fewer bytes could still make it a net win.

However, we should probably reserve this behaviour for when really large amounts of data are to be streamed (e.g. over 10MB). In that case we should rely on a pool of background threads to (read, compress) the data before handing it back to the main thread for streaming, because we really don't want to block the main thread: the read can block for longer than anticipated, and compression may take far longer than expected. Furthermore, we may need more elaborate heuristics for deciding when to do this, rather than relying only on the span/range of data to stream.
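To make the idea concrete, here is a rough C++ sketch of the size check and the background (read, compress) step. It is not TANK code. `kCompressThreshold`, `should_compress`, `rle_compress` and `read_and_compress_async` are hypothetical names. The run-length encoder is only a stand-in for a real codec. `std::async` starts one thread per task and stands in for a proper thread pool.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <future>
#include <utility>
#include <vector>

// Hypothetical threshold, taken from the "over 10MB" figure above.
constexpr std::size_t kCompressThreshold = 10 * 1024 * 1024;

// Hypothetical heuristic: looks only at the span of bytes to stream.
// A more elaborate one could also weigh CPU load, link speed, etc.
inline bool should_compress(std::size_t span_bytes) {
    return span_bytes > kCompressThreshold;
}

// Stand-in compressor (byte-wise run-length encoding), used here only so
// the sketch has no external dependencies. Output is a sequence of
// (count, byte) pairs with count in [1, 255].
std::vector<std::uint8_t> rle_compress(const std::vector<std::uint8_t>& in) {
    std::vector<std::uint8_t> out;
    for (std::size_t i = 0; i < in.size();) {
        std::size_t run = 1;
        while (i + run < in.size() && in[i + run] == in[i] && run < 255) ++run;
        out.push_back(static_cast<std::uint8_t>(run));
        out.push_back(in[i]);
        i += run;
    }
    return out;
}

// Runs (read, compress) off the calling thread. `read_range` is a
// hypothetical callback that reads the requested range of the log; it may
// block, which is why it is invoked on the background thread.
std::future<std::vector<std::uint8_t>>
read_and_compress_async(std::function<std::vector<std::uint8_t>()> read_range) {
    assert(read_range && "read_range must be callable");
    return std::async(std::launch::async,
                      [read_range = std::move(read_range)] {
                          return rle_compress(read_range());
                      });
}
```

The main thread would keep using sendfile() whenever `should_compress()` returns false. Otherwise it would hold on to the returned future and check it with `wait_for(std::chrono::seconds(0))` from its event loop, so it never blocks waiting for the result, and stream the compressed bytes once they are ready.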

This could be really beneficial when, for example, replaying a partition's worth of events.