fluent / fluent-logger-python

A structured logger for Fluentd (Python)
http://fluentd.org/

Support `chunk` option to enable "at-least-once" delivery #120

Open fujimotos opened 6 years ago

fujimotos commented 6 years ago

Background

According to the "Forward Protocol Specification v1", fluentd supports an option named `chunk`, which enables at-least-once delivery of messages.

This option is very useful in cases where data loss is not acceptable.

https://github.com/fluent/fluentd/wiki/Forward-Protocol-Specification-v1#option
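For illustration, here is a minimal sketch of what the `chunk` option looks like on the client side. It assumes the behaviour described in the linked specification: the client attaches a Base64-encoded 128-bit ID as `chunk`, and the server replies with a map whose `ack` value echoes that ID. The helper names (`new_chunk_id`, `build_message`, `is_acked`) are hypothetical and not part of fluent-logger-python, and the MessagePack encoding of the message is left out.

```python
import base64
import os


def new_chunk_id():
    """Return a Base64 string encoding a random 128-bit ID (hypothetical helper)."""
    return base64.b64encode(os.urandom(16)).decode("ascii")


def build_message(tag, timestamp, record, chunk_id):
    """Build a Message-mode entry: [tag, time, record, option].

    The option map carries the chunk ID that the server is expected to echo back.
    """
    return [tag, timestamp, record, {"chunk": chunk_id}]


def is_acked(response, chunk_id):
    """Return True if the decoded server response acknowledges this chunk."""
    return isinstance(response, dict) and response.get("ack") == chunk_id
```

A sender would keep the message around until `is_acked()` returns True for its chunk ID, and resend it otherwise.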

The problem

The current design of fluent-logger-python, however, makes it difficult to support this new option. Specifically:

  1. Events are buffered inside the FluentSender class as a single byte sequence (self.pendings). There is no efficient way to reconstruct a specific event from the buffer and resend it.
  2. This byte-sequence buffer is effectively part of the public API. We cannot casually modify the format in which FluentSender buffers messages, or it will break many user-defined buffer_overflow_handlers.
  3. For now, we also lack several building blocks needed to support "at-least-once" semantics. For example, there is no reliable mechanism for users to tell whether a message has been delivered successfully [^]
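To make points 1 and 3 concrete, here is a rough sketch of one possible building block: a buffer that tracks each chunk separately instead of concatenating everything into one byte sequence. The class `PendingChunks` and its methods are hypothetical and not part of fluent-logger-python. They only illustrate the kind of bookkeeping that per-chunk acknowledgement and resending would need.

```python
import time
from collections import OrderedDict


class PendingChunks:
    """Hypothetical per-chunk buffer; not part of fluent-logger-python."""

    def __init__(self, ack_timeout=5.0):
        self.ack_timeout = ack_timeout
        # chunk_id -> (payload, sent_at), kept in send order
        self._pending = OrderedDict()

    def add(self, chunk_id, payload, now=None):
        """Remember a chunk that was sent but is not yet acknowledged."""
        sent_at = time.monotonic() if now is None else now
        self._pending[chunk_id] = (payload, sent_at)

    def ack(self, chunk_id):
        """Drop an acknowledged chunk; return True if it was pending."""
        return self._pending.pop(chunk_id, None) is not None

    def expired(self, now=None):
        """Return (chunk_id, payload) pairs whose ack timeout has elapsed."""
        now = time.monotonic() if now is None else now
        return [
            (chunk_id, payload)
            for chunk_id, (payload, sent_at) in self._pending.items()
            if now - sent_at >= self.ack_timeout
        ]

    def __len__(self):
        return len(self._pending)
```

With a structure like this, a message counts as delivered only once `ack()` has been called for its chunk, and anything returned by `expired()` can be resent individually.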

So we need to ...

The bottom line is that we need to make some architectural changes before this library can support the (newly introduced) "at-least-once" semantics. Of course, we need to do so without breaking existing programs.

What do you think about this? Or is there already a plan to make this library compliant with the v1 specification?


[^] Yes, FluentSender.emit() is supposed to notify this via its return value. But even if the method returns False, the message might be delivered anyway through the pending buffer, and this "retry" part is totally opaque to users.

arcivanov commented 6 years ago

Related to #77, namely https://github.com/fluent/fluent-logger-python/issues/77#issuecomment-348924119

arcivanov commented 6 years ago

> Events are buffered inside FluentSender class as a single bytes sequence (self.pendings).

Yep, I'll be fixing this.

arcivanov commented 6 years ago

> And this bytes sequence buffer is kinda API. So we cannot modify the format in which FluentSender buffers messages (at least, casually) or it will break many user-defined buffer_overflow_handlers.

Well, that's what semver is for. If we bump to 1.0.0, we can definitely introduce breaking changes.