Closed: qrilka closed this issue 9 years ago
I can see how adding a max_downtime_buffer_size option can help.

Integrating a test for this will not be trivial, but I have something in mind. I'll give you a branch with the fix anyway.
Thanks
The option ekaf_max_downtime_buffer_size has been implemented in https://github.com/helpshift/ekaf/commit/416d3004533c277938c25da12e250328be5e31ad, with tests added. Merged into master and pushed as tag https://github.com/helpshift/ekaf/releases/tag/1.5
You can set the value like so:

```erlang
application:set_env(ekaf, ekaf_max_downtime_buffer_size, 5)
```
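In case it helps, here is a minimal sketch of where that call fits in a startup sequence. The broker address and topic name are placeholders, not values from this thread:

```erlang
%% Minimal sketch: set the cap before starting ekaf so that buffering
%% during a broker outage is bounded. Broker/topic values are placeholders.
application:set_env(ekaf, ekaf_bootstrap_broker, {"localhost", 9092}),
application:set_env(ekaf, ekaf_max_downtime_buffer_size, 5),
{ok, _} = application:ensure_all_started(ekaf),
ekaf:produce_async(<<"some_topic">>, <<"hello">>).
```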
You can subscribe to this event (e.g. for alerting) by adding a callback like:

```erlang
-include("ekaf_definitions.hrl").

application:set_env(ekaf, ?EKAF_CALLBACK_MAX_DOWNTIME_BUFFER_REACHED, {?MODULE, callback})
```
See ekaf_demo.erl and test/ekaf_tests.erl for more on callbacks and ekaf options.
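For reference, the registered callback module might look roughly like this. This is a sketch modelled on the five-argument callback shape used in ekaf_demo.erl; the module name and the logging inside it are illustrative, not part of ekaf's API:

```erlang
-module(my_ekaf_alerts).
-include("ekaf_definitions.hrl").
-export([callback/5]).

%% Sketch of an alerting callback, modelled on the 5-arity callbacks
%% (Event, From, StateName, State, Extra) seen in ekaf_demo.erl.
%% Here we only log a warning when the downtime buffer cap is hit.
callback(?EKAF_CALLBACK_MAX_DOWNTIME_BUFFER_REACHED = Event, _From, _StateName, _State, Extra) ->
    error_logger:warning_msg("ekaf event ~p, extra: ~p~n", [Event, Extra]);
callback(_Event, _From, _StateName, _State, _Extra) ->
    ok.
```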
That should resolve the memory overflow problem, but because of the problems described in https://github.com/helpshift/ekaf/issues/6 we'll disable ekaf in our production.
I see max_buffer_size, but it has a different meaning than just a maximum buffer size (it is rather the maximum number of async messages to buffer). We had a Kafka outage and that resulted in our server crashing with OOM, because messages just kept accumulating in memory. Should we have some workaround in ekaf for this problem? I.e. some "total_buffer_size" limiting the number of messages kept in memory (ignoring further messages that arrive after hitting that limit). Normally Kafka should be very reliable, but in our system it is used only for logging, so it makes sense to keep working even if logging has problems (though a warning about that should be issued, of course). Any opinion on this?
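For what it's worth, the behaviour being asked for amounts to a cap-and-drop rule along these lines. This is a conceptual sketch of the requested semantics only, not ekaf's actual internals:

```erlang
%% Conceptual sketch: cap the in-memory queue at Max messages and
%% silently drop anything beyond it (ideally warning once when capped).
buffer_message(Msg, Queue, Len, Max) when Len < Max ->
    {queue:in(Msg, Queue), Len + 1};
buffer_message(_Msg, Queue, Len, _Max) ->
    {Queue, Len}.
```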