Ekaf stores each prepare request in a dict and replies only once the worker-up messages arrive. In my production setup, which has many Kafka partitions, waiting for those messages takes long enough that the call hits its timeout and the caller process exits. There is also a small chance of missing the worker-up messages, because the ekaf_server state change logic is separate from the handling of worker-up messages.
First, I've changed prepare to return immediately, since producing a sync message to a non-prepared topic already performs a pick operation.
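To make the difference concrete, here is a minimal, self-contained sketch of the two reply strategies. It is not ekaf's code: the module `prepare_sketch`, its functions, and the `deferred`/`instant` modes are hypothetical names for illustration, and it uses a map where ekaf uses a dict.

```erlang
-module(prepare_sketch).
-behaviour(gen_server).

%% Hypothetical illustration only; none of these names exist in ekaf.
-export([start_link/1, prepare/3, worker_up/2]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% Mode is `deferred` (reply on worker-up) or `instant` (reply at once).
start_link(Mode) ->
    gen_server:start_link(?MODULE, Mode, []).

%% gen_server:call/3 exits the caller if no reply arrives within Timeout ms.
prepare(Pid, Topic, Timeout) ->
    gen_server:call(Pid, {prepare, Topic}, Timeout).

%% Simulates a worker announcing that it is ready for Topic.
worker_up(Pid, Topic) ->
    gen_server:cast(Pid, {worker_up, Topic}).

init(Mode) ->
    {ok, #{mode => Mode, waiting => #{}}}.

%% Deferred: remember the caller and do not reply yet.
handle_call({prepare, Topic}, From, #{mode := deferred, waiting := W} = State) ->
    Callers = maps:get(Topic, W, []),
    {noreply, State#{waiting := W#{Topic => [From | Callers]}}};
%% Instant: reply right away; worker selection is left to the produce path.
handle_call({prepare, _Topic}, _From, #{mode := instant} = State) ->
    {reply, ok, State}.

%% Answer every caller that was parked for this topic.
handle_cast({worker_up, Topic}, #{waiting := W} = State) ->
    [gen_server:reply(From, ok) || From <- maps:get(Topic, W, [])],
    {noreply, State#{waiting := maps:remove(Topic, W)}}.

handle_info(_Msg, State) ->
    {noreply, State}.
```

In `deferred` mode, a `prepare/3` call that sees no `worker_up/2` before its timeout exits the calling process, which is the failure described above. In `instant` mode the same call returns without waiting on any worker.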
Then I've added three small changes: an operational feature that purges messages when too many are buffered in memory; a timeout on connection, for fast recovery after a Kafka cluster restart or a network problem; and a bug fix for worker restart, which previously caused two reconnections on each connection failure.
We've run this version in production for about a month, and I think it's time to send these changes back. HTH.
Re-pull from https://github.com/helpshift/ekaf/pull/13