lud closed this 2 years ago
When you run the test you need to use {queue_buffering_overflow_strategy, block_calling_process}.
The erlkaf producer works as follows:
It buffers messages for a certain timeframe (5 ms by default, configured via queue_buffering_max_ms), up to the following limits: queue_buffering_max_messages or queue_buffering_max_kbytes. It then sends them in batches for better throughput.
In case the queue overflows, it has 3 methods of handling the situation (queue_buffering_overflow_strategy):
- block_calling_process - blocks the calling process until the message can be queued again
- local_disk_queue (default) - queues the events on the local disk and flushes them when the memory queue has space again. For example, if the broker goes down you don't lose the messages. They are stored on the local disk and flushed when the broker comes back online.
- drop_records - messages are dropped if the queue is full
Note: The memory buffer queue is shared by all topics and partitions.
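As a rough illustration, the options above could be combined in a producer config along these lines. This is only a sketch: the client id (client_producer), the broker address, and the two limit values are placeholders chosen for the example, not recommendations or documented defaults.

```erlang
%% Sketch of a producer config using the options discussed above.
%% client_producer, the broker address and the limit values are
%% placeholders (assumptions), not documented defaults.
ProducerConfig = [
    {bootstrap_servers, <<"localhost:9092">>},
    %% flush the in-memory buffer every 5 ms (the default mentioned above)
    {queue_buffering_max_ms, 5},
    %% limits on the in-memory buffer (example values only)
    {queue_buffering_max_messages, 100000},
    {queue_buffering_max_kbytes, 1048576},
    %% block the caller instead of spilling to disk or dropping records
    {queue_buffering_overflow_strategy, block_calling_process}
],
ok = erlkaf:create_producer(client_producer, ProducerConfig).
```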
Kind regards, Silviu
Hi @silviucpp ,
Thank you for your answer.
If I understand well, queue_buffering_overflow_strategy applies when the memory queue is full, and block_calling_process makes the calling process wait until there is room in the queue.
But if I want the process to block until Kafka acknowledges the production of the message, that is, until the memory queue has been flushed and the message actually produced, then there is no built-in support for that, and I should use receive to wait for the delivery report, right?
No! And in my opinion there is no real use case for what you want in the real world. Maybe I'm wrong. Doing this makes it very hard to scale.
"No, there is no support for that", or "no, I should not use receive"? Both, I guess :)
My use case is that I want to be sure that some event has been stored in Kafka before moving on, because that event matters and the producing function will not be called again, so it should crash if it cannot produce to Kafka.
What would you do in that case?
You can install the delivery callback for errors only, and you will receive there all the messages that failed to be sent. Once you receive such an event you can store it somewhere else and resend it.
Otherwise you need to block your calling process using "receive" to wait for the delivery callback for that specific message. But don't expect high scalability with this kind of approach. You might be better off rethinking your logic.
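A minimal sketch of the "receive" approach might look like the following. It is not part of erlkaf: the module name, the {delivery_report, Key, Status} message shape, and the idea that your delivery callback forwards each report to the waiting process are all assumptions made for illustration. It also assumes delivery reports are enabled for successful sends, not only for errors.

```erlang
-module(sync_produce_sketch).
-export([produce_and_wait/3]).

%% Hypothetical helper, not an erlkaf API.
%% ProduceFun is a zero-arity fun wrapping the actual produce call,
%% e.g. fun() -> erlkaf:produce(ClientId, Topic, Key, Value) end.
%% Assumption: your delivery callback sends
%% {delivery_report, Key, Status} to the process waiting here.
produce_and_wait(ProduceFun, Key, TimeoutMs) when is_function(ProduceFun, 0) ->
    ok = ProduceFun(),
    receive
        %% Key is already bound, so only the report for this message matches
        {delivery_report, Key, ok} ->
            ok;
        {delivery_report, Key, Status} ->
            {error, Status}
    after TimeoutMs ->
        %% no report arrived in time; the caller can decide to crash
        {error, timeout}
    end.
```

Matching with `ok = sync_produce_sketch:produce_and_wait(...)` makes the caller crash on failure or timeout. Note that a report arriving after the timeout stays in the caller's mailbox, and that each caller handles one message at a time, which is why this approach scales poorly.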
In theory, as I told you, if a message cannot be pushed to Kafka it is stored in memory, and when Kafka comes back online it is pushed. If memory fills up too fast, messages are stored on disk.
Silviu
Thank you for the clarifications.
I guess I should just rely on erlkaf then. I did not know that it would keep the queue when disconnected. That's great.
Cheers!
Hi,
I found here that you are comparing erlkaf:produce/4 to brod:produce_sync/5. But if I understand well, erlkaf does not produce synchronously, but rather returns immediately and sends a delivery report back. Please correct me if that is wrong. If it is correct, is there special support for blocking until the message is delivered? Or should I just wait for the delivery report?
Thank you.