doujiang24 / lua-resty-kafka

Lua kafka client driver for the Openresty based on the cosocket API
BSD 3-Clause "New" or "Revised" License

error_handle in async producer has duplicate items in queue #140

Open teejteej opened 2 years ago

teejteej commented 2 years ago

When using error_handle in the async producer, the queue ends up containing duplicate messages across different callbacks. In the example below, a new UUID is generated on every send, so every message body is unique.

When bringing down Kafka, it correctly calls the error_handle function. But the log shows that, for subsequent error_handle calls, the queue object contains previous items that were already part of another error_handle call; in some cases this happens more than once.

The end result is that individual items from the rescued queue end up multiple times, sometimes 4x, downstream in another recovery process.

Is this expected behavior? And if so, is there any way to find out which items in the queue have already been sent to the error handle, so that the duplicate ones can be removed?
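If the duplication cannot be avoided at the source, one workaround is to deduplicate inside the handler itself. A minimal sketch, assuming each message body is unique (as with the UUIDs above) and that the queue is the flat `[key1, msg1, key2, msg2, ...]` array the logs suggest; the `seen` table here is hypothetical and grows unboundedly, so production use would need eviction (e.g. a size cap or TTL):

```lua
-- Sketch: skip messages already handed to a previous error_handle call.
-- `seen` is module-level so it survives across callbacks; unbounded growth
-- is a deliberate simplification for this sketch.
local seen = {}

local function dedupe_queue(queue, index)
    local fresh, n = {}, 0
    -- walk two slots at a time: queue[i] is the key (may be nil),
    -- queue[i + 1] is the message body
    for i = 1, index, 2 do
        local key, msg = queue[i], queue[i + 1]
        if msg ~= nil and not seen[msg] then
            seen[msg] = true
            n = n + 1
            fresh[n] = key
            n = n + 1
            fresh[n] = msg
        end
    end
    -- return the deduplicated queue and its used-slot count
    return fresh, n
end
```

Calling this at the top of error_handle and recovering only the `fresh` items would drop the repeats, at the cost of holding every seen message body in memory.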

2022/06/10 21:51:37 [error] 17006#17006: *158 [lua] tracker.lua:51: Error handle: idx 4 partition_id 5 retryable true json [null,"{\"uuid\": \"uuid-1\"}",null,"{\"uuid\": \"uuid-2\"}"], context: ngx.timer, client: 192.168.0.1, server: 0.0.0.0:80

2022/06/10 21:51:59 [error] 17006#17006: *260 [lua] tracker.lua:51: Error handle: idx 2 partition_id 5 retryable true json [null,"{\"uuid\": \"uuid-1\"}"], context: ngx.timer, client: 192.168.0.1, server: 0.0.0.0:80

2022/06/10 21:52:01 [error] 17006#17006: *271 [lua] tracker.lua:51: Error handle: idx 4 partition_id 5 retryable true json [null,"{\"uuid\": \"uuid-2\"}",null,"{\"uuid\": \"uuid-5\"}"], context: ngx.timer, client: 192.168.0.1, server: 0.0.0.0:80
local producer_error_handle = function(topic, partition_id, queue, index, err, retryable)
    ngx.log(ngx.ERR, "Error handle: idx ", index, " partition_id ", partition_id,
            " retryable ", retryable, " json ", json.encode(queue))
end

local config = {
    --- for client_config
    socket_timeout = 3000, -- ms; should be larger than request_timeout
    keepalive_timeout = 15 * 1000, -- ms
    keepalive_size = 3,
    refresh_interval = 15000, -- ms

    --- for producer_config
    producer_type = "async",
    request_timeout = 2000, -- ms
    required_acks = 1,
    max_retry = 3,
    retry_backoff = 100, -- ms
    error_handle = producer_error_handle,

    -- Async buffer config (only for async producer):
    flush_time = 1000, -- ms
    batch_num = 200, -- max messages in batch
    batch_size = 1048576, -- bytes (1 MB)
    max_buffering = 50000 -- maximum messages to buffer
}

local kp = producer:new(broker_list, config)
local ok, err = kp:send("events", nil, json.encode({test_id = uuid()}))
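For reference, the log output above suggests the queue passed to error_handle is a flat array alternating message keys and message bodies, with index marking the last used slot (keys are nil here because send is called without a key). Under that assumption, a small helper to pull just the message bodies out of a queue might look like this sketch:

```lua
-- Sketch: extract message bodies from an error_handle queue, assuming the
-- flat [key1, msg1, key2, msg2, ...] layout seen in the logged JSON.
local function queue_messages(queue, index)
    local msgs = {}
    for i = 1, index, 2 do
        -- queue[i] is the key (nil in this repro), queue[i + 1] the body
        msgs[#msgs + 1] = queue[i + 1]
    end
    return msgs
end
```

A recovery handler could then iterate `queue_messages(queue, index)` and re-submit each body individually, rather than re-queueing the raw flat array.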