IBM / sarama

Sarama is a Go library for Apache Kafka.
MIT License
11.42k stars 1.75k forks source link

got signal: killed when produce massage with async producer #1681

Closed dannyferian closed 8 months ago

dannyferian commented 4 years ago
Versions
Sarama Kafka Go
v1.26.1 sarama.MaxVersion = V2_4_0_0 go version go1.13.8 darwin/amd64
Configuration

What configuration values are you using for Sarama and Kafka?

f, _ := os.OpenFile("tmp/kafka/logger.log", os.O_WRONLY|os.O_CREATE|os.O_APPEND, 0755)
sarama.Logger = log.New(f, "[sarama] ", log.LstdFlags)

kafkaConfig := sarama.NewConfig()
kafkaConfig.Version = sarama.MaxVersion
kafkaConfig.Producer.Return.Successes = true
kafkaConfig.Producer.Return.Errors = true
kafkaConfig.Consumer.Return.Errors = true
kafkaConfig.Net.WriteTimeout = 5 * time.Second

// idempotent producer
kafkaConfig.Producer.Idempotent = true
kafkaConfig.Producer.RequiredAcks = sarama.WaitForAll
kafkaConfig.Producer.Retry.Max = 0x7fffffff
kafkaConfig.Net.MaxOpenRequests = 1

if username != "" {
    kafkaConfig.Net.SASL.Enable = true
    kafkaConfig.Net.SASL.User = username
    kafkaConfig.Net.SASL.Password = password
}
Logs

When filing an issue please provide logs from Sarama and Kafka if at all possible. You can set sarama.Logger to a log.Logger to capture Sarama debug output.

logs: CLICK ME

``` [sarama] 2020/04/28 12:37:19 Initializing new client [sarama] 2020/04/28 12:37:19 ClientID is the default of 'sarama', you should consider setting it to something application-specific. [sarama] 2020/04/28 12:37:19 ClientID is the default of 'sarama', you should consider setting it to something application-specific. [sarama] 2020/04/28 12:37:19 client/metadata fetching metadata for all topics from broker 127.0.0.1:9092 [sarama] 2020/04/28 12:37:19 Connected to broker at 127.0.0.1:9092 (unregistered) [sarama] 2020/04/28 12:37:19 client/brokers registered new broker #0 at 192.168.100.143:9092 [sarama] 2020/04/28 12:37:19 Successfully initialized new client [sarama] 2020/04/28 12:37:19 Initializing new client [sarama] 2020/04/28 12:37:19 ClientID is the default of 'sarama', you should consider setting it to something application-specific. [sarama] 2020/04/28 12:37:19 ClientID is the default of 'sarama', you should consider setting it to something application-specific. [sarama] 2020/04/28 12:37:19 client/metadata fetching metadata for all topics from broker 127.0.0.1:9092 [sarama] 2020/04/28 12:37:19 Connected to broker at 127.0.0.1:9092 (unregistered) [sarama] 2020/04/28 12:37:19 client/brokers registered new broker #0 at 192.168.100.143:9092 [sarama] 2020/04/28 12:37:19 Successfully initialized new client [sarama] 2020/04/28 12:37:19 Initializing new client [sarama] 2020/04/28 12:37:19 ClientID is the default of 'sarama', you should consider setting it to something application-specific. [sarama] 2020/04/28 12:37:19 ClientID is the default of 'sarama', you should consider setting it to something application-specific. [sarama] 2020/04/28 12:37:19 client/metadata fetching metadata for all topics from broker 127.0.0.1:9092 [sarama] 2020/04/28 12:37:19 client/metadata fetching metadata for [ekyc.kafka.newlibrary.test] from broker 127.0.0.1:9092 [sarama] 2020/04/28 12:37:19 Connected to broker at 127.0.0.1:9092 (unregistered) [sarama] 2020/04/28 12:37:19 client/coordinator requesting coordinator for consumergroup ekyc-app-test from 127.0.0.1:9092 [sarama] 2020/04/28 12:37:19 client/brokers registered new broker #0 at 192.168.100.143:9092 [sarama] 2020/04/28 12:37:19 Successfully initialized new client [sarama] 2020/04/28 12:37:19 client/coordinator coordinator for consumergroup ekyc-app-test is #0 (192.168.100.143:9092) [sarama] 2020/04/28 12:37:19 ClientID is the default of 'sarama', you should consider setting it to something application-specific. [sarama] 2020/04/28 12:37:19 Connected to broker at 192.168.100.143:9092 (registered as #0) [sarama] 2020/04/28 12:37:19 Obtained a ProducerId: 10031 and ProducerEpoch: 0 [sarama] 2020/04/28 12:37:19 client/coordinator requesting coordinator for consumergroup ekyc-app-test from 127.0.0.1:9092 [sarama] 2020/04/28 12:37:19 client/coordinator coordinator for consumergroup ekyc-app-test is #0 (192.168.100.143:9092) [sarama] 2020/04/28 12:37:19 consumer/broker/0 added subscription to ekyc.kafka.newlibrary.test/1 [sarama] 2020/04/28 12:37:19 consumer/broker/0 added subscription to ekyc.kafka.newlibrary.test/2 [sarama] 2020/04/28 12:37:19 consumer/broker/0 added subscription to ekyc.kafka.newlibrary.test/0 [sarama] 2020/04/28 12:37:41 ClientID is the default of 'sarama', you should consider setting it to something application-specific. [sarama] 2020/04/28 12:37:42 producer/broker/0 starting up [sarama] 2020/04/28 12:37:42 producer/broker/0 state change to [open] on ekyc.kafka.newlibrary.test/2 [sarama] 2020/04/28 12:37:42 Connected to broker at 192.168.100.143:9092 (registered as #0) [sarama] 2020/04/28 12:38:16 producer/broker/0 state change to [open] on ekyc.kafka.newlibrary.test/1 ECHO LOG 2020/04/28 12:37:19 Server listen at http://0.0.0.0:1323 ____ __ / __/___/ / ___ / _// __/ _ \/ _ \ /___/\__/_//_/\___/ v3.3.10-dev High performance, minimalist Go web framework https://echo.labstack.com ____________________________________O/_______ O\ ⇨ http server started on [::]:1323 INFO[0000] setting up consumer INFO[0000] Sarama consumer up and running!... INFO[0000] consumer claim INFO[0000] consumer claim INFO[0000] consumer claim {"time":"2020-04-28T12:37:36.338335+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":6069113,"latency_human":"6.069113ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:41.962055+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":2891026,"latency_human":"2.891026ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:41.982568+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":2794140,"latency_human":"2.79414ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.001989+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":3446202,"latency_human":"3.446202ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.105099+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":4283041,"latency_human":"4.283041ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.120218+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":2530933,"latency_human":"2.530933ms","bytes_in":103,"bytes_out":5} INFO[0022] Message claimed: value = Hai, here's a new post for you! Title: Hallo semuanya Content: apanya yang copot? apa yaa? hayoo apa ayo?, timestamp = 2020-04-28 12:37:42.122 +0700 WIB, topic = ekyc.kafka.newlibrary.test INFO[0022] > message: "Hai, here's a new post for you! Title: Hallo semuanya Content: apanya yang copot? apa yaa? hayoo apa ayo?" sent to partition 2 at offset 49 {"time":"2020-04-28T12:37:42.145723+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":1717060,"latency_human":"1.71706ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.164368+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":4246022,"latency_human":"4.246022ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.186846+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":4090817,"latency_human":"4.090817ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.209035+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":2121015,"latency_human":"2.121015ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.233071+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":3509679,"latency_human":"3.509679ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.249652+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":1640665,"latency_human":"1.640665ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.275954+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":4442718,"latency_human":"4.442718ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.301015+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":5803233,"latency_human":"5.803233ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.324847+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":2051232,"latency_human":"2.051232ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.352326+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":2336463,"latency_human":"2.336463ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.370808+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":3920863,"latency_human":"3.920863ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.400217+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":2115734,"latency_human":"2.115734ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.425299+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":4211194,"latency_human":"4.211194ms","bytes_in":103,"bytes_out":5} {"time":"2020-04-28T12:37:42.443224+07:00","id":"","remote_ip":"127.0.0.1","host":"0.0.0.0:1323","method":"POST","uri":"/post","user_agent":"PostmanRuntime/7.24.1","status":200,"error":"","latency":2729672,"latency_human":"2.729672ms","bytes_in":103,"bytes_out":5} INFO[0057] > message: "Hai, here's a new post for you! Title: Hallo semuanya Content: apanya yang copot? apa yaa? hayoo apa ayo?" sent to partition 1 at offset 51 INFO[0057] Message claimed: value = Hai, here's a new post for you! Title: Hallo semuanya Content: apanya yang copot? apa yaa? hayoo apa ayo?, timestamp = 2020-04-28 12:38:16.987 +0700 WIB, topic = ekyc.kafka.newlibrary.test signal: killed ```

Problem Description

Background: I want to send a push notification to my user who subscribes to some Post Tag when someone is posting a blog with tags. In init, I start my user consumer and let the consumer wait for a message. And produce a message to a related topic when a new blog successfully stored in my DB. (with async producer)

Issue: I call my API many times, Sometimes, the producer does not produce all the message then, my application always receive a signal: killed from out of nowhere

dannyferian commented 4 years ago

I found the root cause. My application was crashed and it fixed when I remove the kafkaConfig.Producer.Retry.Max = 0x7fffffff. Can somebody tell me why this happened? And how much Producer.Retry.Max should be for the idempotent setting?

ghost commented 3 years ago

Thank you for taking the time to raise this issue. However, it has not had any activity on it in the past 90 days and will be closed in 30 days if no updates occur. Please check if the master branch has already resolved the issue since it was raised. If you believe the issue is still valid and you would like input from the maintainers then please comment to ask for it to be reviewed.

hindessm commented 1 year ago

I suspect it might be getting killed due to out of memory. The async producer creates a slice for retry state of length Producer.Retry.Max + 1 which for your config would be around 3GB so this might be why.

github-actions[bot] commented 10 months ago

Thank you for taking the time to raise this issue. However, it has not had any activity on it in the past 90 days and will be closed in 30 days if no updates occur. Please check if the main branch has already resolved the issue since it was raised. If you believe the issue is still valid and you would like input from the maintainers then please comment to ask for it to be reviewed.