Open jzvelc opened 12 months ago
Any news on this one? We really need some kind of workaround for first-publish latency, as latencies of 1s are totally unacceptable. Imagine running hundreds of processes serving HTTP requests. Some endpoints that publish Kafka messages are called less frequently and are often handled by different processes. In practice this means we often hit the 1s publish latency. With frequent deployments, processes are restarted, which makes this even worse.
Any update on this issue (or a workaround)? We are hitting this issue in production quite often because Python WSGI processes are recreated, and every time the first message for each topic is really slow, which negatively impacts the REST API latencies.
`list_topics` uses the Metadata API of librdkafka internally, which doesn't cache topic information when all topics are requested. I understand your concern, but caching all the topics is not a good option as there might be a lot of topics in the cluster. A Metadata call for specific topics caches the information for those topics, so it is readily available for the producer to use. Can you call `list_topics` with a specific topic before calling `produce`?
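For reference, the per-topic suggestion above could look roughly like the following. This is only a minimal sketch: `prefetch_topic_metadata` is a hypothetical helper name, the 5 second timeout is an arbitrary choice, and `producer` is assumed to be a `confluent_kafka.Producer` (the helper only relies on `list_topics(topic=..., timeout=...)` and on the `topics` / `error` attributes of the returned metadata). It issues one Metadata request per topic, which is exactly the cost discussed in this thread.

```python
from typing import Dict, Iterable


def prefetch_topic_metadata(producer, topics: Iterable[str],
                            timeout: float = 5.0) -> Dict[str, str]:
    """Request metadata for each topic individually before the first produce().

    `producer` is assumed to be a confluent_kafka.Producer; any object exposing
    list_topics(topic=..., timeout=...) works. Returns {topic: error text} for
    topics whose metadata could not be fetched (empty dict means all succeeded).
    """
    failures: Dict[str, str] = {}
    for topic in topics:
        try:
            # One Metadata request per topic (not a single request for all topics).
            metadata = producer.list_topics(topic=topic, timeout=timeout)
        except Exception as exc:  # e.g. a KafkaException on timeout
            failures[topic] = str(exc)
            continue
        topic_md = metadata.topics.get(topic)
        if topic_md is None:
            failures[topic] = "topic missing from metadata response"
        elif topic_md.error is not None:
            failures[topic] = str(topic_md.error)
    return failures


# Hypothetical usage at process start-up (topic name taken from this issue):
#   from confluent_kafka import Producer
#   producer = Producer({'bootstrap.servers': '127.0.0.1:29093'})
#   failed = prefetch_topic_metadata(producer, ['topic-to-produce-to'])
```

Returning the failures instead of raising lets a service start even if one topic is temporarily unavailable; whether that is the right trade-off depends on the application.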
Calling `list_topics` with a specific topic works as intended and caches the topic. However, I don't want to call this for every topic as there would be too many requests. I would expect a single `list_topics(topic=None)` call to cache metadata for all topics, the same way it is done in the golang implementation.
Have you tried the latest version of the golang client, and can you confirm that it caches all the topics? AFAIK it shouldn't.
Any news on this topic? Is there any way to solve this problem without sacrificing CPU?
The listTopics (e.g. partitionsFor) method doesn't seem to work anymore
Description
We have hit the same issue as described in https://github.com/confluentinc/confluent-kafka-dotnet/issues/701. The first publish to a topic takes ~1s. The workaround of prefetching metadata works in golang but doesn't work in Python.
`producer_instance.list_topics(topic=None)` doesn't cache topic metadata. Interestingly, explicitly specifying the topic in the `list_topics` call, e.g. `producer_instance.list_topics(topic='topic-to-produce-to')`, works as expected and properly caches metadata, which helps with the first produce call. We would like to make a single `list_topics` call to prefetch metadata for all topics to avoid a large number of requests to the broker.
How to reproduce
Delay occurs at:
DEBUG: WAKEUP [rdkafka#producer-1] [thrd:app]: 127.0.0.1:29093/1: Wake-up: flushing
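The original report does not include a reproduction script, so the following is only a sketch of how the first-publish delay could be measured. `time_first_delivery` is a hypothetical helper, the payload and timeout are arbitrary, and `producer` is assumed to be a `confluent_kafka.Producer` (only `produce(..., on_delivery=...)` and `flush(timeout)` are used). It measures the time from `produce()` until the delivery callback fires.

```python
import time


def time_first_delivery(producer, topic: str, payload: bytes = b"ping",
                        flush_timeout: float = 10.0) -> float:
    """Return the seconds elapsed between produce() and the delivery callback.

    Raises RuntimeError if delivery fails or no callback fires within
    `flush_timeout` seconds.
    """
    result = {}
    start = time.monotonic()

    def on_delivery(err, msg):
        # Called from flush()/poll() once the broker has answered.
        result["elapsed"] = time.monotonic() - start
        result["err"] = err

    producer.produce(topic, value=payload, on_delivery=on_delivery)
    producer.flush(flush_timeout)

    if "elapsed" not in result:
        raise RuntimeError(f"no delivery report for {topic!r} within {flush_timeout}s")
    if result["err"] is not None:
        raise RuntimeError(f"delivery to {topic!r} failed: {result['err']}")
    return result["elapsed"]


# Hypothetical usage (configuration and topic name taken from this issue):
#   from confluent_kafka import Producer
#   producer = Producer({'bootstrap.servers': '127.0.0.1:29093'})
#   print(time_first_delivery(producer, 'topic-to-produce-to'))
```

Running this once in a fresh process, and once in a fresh process that first calls `list_topics(topic='topic-to-produce-to')`, should show whether the prefetch changes the first-message latency in a given setup.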
Checklist
Please provide the following information:
confluent-kafka-python and librdkafka version (`confluent_kafka.version()` and `confluent_kafka.libversion()`): Tested with 2.2.0 and 2.3.0, e.g. ('2.3.0', 33751040) ('2.3.0', 33751295)
Client configuration: {'debug': 'all', 'bootstrap.servers': '127.0.0.1:29093'}
Example 2
2023-11-23 16:25:54,715 DEBUG: WAKEUP [rdkafka#producer-1] [thrd:app]: 127.0.0.1:29093/1: Wake-up: flushing
2023-11-23 16:25:54,715 DEBUG: TOPPAR [rdkafka#producer-1] [thrd:127.0.0.1:29093/bootstrap]: 127.0.0.1:29093/1: topic-to-produce-to [4] 1 message(s) in xmit queue (1 added from partition queue)
2023-11-23 16:25:54,715 DEBUG: WAKEUP [rdkafka#producer-1] [thrd:app]: Wake-up sent to 1 broker thread in state >= UP: flushing
2023-11-23 16:25:54,715 DEBUG: NEWPID [rdkafka#producer-1] [thrd:127.0.0.1:29093/bootstrap]: topic-to-produce-to [4] changed PID{Invalid} -> PID{Id:6047,Epoch:0} with base MsgId 1
2023-11-23 16:25:54,715 DEBUG: RESETSEQ [rdkafka#producer-1] [thrd:127.0.0.1:29093/bootstrap]: topic-to-produce-to [4] resetting epoch base seq from 0 to 1
2023-11-23 16:25:54,715 DEBUG: PRODUCE [rdkafka#producer-1] [thrd:127.0.0.1:29093/bootstrap]: 127.0.0.1:29093/1: topic-to-produce-to [4]: Produce MessageSet with 1 message(s) (579 bytes, ApiVersion 7, MsgVersion 2, MsgId 1, BaseSeq 0, PID{Id:6047,Epoch:0}, snappy)
2023-11-23 16:25:54,716 DEBUG: SEND [rdkafka#producer-1] [thrd:127.0.0.1:29093/bootstrap]: 127.0.0.1:29093/1: Sent ProduceRequest (v7, 645 bytes @ 0, CorrId 5)
2023-11-23 16:25:54,734 DEBUG: RECV [rdkafka#producer-1] [thrd:127.0.0.1:29093/bootstrap]: 127.0.0.1:29093/1: Received ProduceResponse (v7, 63 bytes, CorrId 5, rtt 25.46ms)
2023-11-23 16:25:54,734 DEBUG: MSGSET [rdkafka#producer-1] [thrd:127.0.0.1:29093/bootstrap]: 127.0.0.1:29093/1: topic-to-produce-to [4]: MessageSet with 1 message(s) (MsgId 1, BaseSeq 0) delivered
2023-11-23 16:25:54,734 INFO: Message users.UserRegistered[4e77ce7d-ccd2-4ede-9db4-f7aa289502b3] delivered to topic-to-produce-to[4]@0