pof-declaneaston opened this issue 1 year ago
I tried running the pipeline with num_workers = 3, which worked well for a while, but eventually the runner scaled it back down to 1 worker and the system was unable to keep up.
I realized that the autoscaler can actually be disabled, so I have a workaround for the issue, but I would definitely prefer a cleaner solution.
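For anyone hitting the same problem, the workaround of disabling the autoscaler can be expressed through Dataflow pipeline options. A sketch, assuming a Python launch script; the script, project, and region names are placeholders:

```shell
# Pin the worker count by turning horizontal autoscaling off:
# --autoscaling_algorithm=NONE tells the Dataflow runner not to
# resize the worker pool, and --num_workers fixes it at 3.
python my_pipeline.py \
  --runner=DataflowRunner \
  --project=my-project \
  --region=us-central1 \
  --autoscaling_algorithm=NONE \
  --num_workers=3
```

With autoscaling off, the misleading backlog metric no longer shrinks the worker pool, at the cost of paying for 3 workers even when the topic is quiet.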
What happened?
Hello,
I have built a Dataflow Python pipeline that consumes from and produces to Kafka. When I run the pipeline, I notice that the backlog metric in the Dataflow job UI stays at or near 0 seemingly forever. This is incorrect: my Kafka metrics show that the consumer group's lag is constantly increasing, so the Dataflow job is not keeping up with the topic. Because the reported backlog stays low, the job never scales up and never catches up. I believe Java pipelines using KafkaIO correctly take consumer lag into account when tracking backlog.
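For context, the lag that Kafka metrics report is, per partition, the log-end offset minus the consumer group's committed offset; the backlog signal the runner scales on should track the same quantity. A minimal illustration of that arithmetic (a hypothetical helper, not Beam or KafkaIO code):

```python
def consumer_group_lag(end_offsets, committed_offsets):
    """Total consumer lag across partitions.

    Both arguments map partition id -> offset. A partition with no
    committed offset is treated as fully unconsumed from offset 0.
    """
    lag = 0
    for partition, end in end_offsets.items():
        committed = committed_offsets.get(partition, 0)
        lag += max(0, end - committed)
    return lag

# Partition 0 is 200 records behind, partition 1 is caught up.
# If the consumer keeps falling behind, this number keeps growing
# even while the Dataflow UI reports a backlog near 0.
print(consumer_group_lag({0: 1200, 1: 800}, {0: 1000, 1: 800}))  # → 200
```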
My pipeline is running v2.44.0. I would provide something to repro with, but I don't know how you would access a Kafka cluster to reproduce against.
Thanks a lot, Declan
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components