Open Abacn opened 1 year ago
Under umbrella issue #22303
Kind ping. What is status of this issue?
@aromanenko-dev unfortunately I do not have a good idea for short term fix. Throttling detection is considered as a long term solution, that is the IO connector has can detect throttling and make runner aware, then runner can prevent scaling up and possibly scaling down.
@Abacn Yes, throttling detection is a general problem across all IO connectors and runners. So, should we close this one then?
What happened?
Found by Kafka performance test failing after #24879 in. This is because we removed shuffle=appliance there and then read from shuffle becomes faster. However, the producer is unable to digests data within timeout (can be seen by the flooding warning log of
send failed : 'Expiring 148 record(s) for beam-sdf-0:130675 ms has passed since batch creation'
). However, the message itself does not contain a timestamp and should be tolerant to throttling.The error happens because producer.send is asynchronous. It adds a system timestamp when .send gets called and returns immediately. We need a mechanism to prevent overwhelming Kafka producer.
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components