apache / druid

Apache Druid: a high performance real-time analytics database.
https://druid.apache.org/
Apache License 2.0

Amazon Kinesis Support Enhanced Fan-Out #15800

Open pauldheinrichs opened 9 months ago

pauldheinrichs commented 9 months ago

Description

Feel free to close this if there is a better place for it or if I'm mistaken, but I think the Kinesis ingestion could be improved to resolve the main "known issue" listed in the Druid Kinesis docs:

Before you deploy the Kinesis extension to production, consider the following known issues: Avoid implementing more than one Kinesis supervisor that reads from the same Kinesis stream for ingestion. Kinesis has a per-shard read throughput limit and having multiple supervisors on the same stream can reduce available read throughput for an individual supervisor's tasks.
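To make the contention concrete, here is a rough back-of-the-envelope sketch. It assumes the AWS-documented quota of 2 MB/s of read throughput per shard, shared by all polling consumers, and it assumes the supervisors split that budget evenly, which is a simplification. The function name is hypothetical and nothing here comes from Druid's code.

```python
# Assumption: AWS-documented shared read quota per shard (2 MB/s),
# shared by every consumer that polls the shard with GetRecords.
SHARD_READ_LIMIT_MB_S = 2.0


def per_consumer_read_mb_s(consumers: int, enhanced_fan_out: bool) -> float:
    """Estimate the read throughput one consumer can get from one shard.

    Hypothetical helper for illustration only. Assumes polling consumers
    split the shared quota evenly; real contention is less orderly.
    """
    if consumers < 1:
        raise ValueError("consumers must be at least 1")
    if enhanced_fan_out:
        # Each registered enhanced fan-out consumer gets its own dedicated pipe.
        return SHARD_READ_LIMIT_MB_S
    # Polling consumers share a single per-shard budget.
    return SHARD_READ_LIMIT_MB_S / consumers


if __name__ == "__main__":
    for n in (1, 2, 4):
        shared = per_consumer_read_mb_s(n, enhanced_fan_out=False)
        dedicated = per_consumer_read_mb_s(n, enhanced_fan_out=True)
        print(f"{n} supervisor(s): shared={shared} MB/s, enhanced fan-out={dedicated} MB/s")
```

Under these assumptions, two supervisors on the same stream would each see roughly half of a shard's read budget, while enhanced fan-out consumers would each keep the full amount.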

What does enhanced fan-out do?

A consumer that uses enhanced fan-out doesn't have to contend with other consumers who are receiving data from the stream.

Source Docs:

https://docs.aws.amazon.com/streams/latest/dev/building-enhanced-consumers-api.html
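For anyone unfamiliar with the API linked above, the call sequence looks roughly like the sketch below. It is written against a boto3-style Kinesis client purely for illustration. Druid itself uses the AWS Java SDK, and this is not a proposal for how the Druid change should be implemented. The function name, its polling parameters, and the choice of `LATEST` as the starting position are assumptions. `RegisterStreamConsumer`, `DescribeStreamConsumer`, and `SubscribeToShard` are the real Kinesis API operations.

```python
import time


def register_and_subscribe(client, stream_arn, consumer_name, shard_id,
                           poll_seconds=1.0, max_polls=60):
    """Register an enhanced fan-out consumer and subscribe it to one shard.

    `client` is expected to behave like boto3.client("kinesis").
    Hypothetical helper: the name and the polling parameters are illustrative.
    """
    # 1. Register a named consumer on the stream.
    consumer_arn = client.register_stream_consumer(
        StreamARN=stream_arn, ConsumerName=consumer_name
    )["Consumer"]["ConsumerARN"]

    # 2. A new consumer starts in CREATING; wait until it is ACTIVE.
    for _ in range(max_polls):
        description = client.describe_stream_consumer(ConsumerARN=consumer_arn)
        if description["ConsumerDescription"]["ConsumerStatus"] == "ACTIVE":
            break
        time.sleep(poll_seconds)
    else:
        raise TimeoutError(f"consumer {consumer_name} did not become ACTIVE")

    # 3. Subscribe to the shard; records are pushed over the event stream.
    #    Assumption: start from the newest records (LATEST).
    response = client.subscribe_to_shard(
        ConsumerARN=consumer_arn,
        ShardId=shard_id,
        StartingPosition={"Type": "LATEST"},
    )
    return response["EventStream"]
```

With a real client, iterating the returned stream yields events whose `SubscribeToShardEvent` entry carries the `Records`. A subscription only lasts a few minutes (up to five, per the AWS docs), so a real consumer would need to re-subscribe from the last continuation sequence number, which this sketch leaves out.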


I have not dug into the Kinesis consumer code enough to say how much work it would take to support this, but I thought posting it for visibility might be worthwhile. :shrug:

yurmix commented 2 months ago

This will require solving https://github.com/apache/druid/issues/16903