I want to be able to partition a topic by unique value in one or more columns (this is commonly supported in Arrow/Parquet libraries).
Topics also need to support automatically partitioning by number of rows to avoid read performance degradation for very large datasets.
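To make the two requirements above concrete, here is a minimal pure-Python sketch of the intended semantics. It assumes rows are plain dicts; the names `partition_by_columns` and `split_by_row_count` are hypothetical and are not an existing API of any Arrow/Parquet library.

```python
from collections import defaultdict
from typing import Any, Dict, List, Sequence, Tuple

Row = Dict[str, Any]


def partition_by_columns(rows: Sequence[Row], columns: Sequence[str]) -> Dict[Tuple[Any, ...], List[Row]]:
    """Group rows by the unique combination of values found in `columns`."""
    partitions: Dict[Tuple[Any, ...], List[Row]] = defaultdict(list)
    for row in rows:
        # The partition key is the tuple of values in the chosen columns.
        key = tuple(row[c] for c in columns)
        partitions[key].append(row)
    return dict(partitions)


def split_by_row_count(rows: Sequence[Row], max_rows: int) -> List[List[Row]]:
    """Split rows into consecutive chunks of at most `max_rows` rows each."""
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    return [list(rows[i:i + max_rows]) for i in range(0, len(rows), max_rows)]


if __name__ == "__main__":
    rows = [
        {"region": "eu", "year": 2023, "value": 1},
        {"region": "eu", "year": 2023, "value": 2},
        {"region": "us", "year": 2023, "value": 3},
        {"region": "us", "year": 2024, "value": 4},
    ]
    # One partition per unique (region, year) pair.
    for key, part in partition_by_columns(rows, ["region", "year"]).items():
        print(key, len(part))
    # Automatic chunking: no chunk holds more than 3 rows.
    print([len(chunk) for chunk in split_by_row_count(rows, 3)])
```

The two could also be combined, for example by applying the row-count split inside each value-based partition so that a single very common value does not produce one oversized partition.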
It would also be nice to be able to specify partitions by non-unique values such as:
Ranges (e.g. for an integer column, specify partitions where max - min < 100)
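One possible reading of the range case is fixed-width bucketing: each integer value is assigned to the bucket starting at the nearest lower multiple of the width, which guarantees max - min < width within every partition. The sketch below assumes rows are plain dicts and an integer column; `partition_by_range` is a hypothetical name, not an existing library function.

```python
from collections import defaultdict
from typing import Any, Dict, List, Sequence

Row = Dict[str, Any]


def partition_by_range(rows: Sequence[Row], column: str, width: int = 100) -> Dict[int, List[Row]]:
    """Group rows into fixed-width buckets keyed by each bucket's lower bound.

    For integer values, every bucket covers [lower, lower + width - 1],
    so max - min < width holds inside each partition.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    partitions: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        # Floor division also places negative values in the correct bucket.
        lower = (row[column] // width) * width
        partitions[lower].append(row)
    return dict(partitions)


if __name__ == "__main__":
    rows = [{"id": v} for v in (5, 99, 100, 250, -1)]
    for lower, part in sorted(partition_by_range(rows, "id").items()):
        print(lower, [r["id"] for r in part])
```

Fixed-width buckets are only one option; a data-driven alternative would sort the column and start a new partition whenever the spread would reach the limit, at the cost of partition boundaries that depend on the data.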