StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0

Support iceberg table compaction #48736

Open rohankrao opened 4 months ago

rohankrao commented 4 months ago

StarRocks supports writing to Iceberg tables. It would be nice if StarRocks also supported Iceberg table housekeeping, such as compaction.


alvin-celerdata commented 4 months ago

@rohankrao Thanks for this suggestion. Have you put StarRocks over Iceberg in production?

rohankrao commented 4 months ago

No, not in production. This feature would help us get there.


Dshadowzh commented 4 months ago

@rohankrao Which do you want: a fully managed compaction service, or just a compaction interface you can integrate into your own scheduling system?

rohankrao commented 4 months ago

I would prefer a manual compaction command like the one you have for native tables. I am writing into Iceberg from Kafka externally and can trigger compaction when needed. Instead of using Spark to do the housekeeping, I want to use StarRocks for that.

nqvuong1998 commented 2 months ago

It would be beneficial if StarRocks supported Iceberg table maintenance features such as optimization, expiring snapshots, and removing orphan files.
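
For context, the three maintenance tasks named above correspond to Iceberg's Spark procedures `rewrite_data_files`, `expire_snapshots`, and `remove_orphan_files`. This is the Spark-based housekeeping that the thread hopes to replace. The sketch below only builds the Spark SQL `CALL` statements as strings. The catalog name, table name, and cutoff timestamp are placeholders, and the helper `iceberg_maintenance_calls` is hypothetical. It does not show StarRocks syntax, since no StarRocks command for this is given in the thread.

```python
def iceberg_maintenance_calls(catalog, table, older_than):
    """Return Spark SQL CALL statements for routine Iceberg maintenance.

    catalog, table and older_than are caller-supplied placeholders;
    the procedure names are Iceberg's Spark procedures, not StarRocks commands.
    """
    return [
        # Compaction: rewrite many small data files into fewer, larger ones.
        f"CALL {catalog}.system.rewrite_data_files(table => '{table}')",
        # Expire snapshots older than the given timestamp.
        f"CALL {catalog}.system.expire_snapshots("
        f"table => '{table}', older_than => TIMESTAMP '{older_than}')",
        # Delete files no longer referenced by any table metadata.
        f"CALL {catalog}.system.remove_orphan_files(table => '{table}')",
    ]


if __name__ == "__main__":
    # Hypothetical catalog and table names, for illustration only.
    for stmt in iceberg_maintenance_calls(
        "my_catalog", "db.events", "2024-07-01 00:00:00"
    ):
        print(stmt)
```

In a Spark session with an Iceberg catalog configured, each string could be passed to `spark.sql(...)` from an external scheduler, which is the trigger-on-demand workflow described earlier in the thread.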