StarRocks / starrocks

The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
https://starrocks.io
Apache License 2.0

Hudi catalog improvement #46975

Open gohalo opened 5 months ago

gohalo commented 5 months ago

Enhancement

This issue tracks Hudi-related optimizations, including metadata and sink.

- [ ] Refactor the Hudi table implementation.
- [ ] Refactor the current remote file I/O implementation.
- [ ] Support incremental queries.
- [ ] Maintain the Hudi catalog without Hive.
- [ ] Continuously optimize performance.
- [ ] Support insert.
- [ ] Support the metadata table.

alberttwong commented 4 months ago

I would add

Specifically, support the following types of queries:

- COW Snapshot Queries
- COW Incremental Queries
- COW Incremental Queries (CDC)
- COW Bootstrap Queries
- MOR Snapshot Queries
- MOR Read-Optimized Queries
- MOR Incremental Queries
- MOR Incremental Queries (CDC)
- MOR Bootstrap Queries (RO)
- MOR Bootstrap Queries (snapshot)
- Time travel
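
For reference, here is a rough sketch of how the query types above are usually selected in Hudi's own Spark datasource, since this issue does not define any StarRocks syntax for them. The helper `hudi_read_options` is hypothetical and only builds a dictionary of read options. The option keys are the ones I recall from the Hudi Spark datasource docs, so please check them against the Hudi version you target (the CDC format option in particular needs a recent Hudi release and a CDC-enabled table).

```python
from typing import Dict, Optional

# Query types accepted by Hudi's Spark datasource (assumed from the Hudi docs).
QUERY_TYPES = ("snapshot", "read_optimized", "incremental")


def hudi_read_options(
    query_type: str,
    begin_instant: Optional[str] = None,
    end_instant: Optional[str] = None,
    as_of_instant: Optional[str] = None,
    cdc: bool = False,
) -> Dict[str, str]:
    """Hypothetical helper: build Hudi read options for one query type."""
    if query_type not in QUERY_TYPES:
        raise ValueError(f"unknown query type: {query_type!r}")

    opts = {"hoodie.datasource.query.type": query_type}

    if query_type == "incremental":
        # Incremental reads need a starting commit instant.
        if begin_instant is None:
            raise ValueError("incremental queries require begin_instant")
        if as_of_instant is not None:
            raise ValueError("time travel does not apply to incremental queries")
        opts["hoodie.datasource.read.begin.instanttime"] = begin_instant
        if end_instant is not None:
            opts["hoodie.datasource.read.end.instanttime"] = end_instant
        if cdc:
            # Return change records instead of the latest row images.
            opts["hoodie.datasource.query.incremental.format"] = "cdc"
        return opts

    if cdc:
        raise ValueError("cdc only applies to incremental queries")
    if as_of_instant is not None:
        # Time travel: read the table as of a past commit instant.
        opts["as.of.instant"] = as_of_instant
    return opts


# Example: MOR read-optimized, incremental CDC, and time travel.
ro_opts = hudi_read_options("read_optimized")
cdc_opts = hudi_read_options("incremental", begin_instant="20240101000000", cdc=True)
tt_opts = hudi_read_options("snapshot", as_of_instant="20240101000000")
```

With PySpark and the Hudi bundle on the classpath, a dictionary like this would typically be passed as `spark.read.format("hudi").options(**opts).load(base_path)`. The COW/MOR distinction is a property of the table itself, not a read option, which is why it does not appear in the helper.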