apache / amoro

Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
https://amoro.apache.org/
Apache License 2.0

[Feature]: Support aggregation keys for mixed-format tables #1359

Open zhoujinsong opened 1 year ago

zhoujinsong commented 1 year ago

Description

Data warehouses like Doris let users configure aggregation keys for tables. Rows that share the same key are automatically aggregated before being written to the data files. This pre-computation can greatly improve the analytical query performance of the table.

Arctic's Mixed-format design makes the rules for data merging extensible, so we can add an aggregation merge rule to handle such use cases.

Use case/motivation

Support defining aggregation fields on tables. Data written to these fields will be automatically pre-aggregated according to the configured aggregation function, which improves query performance.

Describe the solution

Extend the existing data merge rules with a new aggregation merge rule.
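To make the idea concrete, here is a minimal sketch of what an aggregation merge rule could look like next to a replace-style rule. It is only an illustration: `AggregationMergeSketch`, `Row`, `REPLACE`, `SUM`, and `merge` are hypothetical names, not existing Amoro/Arctic APIs, and the replace-style rule is assumed purely for contrast.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

// Hypothetical sketch: none of these names are Amoro/Arctic APIs.
final class AggregationMergeSketch {

    // A row reduced to its aggregation key and a single numeric value column.
    static final class Row {
        final String key;
        final long value;

        Row(String key, long value) {
            this.key = key;
            this.value = value;
        }
    }

    // Assumed replace-style rule: the latest row for a key wins.
    static final BinaryOperator<Long> REPLACE = (oldValue, newValue) -> newValue;

    // Aggregation-style rule: rows that share a key are combined with SUM.
    static final BinaryOperator<Long> SUM = Long::sum;

    // Merges rows in arrival order, applying the given rule on key collisions.
    static Map<String, Long> merge(List<Row> rows, BinaryOperator<Long> rule) {
        Map<String, Long> merged = new LinkedHashMap<>();
        for (Row row : rows) {
            merged.merge(row.key, row.value, rule);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(new Row("a", 1), new Row("b", 5), new Row("a", 2));
        System.out.println(merge(rows, REPLACE)); // prints {a=2, b=5}
        System.out.println(merge(rows, SUM));     // prints {a=3, b=5}
    }
}
```

With this shape, supporting other aggregation functions (for example MIN or MAX) would only mean supplying a different `BinaryOperator`, leaving the merge loop unchanged.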

Subtasks

No response

Related issues

No response

github-actions[bot] commented 2 months ago

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.