apache / amoro

Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
https://amoro.apache.org/
Apache License 2.0

[Feature]: Enable user-selected default table properties to persist into underlying tables #3121

Open nicochen opened 1 month ago

nicochen commented 1 month ago

Description

Currently, Amoro applies a catalog's default table properties by merging them with the underlying table's properties when the table is loaded. Sometimes it is necessary to write some default table properties into the table metadata to cover more use cases.

Use case/motivation

Consider a scenario with the mixed format: users set a default table property 'log-store.address = xxxx' in a catalog so that they do not have to specify it repeatedly and explicitly. However, when more than one log-store cluster is needed, changing log-store.address to a new value would corrupt the log-store metadata of old tables.

Another case is an administrator who would like to set a default compression codec for managed Iceberg tables. They cannot achieve this by setting a default table property at the platform level: Iceberg itself provides a default compression codec and persists it to tables, so merging catalog default properties cannot overwrite existing keys loaded from the Iceberg table.

The scenarios above show that it is sometimes necessary to persist certain properties to the table, as chosen by the user.
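To make the two scenarios concrete, here is a minimal sketch of merge-on-load semantics. It is not Amoro's actual code: `merge_on_load` is a hypothetical helper, the addresses and codec values are made up, and it assumes that keys stored on the table take precedence over catalog defaults, as described above.

```python
def merge_on_load(catalog_defaults, table_properties):
    """Hypothetical helper: effective properties seen when a table is loaded.

    Catalog defaults are applied first, then keys stored on the table
    overwrite them (assumed precedence, not taken from Amoro's source).
    """
    effective = dict(catalog_defaults)
    effective.update(table_properties)
    return effective


# Scenario 1: the log-store address exists only as a catalog default.
catalog_defaults = {"log-store.address": "cluster-a:9092"}  # illustrative value
old_table = {}  # nothing was persisted on the table itself

before = merge_on_load(catalog_defaults, old_table)
catalog_defaults["log-store.address"] = "cluster-b:9092"  # admin adds a new cluster
after = merge_on_load(catalog_defaults, old_table)
# The old table now resolves to cluster-b although its data was written to cluster-a.

# Scenario 2: the table already carries a persisted codec, so the default loses.
iceberg_table = {"write.parquet.compression-codec": "zstd"}  # persisted at creation
admin_defaults = {"write.parquet.compression-codec": "gzip"}  # desired platform default
codec = merge_on_load(admin_defaults, iceberg_table)["write.parquet.compression-codec"]
```

In the first case the effective address of an existing table silently follows the catalog; in the second the catalog default never takes effect because the key already exists on the table.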

Describe the solution

It is necessary to provide a user-selected parameter or configuration that indicates which properties are written to tables and which are only merged on load.
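One possible shape for this, sketched under assumptions: the catalog carries a set of keys to persist (called `persisted_keys` here, a hypothetical name), and only those defaults are copied into the table's stored properties at creation time. `create_table` and `load_table` are illustrative helpers, not Amoro APIs, and the property values are made up.

```python
def create_table(user_properties, catalog_defaults, persisted_keys):
    """Hypothetical: properties stored in the table metadata at creation."""
    # Copy only the selected catalog defaults into the table itself.
    stored = {k: v for k, v in catalog_defaults.items() if k in persisted_keys}
    # Properties given explicitly by the user win over catalog defaults.
    stored.update(user_properties)
    return stored


def load_table(stored_properties, catalog_defaults):
    """Hypothetical: remaining defaults are still merged on load."""
    effective = dict(catalog_defaults)
    effective.update(stored_properties)
    return effective


defaults = {"log-store.address": "cluster-a:9092", "self-optimizing.enabled": "true"}
persisted_keys = {"log-store.address"}  # chosen by the user or administrator

t1 = create_table({}, defaults, persisted_keys)

# Later the catalog defaults change.
defaults["log-store.address"] = "cluster-b:9092"
defaults["self-optimizing.enabled"] = "false"
t2 = create_table({}, defaults, persisted_keys)

t1_effective = load_table(t1, defaults)
t2_effective = load_table(t2, defaults)
```

With this split, `t1` keeps the address it was created with, `t2` picks up the new one, and the non-persisted key continues to follow the catalog for both tables.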

Subtasks

No response

Related issues

No response

nicochen commented 1 month ago

@Aireed @zhoujinsong Any ideas on this topic?

Aireed commented 1 month ago

+1. We should provide some means to write parameters that cannot be changed once determined into the table's properties. We also need to differentiate the scope of these configuration parameters, for example, ensuring they only apply to newly created tables.