apache / amoro

Apache Amoro (incubating) is a Lakehouse management system built on open data lake formats.
https://amoro.apache.org/
Apache License 2.0
848 stars 278 forks

[Improvement]: Integrate multiple formats in a pluggable way #3116

Open baiyangtx opened 1 month ago

baiyangtx commented 1 month ago

Search before asking

What would you like to be improved?

Currently, Amoro bundles multiple formats (iceberg, mixed-iceberg, and paimon) at compile time.

As more formats such as hudi are integrated, implementing each format's integration directly in the amoro-core and amoro-ams-server modules will make the final distribution package increasingly bloated.

This issue proposes making table format integration pluggable, which will help reduce the size of the binary package and avoid the risk of introducing unnecessary code into production environments.
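
As a rough illustration of what a pluggable integration could look like, here is a minimal sketch built on Java's standard `java.util.ServiceLoader`. The names `TableFormatPlugin` and `FormatRegistry` are hypothetical and are not taken from the Amoro codebase or the design doc; the actual design may differ. The idea is only that each format lives in its own jar and is discovered at runtime instead of being compiled into the core modules.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.ServiceLoader;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical SPI: each table format would ship an implementation in its own jar. */
interface TableFormatPlugin {
    /** Name of the format this plugin provides, e.g. "hudi". */
    String formatName();
}

/** Hypothetical registry the server could use to look up formats found at runtime. */
final class FormatRegistry {
    private final Map<String, TableFormatPlugin> plugins = new ConcurrentHashMap<>();

    /** Discovers plugins declared under META-INF/services on the given class loader. */
    void loadFrom(ClassLoader loader) {
        for (TableFormatPlugin plugin : ServiceLoader.load(TableFormatPlugin.class, loader)) {
            register(plugin);
        }
    }

    /** Registers a plugin explicitly; names are matched case-insensitively. */
    void register(TableFormatPlugin plugin) {
        plugins.put(plugin.formatName().toLowerCase(Locale.ROOT), plugin);
    }

    /** Returns the plugin for the format, or empty if its jar is not installed. */
    Optional<TableFormatPlugin> find(String formatName) {
        return Optional.ofNullable(plugins.get(formatName.toLowerCase(Locale.ROOT)));
    }
}
```

Under this assumption, a deployment that does not need a given format simply omits that format's jar, and `find` returns an empty result for it, so neither the package size nor the production classpath grows with unused formats.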

How should we improve?

The final modules will look like this:

Related

Are you willing to submit PR?

Subtasks

Code of Conduct

baiyangtx commented 1 month ago

The full design doc:

https://docs.google.com/document/d/1pUK1WpW1d1gvF5qTdFJwIseYlOxc02YkFw3QiGplNm4/edit