apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[Feature] Support Alluxio filesystem in Paimon #802

Closed qics1982 closed 1 year ago

qics1982 commented 1 year ago

Search before asking

Motivation

Alluxio is a data orchestration technology (https://docs.alluxio.io/os/user/stable/en/Overview.html). We mainly use Alluxio to speed up hot Hive/Iceberg tables that are frequently accessed for lookup joins or ad-hoc queries via Trino.

Similar to HDFS, we can create and use Paimon catalogs in Alluxio.

Alluxio-related configuration can be set in the HDFS core-site.xml or in the catalog properties.
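For illustration, a minimal sketch of the core-site.xml route, assuming the Alluxio client jar is already on the engine's classpath. The two property names below are the ones Alluxio documents for its Hadoop-compatible client. Whether Paimon picks them up for an alluxio:// warehouse is exactly what this issue requests, so treat this as an assumption rather than confirmed behavior.

```xml
<configuration>
  <!-- Maps the alluxio:// scheme to Alluxio's Hadoop-compatible FileSystem class -->
  <property>
    <name>fs.alluxio.impl</name>
    <value>alluxio.hadoop.FileSystem</value>
  </property>
  <!-- AbstractFileSystem binding, used by clients that go through the FileContext API -->
  <property>
    <name>fs.AbstractFileSystem.alluxio.impl</name>
    <value>alluxio.hadoop.AlluxioFileSystem</value>
  </property>
</configuration>
```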

Create a catalog in Flink:

CREATE CATALOG my_catalog WITH (
    'type' = 'paimon',
    'warehouse' = 'alluxio://path/to/warehouse'
);

USE CATALOG my_catalog;

Then we can create and write to Paimon tables in Flink/Spark, and read Paimon tables from Flink/Spark/Trino.

Solution

No response

Anything else?

No response

Are you willing to submit a PR?

zhuangchong commented 1 year ago

Related issue: #797