apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0
2.35k stars, 928 forks

[Feature] Support Python API for Paimon #2710

Status: Open. Pandas886 opened this issue 8 months ago

Pandas886 commented 8 months ago

Search before asking

Motivation

I hope Paimon can provide a set of Python APIs for reading and writing Paimon tables, as Iceberg does.

Solution

No response

Anything else?

This would be more convenient for AI engineers.

Are you willing to submit a PR?

Waterkin commented 8 months ago

Good idea. Waiting for some responses.

coolderli commented 8 months ago

Looking forward to it.

zeddit commented 8 months ago

Looking forward to it. One reference is https://github.com/apache/iceberg-python. Also, could Paimon store and load data in order, e.g. in time order or in write order, when ingesting data?

Xuanwo commented 3 months ago

I believe that paimon-python could be built on top of paimon-rust, as discussed in https://github.com/apache/paimon/issues/3674. By sharing the same core, we can avoid implementing the same specifications twice.