apache / paimon

Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
https://paimon.apache.org/
Apache License 2.0

[Feature] SparkGenericCatalog can only use the hive warehouse dir, and cannot use the paimon warehouse dir to isolate it from the hive warehouse dir #2161

Open 18216499322 opened 1 year ago

18216499322 commented 1 year ago

Motivation

When creating a Hive catalog in Paimon, Flink can specify a Paimon-specific warehouse dir, but the Hive catalog created by SparkCatalog inside SparkGenericCatalog cannot. It can only use the Hive warehouse dir. The two should behave consistently to provide a better user experience.

Solution

Add a spark.sql.paimon.warehouse.dir configuration option so that SparkGenericCatalog can be configured with a Paimon warehouse dir.

Anything else?

No response

xiuzhu9527 commented 11 months ago

When you use SparkGenericCatalog, you should set spark.sql.catalog.spark_catalog.warehouse.
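
For illustration, here is a minimal sketch of how that setting could be passed to spark-sql. The helper names and the warehouse path are hypothetical placeholders, not part of Paimon. Whether the key is honored may depend on your Paimon version, so treat this as an assumption to verify rather than confirmed behavior.

```python
from typing import Dict, List

# Catalog implementation class named in this issue.
PAIMON_GENERIC_CATALOG = "org.apache.paimon.spark.SparkGenericCatalog"


def paimon_generic_catalog_conf(paimon_warehouse: str) -> Dict[str, str]:
    """Hypothetical helper: collect the Spark settings discussed in this thread."""
    return {
        # Replace Spark's built-in session catalog with SparkGenericCatalog.
        "spark.sql.catalog.spark_catalog": PAIMON_GENERIC_CATALOG,
        # Key suggested in the comment above; assumed to point Paimon tables
        # at their own warehouse dir instead of the Hive warehouse dir.
        "spark.sql.catalog.spark_catalog.warehouse": paimon_warehouse,
    }


def to_spark_sql_args(conf: Dict[str, str]) -> List[str]:
    """Render the settings as repeated `--conf key=value` arguments."""
    args: List[str] = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Placeholder path; substitute your own Paimon warehouse location.
    conf = paimon_generic_catalog_conf("hdfs:///path/to/paimon/warehouse")
    print("spark-sql " + " ".join(to_spark_sql_args(conf)))
```

The same key/value pairs could also go into spark-defaults.conf or be set on a SparkSession builder; the dictionary form just keeps them in one place.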