apache / iceberg

Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0

Iceberg Spark Extensions conflict with Paimon #10143

Open wForget opened 5 months ago

wForget commented 5 months ago

Feature Request / Improvement

Both Iceberg and Paimon define the CALL syntax, which causes a conflict when their SparkSessionExtensions are enabled at the same time.

Reproduce:

spark.sql.extensions=org.apache.paimon.spark.extensions.PaimonSparkSessionExtensions,org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions

-- create iceberg table
CREATE TABLE iceberg_catalog.sample.iceberg_t1 (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING
) using iceberg;

-- create paimon table
CREATE TABLE paimon_catalog.sample.paimon_t1 (
    user_id BIGINT,
    item_id BIGINT,
    behavior STRING,
    dt STRING,
    hh STRING
) TBLPROPERTIES (
    'primary-key' = 'dt,hh,user_id'
);

-- Succeeds
CALL iceberg_catalog.system.remove_orphan_files(table => "sample.iceberg_t1");

-- Fails: the call is resolved by Iceberg's ResolveProcedures
CALL paimon_catalog.sys.remove_orphan_files(table => "sample.paimon_t1");

One idea: in IcebergSparkSqlExtensionsParser#parsePlan, if the current catalog is neither the Spark session catalog nor an Iceberg Spark catalog, try the delegate parser on sqlText first.
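To make the proposed ordering concrete, here is a minimal, self-contained sketch of the decision logic in plain Python. It is a toy model and not Iceberg's actual code: the catalog kinds, `parse_plan`, `iceberg_parse`, `paimon_delegate`, and `plain_delegate` are all hypothetical names, the "is this an Iceberg command" check is reduced to "starts with CALL", and falling back to the Iceberg grammar when the delegate fails is an assumption that the idea does not spell out.

```python
class ParseError(Exception):
    """Raised by a toy parser that does not understand a statement."""


# Hypothetical catalog kinds; real code would inspect the catalog implementation.
SESSION, ICEBERG, OTHER = "session", "iceberg", "other"


def is_call(sql):
    # Simplification: treat only CALL statements as extension commands.
    return sql.strip().lower().startswith("call")


def iceberg_parse(sql):
    # Stand-in for Iceberg's extension grammar.
    if is_call(sql):
        return "iceberg-plan"
    raise ParseError(sql)


def paimon_delegate(sql):
    # Stand-in for a delegate parser that also understands CALL.
    return "paimon-plan" if is_call(sql) else "spark-plan"


def plain_delegate(sql):
    # Stand-in for a delegate parser with no CALL support.
    if is_call(sql):
        raise ParseError(sql)
    return "spark-plan"


def parse_plan(sql, current_catalog_kind, delegate):
    if not is_call(sql):
        # Ordinary SQL always goes to the delegate.
        return delegate(sql)
    if current_catalog_kind in (SESSION, ICEBERG):
        # Session or Iceberg catalog: Iceberg's grammar handles the CALL.
        return iceberg_parse(sql)
    # Any other current catalog: give the delegate the first chance,
    # then fall back to Iceberg's grammar (assumed fallback).
    try:
        return delegate(sql)
    except ParseError:
        return iceberg_parse(sql)


print(parse_plan("CALL paimon_catalog.sys.remove_orphan_files()", OTHER, paimon_delegate))
```

The sketch keys on the current catalog, as the idea is worded, and not on the catalog named inside the CALL statement, so a Paimon call issued while an Iceberg catalog is current would still go to the Iceberg grammar.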

Query engine

Spark

ajantha-bhat commented 5 months ago

Can you close this if it is a duplicate of https://github.com/apache/paimon/issues/3212 ?

Never mind. I think you mean it can be fixed on either the Iceberg side or the Paimon side?

wForget commented 5 months ago

> Never mind. I think you mean it can be fixed on either the Iceberg side or the Paimon side?

Yes, I'm not sure if my idea is acceptable, so I submitted issues on both sides hoping to get more suggestions.