ClickHouse / spark-clickhouse-connector

Spark ClickHouse Connector built on the DataSourceV2 API
https://clickhouse.com/docs/en/integrations/apache-spark
Apache License 2.0
188 stars, 66 forks

Replacing a table that was created with an ON CLUSTER statement fails #61

Closed: sketchmind closed this issue 2 years ago

sketchmind commented 2 years ago

When I use the createOrReplace method of Spark DataSourceV2 on a table that was created with an ON CLUSTER statement, the replace fails with 'Table xxx already exists', where the table reported is the one on another node of the cluster.

pan3793 commented 2 years ago

Which clickhouse-server version are you using? Are you using the Atomic database engine?

pan3793 commented 2 years ago

https://kb.altinity.com/engines/altinity-kb-atomic-database-engine/

sketchmind commented 2 years ago

Which clickhouse-server version are you using? Are you using the Atomic database engine?

The version is 21.3.7.62 (official build), and yes, the database uses the Atomic engine.

pan3793 commented 2 years ago

This happens because the asynchronous DROP TABLE does not take effect immediately, which then breaks createOrReplace. I think it would be good to add a configuration option, spark.clickhouse.drop.table.sync, and set it to true by default, because Atomic is the default database engine in recent ClickHouse releases.
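To make the difference concrete, here is a minimal Python sketch of the statement a synchronous drop would send. It is not the connector's code: the helper name drop_table_sql and the example names db and events are hypothetical, and it only assumes ClickHouse's documented SYNC modifier for DROP TABLE.

```python
def drop_table_sql(database: str, table: str, sync: bool = True) -> str:
    """Build a DROP TABLE statement (hypothetical helper, not connector code)."""
    stmt = f"DROP TABLE IF EXISTS `{database}`.`{table}`"
    if sync:
        # SYNC asks ClickHouse to drop the table without delay, so a
        # CREATE TABLE issued right afterwards does not race with the drop.
        stmt += " SYNC"
    return stmt


if __name__ == "__main__":
    print(drop_table_sql("db", "events"))
    print(drop_table_sql("db", "events", sync=False))
```

With sync=False the statement is a plain DROP TABLE, which is the asynchronous case described above for an Atomic database.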

sketchmind commented 2 years ago

I set this property, but it didn't solve the problem. I'm guessing the problem is that the table is dropped without an ON CLUSTER clause during the replacement process.

pan3793 commented 2 years ago

You are right, the project does not handle all cases for clusterClause properly.
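As an illustration of what consistent cluster handling could look like, the following Python sketch builds the drop and create statements of a replace from one shared cluster clause. It is an assumption-laden example rather than the project's implementation: replace_table_sql, my_cluster, and the table definition are all hypothetical, and only the ON CLUSTER and SYNC syntax comes from ClickHouse's documented DROP TABLE and CREATE TABLE statements.

```python
from typing import List, Optional


def replace_table_sql(
    database: str,
    table: str,
    definition: str,
    cluster: Optional[str] = None,
) -> List[str]:
    """Return the DROP and CREATE statements for a replace (hypothetical helper)."""
    name = f"`{database}`.`{table}`"
    # Build the clause once so the drop and the create always agree on it.
    on_cluster = f" ON CLUSTER {cluster}" if cluster else ""
    return [
        # Without ON CLUSTER here, only the local node would drop the table,
        # and the later CREATE ... ON CLUSTER would fail on the other nodes.
        f"DROP TABLE IF EXISTS {name}{on_cluster} SYNC",
        f"CREATE TABLE {name}{on_cluster} {definition}",
    ]


if __name__ == "__main__":
    definition = "(id UInt64) ENGINE = MergeTree ORDER BY id"
    for stmt in replace_table_sql("db", "events", definition, cluster="my_cluster"):
        print(stmt)
```

Deriving both statements from the same clause is the design point: a table created on the cluster is also dropped on the cluster.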