delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
https://delta.io
Apache License 2.0

[Spark] Fix a bug where MERGE command can fail to run with AnalysisException #3800

Open clee704 opened 4 weeks ago


Which Delta project/connector is this regarding?

Spark

Description

Resolves #3099 by cloning the merge command before setting a tag.

This could also be solved on the Spark side by fixing the relation cache logic to return a copy of the cached relation rather than the same instance; returning the same instance is what can cause this issue. However, a Spark-side fix can take some time, and current and older Spark versions would still have the issue.

How was this patch tested?

Added a test that, without the fix, fails with the error "org.apache.spark.sql.AnalysisException: Table does not support reads" and, with the fix, passes.

Does this PR introduce any user-facing changes?

It is unlikely that a user will notice any change from this PR. The fix copies a query plan tree, which already happens routinely during the analysis, optimization, and planning phases of Spark.