Open fanaticjo opened 3 years ago
This most likely has to do with the specifics of your query vs. MERGE. Can you provide additional information so it's possible to recreate this?
Could you share your MERGE query so that we can see what the merge condition is?
The table is partitioned by date. S3 listing:

```
s3://bucketname/delta/tablename/part_key=2020-12-01
s3://bucketname/delta/tablename/part_key=2020-12-02
```
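Conceptually, partition pruning over a Hive-style layout like this just matches the `part_key=` component of each path against the predicate. A minimal stdlib sketch of that idea (the `prune_partitions` helper is illustrative only, not part of Delta Lake's actual file index):

```python
import re

def prune_partitions(paths, allowed_keys):
    """Keep only paths whose part_key= value is in allowed_keys."""
    kept = []
    for p in paths:
        m = re.search(r"part_key=([^/]+)", p)
        if m and m.group(1) in allowed_keys:
            kept.append(p)
    return kept

# Paths mirror the S3 listing above.
paths = [
    "s3://bucketname/delta/tablename/part_key=2020-12-01",
    "s3://bucketname/delta/tablename/part_key=2020-12-02",
]
print(prune_partitions(paths, {"2020-12-01"}))
```

When pruning works, only the first directory's files should be scanned; the complaint in this issue is that MERGE reads both.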
Query for merge.

Case 1:

```python
deltaTable.alias("old") \
    .merge(
        df.alias("new"),
        "old.part_key in ('2020-12-01') and old.id = new.id") \
    .whenMatchedUpdateAll() \
    .whenNotMatchedInsertAll() \
    .execute()
```

Didn't work.
Case 2:

```python
spark.sql("""
    MERGE INTO source
    USING new
    ON source.part_key IN ('2020-12-01') AND source.id = new.id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")
```

Didn't work.
Case 3:

```python
spark.sql("""
    MERGE INTO source
    USING new
    ON source.part_key IN (to_date('2020-12-01', 'yyyy-MM-dd')) AND source.id = new.id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")
```
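One workaround sometimes suggested for older Delta releases is to express the partition constraint as explicit literal equality predicates rather than an `IN` list, since the optimizer may not recognize the `IN` form for pruning. A minimal sketch that only builds the condition string (whether this actually prunes depends on the Delta version; applying it needs a live SparkSession, shown in the trailing comment, with the same `deltaTable`/`df` names as Case 1):

```python
# Rewrite the IN-based condition as a disjunction of literal equalities
# on the partition column. This is a hedged suggestion, not a confirmed fix.
partitions = ["2020-12-01"]
pruning_pred = " OR ".join(f"old.part_key = '{p}'" for p in partitions)
condition = f"({pruning_pred}) AND old.id = new.id"
print(condition)  # (old.part_key = '2020-12-01') AND old.id = new.id

# With a live SparkSession this condition plugs into Case 1 unchanged:
# deltaTable.alias("old").merge(df.alias("new"), condition) \
#     .whenMatchedUpdateAll().whenNotMatchedInsertAll().execute()
```

Comparing the physical plans of the `IN` form and the equality form (via `.explain()`) would show whether this changes which files are scanned.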
Any updates?
Thanks for your patience @fanaticjo - looking at the above queries, there isn't anything obvious. By any chance can you provide a dataset so we can recreate this? Could you also test with Spark 3.1.x / Delta Lake 1.0.x to determine whether the behavior is any different?
Delta Lake MERGE command partition pruning is not working. In the explain plan I don't see any partition filters being applied; the partition predicates show up as ordinary filters instead.
But when I do a simple Delta read, I can see the partition filters.
Using Delta Lake 0.8 on Spark 3 (EMR release 6.2.0).

Please help.