delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
https://delta.io
Apache License 2.0
7.62k stars 1.71k forks

[Feature Request] Subqueries are not supported in the DELETE #2602

Open Mourya1319 opened 9 months ago

Mourya1319 commented 9 months ago

Feature request

Which Delta project/connector is this regarding?

Overview

I want to create a merge-on-read (MOR) Delta table. The main feature of MOR tables is deletion_vector files, and they are generated only when rows are deleted explicitly with a DELETE FROM query. I use a MERGE INTO query to merge incremental data into my base table, and it includes a WHEN NOT MATCHED THEN DELETE clause. However, the merge does not generate deletion_vector files because it is not a direct DELETE FROM query. So I tried DELETE FROM WHERE <> instead, but got this error: [DELTA_UNSUPPORTED_SUBQUERY] Subqueries are not supported in the DELETE.
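Until subqueries are supported, one possible stopgap is to evaluate the subquery separately and inline its result as literals, so that the statement Delta receives is a plain DELETE FROM with no subquery. The sketch below only builds the SQL text. The table name `base`, the column name `id`, and the helper `build_delete` are hypothetical, and it assumes integer keys and a key set small enough to collect on the driver.

```python
from typing import Iterable, Optional


def build_delete(table: str, column: str, keys: Iterable[int]) -> Optional[str]:
    """Build a subquery-free DELETE statement from already-collected integer keys.

    Hypothetical helper: `table` and `column` are trusted identifiers supplied
    by the caller, and `keys` are the values a subquery would have returned.
    Returns None when there is nothing to delete.
    """
    literals = []
    for key in keys:
        # Accept plain integers only, so no string quoting or escaping is needed.
        if isinstance(key, bool) or not isinstance(key, int):
            raise TypeError(f"only integer keys are supported, got {key!r}")
        literals.append(str(key))
    if not literals:
        return None
    return f"DELETE FROM {table} WHERE {column} IN ({', '.join(literals)})"


# Assumed usage with an existing SparkSession named `spark`
# (table names `incoming` and `base` are placeholders):
#   keys = [row[0] for row in spark.sql("SELECT id FROM incoming").collect()]
#   stmt = build_delete("base", "id", keys)
#   if stmt is not None:
#       spark.sql(stmt)
print(build_delete("base", "id", [3, 1, 2]))
```

This does not scale to large key sets, because every key is collected to the driver and embedded in the statement text, which is why native subquery support in DELETE would still be preferable.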

Motivation

People who want to build a MOR Delta table would benefit from this. For a large dataset, the delete is usually part of the MERGE INTO query itself. If that does not generate deletion_vector files, the alternative is DELETE FROM, which is difficult to use because identifying the rows to delete always requires a subquery.

Please add this functionality in an upcoming release. Thanks!

bambam229 commented 2 months ago

Looking for this same feature in Delta as well when trying to do more complex DELETE statements.