trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Refactor merge to support partial update #24075

Open chenjian2664 opened 2 weeks ago

chenjian2664 commented 2 weeks ago

Description

Currently, the UPDATE branch of MERGE rewrites the whole row. This PR changes the engine's merge implementation to support partial updates. The main idea is to keep the merge case_number of the UpdateCase in the output of the MergeProcessNode. This value is already present in the merge_row symbol in the source node of the MergeProcessNode, so it only needs to be projected. Then, in the MergeWriterNode, the updateSink can perform the update according to the update case.

A partial update also needs to know which columns each case number maps to, so ConnectorMetadata#beginMerge is refactored to carry this mapping.
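
To make the idea concrete, here is a minimal, self-contained sketch of how a sink could use a case-number-to-columns mapping to write only the columns assigned by the matching update case. It does not use Trino's SPI: the class `PartialUpdateSketch`, the method `partialAssignments`, the column names, and the example MERGE statement are all hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: none of these names come from Trino.
final class PartialUpdateSketch {
    // Hypothetical mapping from merge case number to the columns that case assigns.
    // It is modeled on a statement such as:
    //   MERGE INTO orders t USING updates s ON t.id = s.id
    //     WHEN MATCHED AND s.kind = 'price' THEN UPDATE SET price = s.price      -- case 0
    //     WHEN MATCHED THEN UPDATE SET price = s.price, quantity = s.quantity    -- case 1
    static final Map<Integer, List<String>> UPDATE_CASE_COLUMNS = Map.of(
            0, List.of("price"),
            1, List.of("price", "quantity"));

    // Returns only the column/value pairs that the given update case assigns,
    // instead of every column of the merged row.
    static Map<String, Object> partialAssignments(int caseNumber, Map<String, Object> mergedRow) {
        List<String> columns = UPDATE_CASE_COLUMNS.get(caseNumber);
        if (columns == null) {
            throw new IllegalArgumentException("Unknown update case: " + caseNumber);
        }
        Map<String, Object> assignments = new LinkedHashMap<>();
        for (String column : columns) {
            assignments.put(column, mergedRow.get(column));
        }
        return assignments;
    }

    public static void main(String[] args) {
        Map<String, Object> mergedRow = new LinkedHashMap<>();
        mergedRow.put("id", 1);
        mergedRow.put("price", 9.5);
        mergedRow.put("quantity", 3);

        // Case 0 touches only "price"; case 1 touches "price" and "quantity".
        System.out.println(partialAssignments(0, mergedRow));
        System.out.println(partialAssignments(1, mergedRow));
    }
}
```

With a whole-row update, every column of the merged row would be written back regardless of which case matched. In this sketch the case number selects the subset to write, which is the information the description says the sink needs.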

The Phoenix connector is used here as an example to show how it works.

Additional context and related issues

Prerequisite for https://github.com/trinodb/trino/pull/23034

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
(x) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Support partial updates for MERGE in the engine.
* Support partial updates for MERGE in the Phoenix connector.

chenjian2664 commented 1 week ago

Flaky https://github.com/trinodb/trino/actions/runs/11821799275/job/32937454502?pr=24075#step:5:2219

chenjian2664 commented 1 week ago

https://github.com/trinodb/trino/actions/runs/11829608194/job/32961756024?pr=24075#step:5:2343

chenjian2664 commented 1 week ago

https://github.com/trinodb/trino/actions/runs/11829608194/job/32961756024?pr=24075#step:5:2343

https://github.com/trinodb/trino/issues/24131