trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)

Implement SQL DELETE and UPDATE using MERGE machinery #13621

Closed djsstarburst closed 1 year ago

djsstarburst commented 2 years ago

Implementing SQL DELETE and UPDATE using MERGE machinery has lots of benefits:

The strategy is simple: The statement analyzer code for DELETE and UPDATE will be almost unchanged. The query planner will continue to plan DELETE and UPDATE separately from MERGE. However, the DELETE and UPDATE plans will create pages that can be passed to MergeRowProcessor.transformPage.
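
To make the page contract concrete, here is a minimal sketch, assuming the merge encoding visible in the plans later in this thread (an operation channel where TINYINT 2 marks a row for deletion); the helper name is hypothetical:

    import io.trino.spi.Page;
    import io.trino.spi.block.Block;
    import io.trino.spi.block.BlockBuilder;
    import static io.trino.spi.type.TinyintType.TINYINT;

    // Append an operation channel marking every row in the page as a DELETE
    // (2 is the merge DELETE operation number, per the plans shown below)
    static Page markAllRowsDeleted(Page rowIdPage)
    {
        int positions = rowIdPage.getPositionCount();
        BlockBuilder operation = TINYINT.createBlockBuilder(null, positions);
        for (int i = 0; i < positions; i++) {
            TINYINT.writeLong(operation, 2);
        }
        Block[] blocks = new Block[rowIdPage.getChannelCount() + 1];
        for (int channel = 0; channel < rowIdPage.getChannelCount(); channel++) {
            blocks[channel] = rowIdPage.getBlock(channel);
        }
        blocks[blocks.length - 1] = operation.build();
        return new Page(positions, blocks);
    }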

hashhar commented 2 years ago

How is this expected to work for connectors that pass through to some other system, like the JDBC-based ones? Is MERGE supposed to be translated into separate queries run within a single transaction (assuming the target system supports transactions and that they are linearizable)?

findepi commented 2 years ago

How is this expected to work for connectors that pass through to some other system, like the JDBC-based ones?

The io.trino.spi.connector.ConnectorMetadata#applyDelete should work as it does today.
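
For reference, the pushdown path referred to here is this pair of SPI methods; when applyDelete accepts the handle, the engine later calls executeDelete and none of the merge machinery runs. A sketch of the SPI shape, with the default bodies simplified:

    // Returns a handle representing "this table with the delete applied",
    // or empty if the connector cannot absorb the delete
    default Optional<ConnectorTableHandle> applyDelete(ConnectorSession session, ConnectorTableHandle handle)
    {
        return Optional.empty();
    }

    // Executes the pushed-down delete; returns the deleted row count if known
    default OptionalLong executeDelete(ConnectorSession session, ConnectorTableHandle handle)
    {
        throw new UnsupportedOperationException("executeDelete is not supported"); // default body simplified
    }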

djsstarburst commented 2 years ago

It appears to me that for DELETE there isn't much point in going through the MergeProcessor; instead, ISTM that DELETE should target the engine's MergeWriter node. The input page for MergeProcessor.transformPage() is more complicated than the page for MergeWriter, and the complicated parts are useless if all the rows in the page should be deleted.

djsstarburst commented 2 years ago

I have a WIP implementation of DELETE using MergeWriterNode on this branch. Most of the delete tests pass. Here are the bugs shown by the tests:

djsstarburst commented 2 years ago

Of the 24 DELETE tests in BaseIcebergConnectorTest, which inherits the DELETE tests from `BaseConnectorTest`, 17 pass and 7 fail. Here are the failures:

Bottom line:

djsstarburst commented 2 years ago

The explanation for the testOptimizedMetadataQueries failure is also wrapped up in metadata delete. The test sets the session property optimize_metadata_queries, which messes up merge-based delete because of this in MetadataQueryOptimizer:

    @Override
    public PlanNode optimize(PlanNode plan, Session session, TypeProvider types, SymbolAllocator symbolAllocator, PlanNodeIdAllocator idAllocator, WarningCollector warningCollector, TableStatsProvider tableStatsProvider)
    {
        if (!SystemSessionProperties.isOptimizeMetadataQueries(session)) {
            return plan;
        }
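        // This rewrite is what breaks merge-based DELETE when the
        // optimize_metadata_queries session property is set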
        return SimplePlanRewriter.rewriteWith(new Optimizer(session, plannerContext, idAllocator), plan, null);
    }

So apart from changing the behavior when no rows are deleted, all the other problems will be fixed by new/revised optimizers.

findepi commented 2 years ago

cc @alexjo2144

djsstarburst commented 2 years ago

The problem of missing TableScanNodes was solved by the same fix as with MERGE itself: the commit from #13838, which doesn't throw an exception in rewriteModifyTableScan when recreating the MergeWriterNode if it doesn't find the TableScanNode.

djsstarburst commented 2 years ago

I added optimizer PushMergeWriterDeleteIntoConnector, whose PATTERN matches TableFinish -> MergeWriter -> Project -> TableScan. If the pattern matches and ConnectorMetadata.applyDelete() returns a table handle, the rule replaces the subplan with a connector-level delete of the scanned table (see the sketch below). This change allows all of the Iceberg metadata delete tests to pass.
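
A rough sketch of the rule's shape, not the merged code; static imports from io.trino.sql.planner.plan.Patterns, io.trino.matching.Capture, and Guava's Iterables are assumed:

    public class PushMergeWriterDeleteIntoConnector
            implements Rule<TableFinishNode>
    {
        private static final Capture<TableScanNode> TABLE_SCAN = newCapture();
        // Fire only on the TableFinish -> MergeWriter -> Project -> TableScan shape
        private static final Pattern<TableFinishNode> PATTERN =
                tableFinish().with(source().matching(
                        mergeWriter().with(source().matching(
                                project().with(source().matching(
                                        tableScan().capturedAs(TABLE_SCAN)))))));

        private final Metadata metadata;

        public PushMergeWriterDeleteIntoConnector(Metadata metadata)
        {
            this.metadata = metadata;
        }

        @Override
        public Pattern<TableFinishNode> getPattern()
        {
            return PATTERN;
        }

        @Override
        public Result apply(TableFinishNode node, Captures captures, Context context)
        {
            TableScanNode tableScan = captures.get(TABLE_SCAN);
            // If the connector absorbs the delete, the subplan collapses to a
            // connector-level delete that just reports the affected row count
            return metadata.applyDelete(context.getSession(), tableScan.getTable())
                    .map(newHandle -> Result.ofPlanNode(new TableDeleteNode(
                            context.getIdAllocator().getNextId(),
                            newHandle,
                            getOnlyElement(node.getOutputSymbols()))))
                    .orElseGet(Result::empty);
        }
    }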

However, there is one failing Iceberg test: testDeleteWithSubquery. I'll look into that next.

djsstarburst commented 2 years ago

testDeleteWithSubquery is failing because optimizer InlineProjections is scrambling the page block order, and MergeWriterOperator.addInput() is expecting a specific block order. Apparently, that's a bug.

So MergeWriterOperator must get a new constructor arg that specifies the channel mapping.

I chatted with @martint, who said that I should expect the blocks to be scrambled, and should use LocalExecutionPlanner's mapping of column numbers to channels.
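
To make "channel mapping" concrete, a minimal sketch of reordering a page's blocks to an expected layout; the helper is hypothetical, and the real change threads the mapping through from LocalExecutionPlanner:

    // Reorder page blocks so output channel i reads input channel channelMapping[i]
    static Page remapChannels(Page page, int[] channelMapping)
    {
        Block[] blocks = new Block[channelMapping.length];
        for (int i = 0; i < channelMapping.length; i++) {
            blocks[i] = page.getBlock(channelMapping[i]);
        }
        return new Page(page.getPositionCount(), blocks);
    }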

djsstarburst commented 2 years ago

This WIP branch implements both DELETE and UPDATE using the MergeProcessor/MergeWriter infrastructure, conditional on static booleans for now until the PR is nearer to being merged. I've run the existing DELETE and UPDATE tests for both Hive and Iceberg. Of course, the existing UPDATE tests do not test changing partition or bucket columns.

Here are the results from my new branch:

In Hive, three delete tests are failing. testDeleteFailsInExplicitTrinoTransaction is failing with the same non-matching thrown exception message as testUpdateFailsInExplicitTrinoTransaction, and the fix is the same. testDeleteWithOriginalFiles and testDeleteWithOriginalFilesWithWhereClause are failing with the error message below. I have yet to look into this problem.

Unsupported local exchange partitioning MERGE [insert = hive:HivePartitioningHandle{buckets=1, hiveTypes=[]}]

Here is the task list:

We also need to decide on the details of the rollout process, and whether the new implementations of DELETE and UPDATE are used by default.

findepi commented 2 years ago

  • In Hive, all UPDATE tests pass except one, testUpdateFailsInExplicitTrinoTransaction. It is failing because of a non-matching thrown exception message. To get the message right, HiveMetadata.beginInsertOrMerge needs an error message argument.

Or the expected message in the test could be updated.

djsstarburst commented 2 years ago

Fix the non-matching thrown exception message problem.

A couple of Hive tests are failing because the exception message is wrong, e.g., it says “Inserting into…” instead of “Updating…“, etc. Apart from the specific test failures, it makes sense to me for the engine to convey the original operation to the connector.

The minimal fix is to add an enum arg to ConnectorMetadata.beginMerge() saying whether the original operation is a DELETE, UPDATE, or MERGE. An alternative would be a new API, maybe (recalling that I suck at names) ConnectorMetadata.beginModifyOperation(). For now I'll add the enum to beginMerge to make progress, as sketched below.
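
A hypothetical shape of that minimal change (names illustrative; this flag approach was later dropped in favor of a generic error message, per the discussion below):

    // Tell the connector which SQL statement produced this merge
    enum ModifyingOperation { DELETE, UPDATE, MERGE }

    ConnectorMergeTableHandle beginMerge(
            ConnectorSession session,
            ConnectorTableHandle tableHandle,
            ModifyingOperation originalOperation);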

Add the machinery to get the channel mapping from LocalExecutionPlanner to fix testDeleteWithSubquery.

I added that machinery, but it didn't fix testDeleteWithSubquery.

Investigate the two Hive testDeleteWithOriginalFiles DELETE failures.

It looks like these failures are happening because MERGE is assuming that the merge target is bucketed with one bucket, which is false for original files.

martint commented 2 years ago

I don't think the merge machinery should be polluted with such a flag only for the purpose of producing an error message. We could update the error message and make it a little more generic ("Modifying the contents of table ..."), or just leave it as "Updating ..." as a general way of describing an operation that updates the contents of a table (INSERT, DELETE, UPDATE, and MERGE all generically "update" a table to have different content).

djsstarburst commented 2 years ago

I don't think the merge machinery should be polluted with such a flag only for the purpose of producing an error message.

Or the expected message in the test could be updated.

That's fine with me. I like "Updating ...".

djsstarburst commented 2 years ago

I don't think the merge machinery should be polluted with such a flag only for the purpose of producing an error message.

Or the expected message in the test could be updated.

Having looked at the CI build, I'm wondering if this is the right call. There are many test failures, all but a few due to the error message change, e.g. "merge" vs "update" or "delete". And most are due to the exceptions thrown in ConnectorMetadata default methods.

Right now, the default implementations of ConnectorMetadata.get(Delete|Update|Merge)RowId() and ConnectorMetadata.begin(Delete|Update|Merge) throw an exception saying This connector does not support (deletes|updates|merges). But the methods for delete and update are going away.

Clearly the tests can be changed and don't really matter. But the larger point is that I think there is genuine value in providing the user with an error message that is not confusing. If the user ran a delete, it is confusing to tell them the merge operation failed.

findepi commented 2 years ago

Right now, the default implementations of ConnectorMetadata.get(Delete|Update|Merge)RowId() and ConnectorMetadata.begin(Delete|Update|Merge) throw an exception saying This connector does not support (deletes|updates|merges). But the methods for delete and update are going away.

What's the SPI backward compatibility story for this change?

djsstarburst commented 2 years ago

What's the SPI backward compatibility story for this change?

@electrum theorizes that we don't need backward compatibility because all connectors that support row-by-row delete and update now support merge: Hive, Raptor, Iceberg, Delta Lake and Kudu.

hashhar commented 2 years ago

What about connectors that others have written and maintain outside the Trino tree? IMO even if the SPI is unused in Trino it should stay deprecated for some releases before being removed.

And if you still want to remove it, I'd argue it needs to come with a useful guide on how to migrate to the new SPI, because to me it doesn't seem straightforward.

djsstarburst commented 2 years ago

I don't think the merge machinery should be polluted with such a flag only for the purpose of producing an error message.

Or the expected message in the test could be updated.

Having looked at the CI build, I'm wondering if this is the right call.

I thought about it a bit more, and decided you guys are right, because everywhere the customer can see the error they can also see the SQL. I pushed a commit that changed the MERGE-related ConnectorMetadata methods to throw an exception with this message:

This connector does not support modifying table rows

and changed the tests that expect to fail doing DELETE, UPDATE or MERGE to expect that message.
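
A sketch of the revised default, assuming the message above; the real method signature may carry more parameters:

    import static io.trino.spi.StandardErrorCode.NOT_SUPPORTED;

    default ConnectorMergeTableHandle beginMerge(ConnectorSession session, ConnectorTableHandle tableHandle)
    {
        throw new TrinoException(NOT_SUPPORTED, "This connector does not support modifying table rows");
    }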

djsstarburst commented 2 years ago

I've been whittling away at the errors turned up by the CI build. There was a serious bug, now fixed, in the calculated order of SET expressions when planning UPDATE. I added a test showing that MERGE did not have the same bug.

I'm down to the point where it makes sense to list specific test failures:

These fault-tolerant test methods are failing because the partition argument of the SqlTask fragment is ill-formed, containing null for the key column. That causes an exception to be raised before execution ever reaches the injected failure:

io.trino.faulttolerant.delta.TestDeltaTaskFailureRecoveryTest.testDelete
io.trino.faulttolerant.delta.TestDeltaTaskFailureRecoveryTest.testDeleteWithSubquery
io.trino.faulttolerant.delta.TestDeltaTaskFailureRecoveryTest.testUpdate
io.trino.faulttolerant.delta.TestDeltaTaskFailureRecoveryTest.testUpdateWithSubquery

findepi commented 2 years ago

There was a serious bug, now fixed, in the calculated order of SET expressions what planning UPDATE. I add a test that showed the MERGE did not have the same bug.

What is the bug? If it affects the current release, can you please file an issue?

djsstarburst commented 2 years ago

Here is the complete enumeration of the current test failures in the CI build. They haven't been analyzed yet. Almost all are delete failures, most failing because the connector does not support getMergeRowIdColumnHandle. Yet.

./25_test (plugintrino-oracle).txt:                     Tests run: 482, Failures: 11, Errors: 0, Skipped: 60
./test (plugintrino-mariadb)/8_Maven Tests.txt:         Tests run: 754, Failures: 8, Errors: 0, Skipped: 116
./11_test (plugintrino-delta-lake).txt:                 Tests run: 1130, Failures: 15, Errors: 0, Skipped: 125
./test (testingtrino-faulttolerant-tests, test-fault-tolerant-iceberg)/8_Maven Tests.txt:      Tests run: 38, Failures: 6, Errors: 0, Skipped: 0
./test (plugintrino-postgresql)/8_Maven Tests.txt:      Tests run: 522, Failures: 6, Errors: 0, Skipped: 57
./test (plugintrino-phoenix5)/8_Maven Tests.txt:        Tests run: 394, Failures: 7, Errors: 0, Skipped: 93
./28_test (plugintrino-iceberg).txt:                    Tests run: 1871, Failures: 10, Errors: 0, Skipped: 11
./test (coretrino-main)/8_Maven Tests.txt:              Tests run: 383, Failures: 0, Errors: 3, Skipped: 0
./6_test (testingtrino-faulttolerant-tests, test-fault-toleran.txt: Tests run: 38, Failures: 6, Errors: 0, Skipped: 0
./test (plugintrino-cassandra)/8_Maven Tests.txt:       Tests run: 497, Failures: 12, Errors: 0, Skipped: 109
./30_test (clienttrino-jdbc,plugintrino-base-jdbc,plugintrino-.txt: Tests run: 515, Failures: 9, Errors: 0, Skipped: 62
./test (plugintrino-sqlserver)/8_Maven Tests.txt:       Tests run: 533, Failures: 8, Errors: 0, Skipped: 60
./20_test (plugintrino-mysql).txt:                      Tests run: 889, Failures: 10, Errors: 0, Skipped: 148
./test (testingtrino-faulttolerant-tests, test-fault-tolerant-delta)/8_Maven Tests.txt: Tests run: 19, Failures: 4, Errors: 0, Skipped: 0
./test (plugintrino-mysql)/8_Maven Tests.txt:           Tests run: 889, Failures: 10, Errors: 0, Skipped: 148
./test (clienttrino-jdbc,plugintrino-base-jdbc,plugintrino-thrift,plugintrino-memory)/8_Maven Tests.txt: Tests run: 515, Failures: 9, Errors: 0, Skipped: 62
./test (plugintrino-delta-lake)/8_Maven Tests.txt:      Tests run: 1130, Failures: 15, Errors: 0, Skipped: 125
./22_test (plugintrino-sqlserver).txt:                  Tests run: 533, Failures: 8, Errors: 0, Skipped: 60
./9_test (plugintrino-cassandra).txt:                   Tests run: 497, Failures: 12, Errors: 0, Skipped: 109
./7_test (plugintrino-raptor-legacy).txt:               Tests run: 1134, Failures: 9, Errors: 0, Skipped: 93
./5_test (testingtrino-faulttolerant-tests, test-fault-toleran.txt: Tests run: 19, Failures: 4, Errors: 0, Skipped: 0
./26_test (plugintrino-kudu).txt:                       Tests run: 488, Failures: 30, Errors: 0, Skipped: 84
./24_test (plugintrino-mariadb).txt:                    Tests run: 754, Failures: 8, Errors: 0, Skipped: 116
./29_test (plugintrino-phoenix5).txt:                   Tests run: 394, Failures: 7, Errors: 0, Skipped: 93
./test (plugintrino-raptor-legacy)/8_Maven Tests.txt:   Tests run: 1134, Failures: 9, Errors: 0, Skipped: 93
./21_test (plugintrino-postgresql).txt:                 Tests run: 522, Failures: 6, Errors: 0, Skipped: 57
./test (plugintrino-iceberg)/8_Maven Tests.txt:         Tests run: 1871, Failures: 10, Errors: 0, Skipped: 11
./1_test (coretrino-main).txt:                          Tests run: 383, Failures: 0, Errors: 3, Skipped: 0
./test (plugintrino-kudu)/8_Maven Tests.txt:            Tests run: 488, Failures: 30, Errors: 0, Skipped: 84
./test (plugintrino-oracle)/8_Maven Tests.txt:          Tests run: 482, Failures: 11, Errors: 0, Skipped: 60
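
Most of those failures trace back to the row ID method mentioned above; its shape is roughly this (a sketch, not the exact SPI source):

    // Connectors that support merge-based DELETE/UPDATE must supply a row ID column
    default ColumnHandle getMergeRowIdColumnHandle(ConnectorSession session, ConnectorTableHandle tableHandle)
    {
        throw new TrinoException(NOT_SUPPORTED, "This connector does not support modifying table rows");
    }
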
djsstarburst commented 2 years ago

What is the bug? if it affects current release, can you please file an issue?

I should have been clearer: the "serious bug" was in this PR. There is no similar bug in UPDATE or MERGE at the tip of master. (Though I added another test to this PR to prove to myself that MERGE was fine.)

djsstarburst commented 2 years ago

I've done a bit more analysis of the test failures above. Here is the summary:

djsstarburst commented 2 years ago

Tasks:

djsstarburst commented 2 years ago

After defining DefaultJdbcMetadata.getMergeRowIdColumnHandle, there is only one failure left in all the SQL database test suites, in BaseConnectorTest.testDelete. The test fails because it calls beginMerge when running this delete, which deletes no rows. I'm guessing the delete was supposed to be pushed down to the connector, but the optimization failed:

assertUpdate("DELETE FROM " + table.getName() + " WHERE orderkey > 5 AND orderkey < 4", 0);

The plan is helpfully logged:

Output[columnNames = [rows]]
│   Layout: [rows:bigint]
│   Estimates: 
└─ TableCommit[target = mysql:tpch.test_delete_hzholi6dug tpch.test_delete_hzholi6dug]
   │   Layout: [rows:bigint]
   │   Estimates: 
   └─ LocalExchange[partitioning = SINGLE]
      │   Layout: [partialrows:bigint, fragment:varbinary]
      │   Estimates: 
      └─ RemoteExchange[type = GATHER]
         │   Layout: [partialrows:bigint, fragment:varbinary]
         │   Estimates: 
         └─ MergeWriter[table = mysql:tpch.test_delete_hzholi6dug tpch.test_delete_hzholi6dug]
            │   Layout: [partialrows:bigint, fragment:varbinary]
            │   Estimates: 
            └─ LocalExchange[partitioning = SINGLE]
               │   Layout: [orderkey:bigint, custkey:bigint, orderstatus:varchar(255), totalprice:double, orderdate:date, orderpriority:varchar(255), clerk:varchar(255), shippriority:integer, comment:varchar(255), operation:tinyint, field:bigint, insert_from_update:tinyint]
               │   Estimates: 
               └─ RemoteExchange[type = REPARTITION]
                  │   Layout: [orderkey:bigint, custkey:bigint, orderstatus:varchar(255), totalprice:double, orderdate:date, orderpriority:varchar(255), clerk:varchar(255), shippriority:integer, comment:varchar(255), operation:tinyint, field:bigint, insert_from_update:tinyint]
                  │   Estimates: 
                  └─ Project[]
                     │   Layout: [orderkey:bigint, custkey:bigint, orderstatus:varchar(255), totalprice:double, orderdate:date, orderpriority:varchar(255), clerk:varchar(255), shippriority:integer, comment:varchar(255), operation:tinyint, field:bigint, insert_from_update:tinyint]
                     │   Estimates: 
                     │   orderkey := null
                     │   custkey := null
                     │   orderstatus := null
                     │   totalprice := null
                     │   orderdate := null
                     │   orderpriority := null
                     │   clerk := null
                     │   shippriority := null
                     │   comment := null
                     │   operation := (CASE WHEN (("orderkey" > BIGINT '5') AND ("orderkey" < BIGINT '4')) THEN TINYINT '2' ELSE TINYINT '0' END)
                     │   insert_from_update := TINYINT '0'
                     └─ LocalExchange[partitioning = ROUND_ROBIN]
                        │   Layout: [orderkey:bigint, field:bigint]
                        │   Estimates: 
                        └─ Values[]
                               Layout: [orderkey:bigint, field:bigint]
                               Estimates: 

djsstarburst commented 2 years ago

@martint and I looked at the failure in BaseConnectorTest.testDelete, and concluded that we needed an optimizer analogous to RemoveEmptyDeleteRuleSet for MergeWriter.

We also looked at the failure of BaseConnectorTest.testDeleteWithSubquery, and concluded that the problem is in LocalExecutionPlanner.visitMergeWriter, because that code is assuming that the source layout and the node's output layout are the same. The two layouts are different.

djsstarburst commented 2 years ago

I pushed a commit that implements RemoveEmptyDeleteRuleSet for MergeWriter and it fixes the problem shown by BaseConnectorTest.testDelete.
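
The core of the new rule is roughly this (a sketch mirroring the old RemoveEmptyDelete logic; variable names illustrative, ImmutableList from Guava, Row and GenericLiteral from io.trino.sql.tree):

    // If nothing can reach the MergeWriter, answer "0 rows deleted" directly,
    // replacing the whole TableFinish subtree with a literal single-row result
    if (mergeSource instanceof ValuesNode values && values.getRowCount() == 0) {
        return Result.ofPlanNode(new ValuesNode(
                tableFinish.getId(),
                tableFinish.getOutputSymbols(),
                ImmutableList.of(new Row(ImmutableList.of(new GenericLiteral("BIGINT", "0"))))));
    }
    return Result.empty();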

I pushed a second commit that fixes BaseConnectorTest.testDeleteWithSubquery. In the previous code, the pagePreProcessor created by LocalExecutionPlanner.visitMergeWriter was a NOOP, because source outputs were used for both layouts. The one-line change uses MergeWriterNode.projectedSymbols as the expected layout.

djsstarburst commented 2 years ago

With these commits the failure count is reduced enough that it becomes enumerable. There are 39 test failures, of which 20 are Kudu, which is totally broken for DELETE and UPDATE. Most of the rest of the failures are related to apparently illegal partition schemes. Kudu is the only CHANGE_ONLY_UPDATED_COLUMNS connector, and that may have something to do with those failures.

Here are the test failures, with brief explanations:

// These fault-tolerant tests are failing because a different exception got raised due to illegal
// PartitionSchemas with message "Bucket to partition must be set before a partition function can be created"

./test (testingtrino-faulttolerant-tests, test-fault-tolerant-iceberg)/8_Maven Tests.txt-79492-2022-09-08T21:41:35.0768019Z [ERROR]   TestIcebergQueryFailureRecoveryTest>BaseIcebergFailureRecoveryTest.testDelete:101 
./test (testingtrino-faulttolerant-tests, test-fault-tolerant-iceberg)/8_Maven Tests.txt-79494-2022-09-08T21:41:35.0768932Z [ERROR]   TestIcebergQueryFailureRecoveryTest>BaseIcebergFailureRecoveryTest.testUpdate:187 
./test (testingtrino-faulttolerant-tests, test-fault-tolerant-iceberg)/8_Maven Tests.txt-79496-2022-09-08T21:41:35.0770606Z [ERROR]   TestIcebergTaskFailureRecoveryTest>BaseIcebergFailureRecoveryTest.testDelete:74->BaseIcebergFailureRecoveryTest.lambda$testDelete$0:74 
./6_test (testingtrino-faulttolerant-tests, test-fault-toleran.txt-80504-2022-09-08T21:41:35.0768022Z [ERROR]   TestIcebergQueryFailureRecoveryTest>BaseIcebergFailureRecoveryTest.testDelete:101 
./6_test (testingtrino-faulttolerant-tests, test-fault-toleran.txt-80506-2022-09-08T21:41:35.0768935Z [ERROR]   TestIcebergQueryFailureRecoveryTest>BaseIcebergFailureRecoveryTest.testUpdate:187 
./6_test (testingtrino-faulttolerant-tests, test-fault-toleran.txt-80508-2022-09-08T21:41:35.0770611Z [ERROR]   TestIcebergTaskFailureRecoveryTest>BaseIcebergFailureRecoveryTest.testDelete:74->BaseIcebergFailureRecoveryTest.lambda$testDelete$0:74 

// Getting the wrong error message
./test (plugintrino-iceberg)/8_Maven Tests.txt-73311-2022-09-08T21:36:51.0857188Z [ERROR]   TestIcebergAvroConnectorTest>BaseIcebergConnectorTest.testModifyingOldSnapshotIsNotPossible:5265 
./28_test (plugintrino-iceberg).txt-74322-2022-09-08T21:36:51.0857200Z [ERROR]   TestIcebergAvroConnectorTest>BaseIcebergConnectorTest.testModifyingOldSnapshotIsNotPossible:5265 

// PartitionSchemas with message "Bucket to partition must be set before a partition function can be created"
./test (testingtrino-faulttolerant-tests, test-fault-tolerant-delta)/8_Maven Tests.txt-24288-2022-09-08T21:29:50.3089659Z [ERROR]   TestDeltaTaskFailureRecoveryTest>BaseDeltaFailureRecoveryTest.testDelete:62->BaseDeltaFailureRecoveryTest.lambda$testDelete$0:62 
./5_test (testingtrino-faulttolerant-tests, test-fault-toleran.txt-25299-2022-09-08T21:29:50.3089689Z [ERROR]   TestDeltaTaskFailureRecoveryTest>BaseDeltaFailureRecoveryTest.testDelete:62->BaseDeltaFailureRecoveryTest.lambda$testDelete$0:62 

// It's expecting a specific set of data files.  I'm not sure that's reasonable
./test (plugintrino-delta-lake)/8_Maven Tests.txt-24576-2022-09-08T21:36:40.8853378Z [ERROR]   TestDeltaLakeConnectorSmokeTest>BaseDeltaLakeConnectorSmokeTest.testVacuum:1444 expected:<...w4fcue7sc/regionkey=[4/20220908_211930_00531_s7vxm-a3d7c348-1d06-416d-85ec-61cdb2032f70",
./11_test (plugintrino-delta-lake).txt-25587-2022-09-08T21:36:40.8853396Z [ERROR]   TestDeltaLakeConnectorSmokeTest>BaseDeltaLakeConnectorSmokeTest.testVacuum:1444 expected:<...w4fcue7sc/regionkey=[4/20220908_211930_00531_s7vxm-a3d7c348-1d06-416d-85ec-61cdb2032f70",

// StatementAnalyzer calling getMergeRowIdColumnHandle results in "This connector does not support modifying table rows"
./test (plugintrino-cassandra)/8_Maven Tests.txt-7423-2022-09-08T21:20:18.8470038Z [ERROR]   TestCassandraConnectorTest.testDelete:1261->execute:1421 » QueryFailed This co...
./test (plugintrino-cassandra)/8_Maven Tests.txt-7424-2022-09-08T21:20:18.8471254Z [ERROR]   TestCassandraConnectorTest.testDeleteWithComplexPredicate:1288 
./9_test (plugintrino-cassandra).txt-8432-2022-09-08T21:20:18.8470049Z [ERROR]   TestCassandraConnectorTest.testDelete:1261->execute:1421 » QueryFailed This co...
./9_test (plugintrino-cassandra).txt-8433-2022-09-08T21:20:18.8471261Z [ERROR]   TestCassandraConnectorTest.testDeleteWithComplexPredicate:1288 

// Message "Table was updated by a different transaction. Please retry the operation."
./7_test (plugintrino-raptor-legacy).txt-23170-2022-09-08T21:28:56.9760423Z [ERROR] io.trino.plugin.raptor.legacy.TestRaptorBucketedConnectorTest.testUpdateRowConcurrently
./7_test (plugintrino-raptor-legacy).txt-23177-2022-09-08T21:28:56.9772731Z [ERROR] io.trino.plugin.raptor.legacy.TestRaptorConnectorTest.testUpdateRowConcurrently
./test (plugintrino-raptor-legacy)/8_Maven Tests.txt-22154-2022-09-08T21:28:56.9760410Z [ERROR] io.trino.plugin.raptor.legacy.TestRaptorBucketedConnectorTest.testUpdateRowConcurrently

// Every Kudu UPDATE and DELETE test fails.  Kudu is the only CHANGE_ONLY_UPDATED_COLUMNS connector
./test (plugintrino-kudu)/8_Maven Tests.txt-15811-2022-09-08T21:21:16.6856934Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDelete:3146->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_delete_1pwukwds6h WHERE custkey <= 100
./test (plugintrino-kudu)/8_Maven Tests.txt-15812-2022-09-08T21:21:16.6864109Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDeleteAllDataFromTable:3349 » QueryFailed
./test (plugintrino-kudu)/8_Maven Tests.txt-15813-2022-09-08T21:21:16.6869039Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDeleteWithComplexPredicate:3202->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_delete_complex_qyl5tkswj9 WHERE orderkey % 2 = 0
./test (plugintrino-kudu)/8_Maven Tests.txt-15814-2022-09-08T21:21:16.6870407Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDeleteWithLike:3189->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_with_like_157xxf80y8 WHERE name LIKE '%a%'
./test (plugintrino-kudu)/8_Maven Tests.txt-15815-2022-09-08T21:21:16.6871445Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDeleteWithSemiJoin:3277->AbstractTestQueryFramework.assertUpdate:318->AbstractTestQueryFramework.assertUpdate:323 » QueryFailed
./test (plugintrino-kudu)/8_Maven Tests.txt-15816-2022-09-08T21:21:16.6872430Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDeleteWithSubquery:3221->AbstractTestQueryFramework.assertUpdate:318->AbstractTestQueryFramework.assertUpdate:323 » QueryFailed
./test (plugintrino-kudu)/8_Maven Tests.txt-15817-2022-09-08T21:21:16.6876189Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDeleteWithVarcharPredicate:3310->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_delete_with_varchar_predicate_oxhqby6wpr WHERE orderstatus = 'O'
./test (plugintrino-kudu)/8_Maven Tests.txt-15818-2022-09-08T21:21:16.6879237Z [ERROR]   TestKuduConnectorTest.testExplainAnalyzeWithDeleteWithSubquery:338->AbstractTestQueryFramework.assertExplainAnalyze:434->AbstractTestQueryFramework.assertExplainAnalyze:439->AbstractTestQueryFramework.assertExplainAnalyze:448->AbstractTestQueryFramework.computeActual:217 » QueryFailed
./test (plugintrino-kudu)/8_Maven Tests.txt-15819-2022-09-08T21:21:16.6880319Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testRowLevelDelete:3360->AbstractTestQueryFramework.assertUpdate:318->AbstractTestQueryFramework.assertUpdate:323 » QueryFailed
./test (plugintrino-kudu)/8_Maven Tests.txt-15820-2022-09-08T21:21:16.6881183Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testUpdate:3370->AbstractTestQueryFramework.assertQueryFails:348 
./26_test (plugintrino-kudu).txt-16824-2022-09-08T21:21:16.6856977Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDelete:3146->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_delete_1pwukwds6h WHERE custkey <= 100
./26_test (plugintrino-kudu).txt-16825-2022-09-08T21:21:16.6864135Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDeleteAllDataFromTable:3349 » QueryFailed
./26_test (plugintrino-kudu).txt-16826-2022-09-08T21:21:16.6869063Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDeleteWithComplexPredicate:3202->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_delete_complex_qyl5tkswj9 WHERE orderkey % 2 = 0
./26_test (plugintrino-kudu).txt-16827-2022-09-08T21:21:16.6870413Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDeleteWithLike:3189->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_with_like_157xxf80y8 WHERE name LIKE '%a%'
./26_test (plugintrino-kudu).txt-16828-2022-09-08T21:21:16.6871451Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDeleteWithSemiJoin:3277->AbstractTestQueryFramework.assertUpdate:318->AbstractTestQueryFramework.assertUpdate:323 » QueryFailed
./26_test (plugintrino-kudu).txt-16829-2022-09-08T21:21:16.6874252Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDeleteWithSubquery:3221->AbstractTestQueryFramework.assertUpdate:318->AbstractTestQueryFramework.assertUpdate:323 » QueryFailed
./26_test (plugintrino-kudu).txt-16830-2022-09-08T21:21:16.6876198Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testDeleteWithVarcharPredicate:3310->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_delete_with_varchar_predicate_oxhqby6wpr WHERE orderstatus = 'O'
./26_test (plugintrino-kudu).txt-16831-2022-09-08T21:21:16.6879250Z [ERROR]   TestKuduConnectorTest.testExplainAnalyzeWithDeleteWithSubquery:338->AbstractTestQueryFramework.assertExplainAnalyze:434->AbstractTestQueryFramework.assertExplainAnalyze:439->AbstractTestQueryFramework.assertExplainAnalyze:448->AbstractTestQueryFramework.computeActual:217 » QueryFailed
./26_test (plugintrino-kudu).txt-16832-2022-09-08T21:21:16.6880477Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testRowLevelDelete:3360->AbstractTestQueryFramework.assertUpdate:318->AbstractTestQueryFramework.assertUpdate:323 » QueryFailed
./26_test (plugintrino-kudu).txt-16833-2022-09-08T21:21:16.6881188Z [ERROR]   TestKuduConnectorTest>BaseConnectorTest.testUpdate:3370->AbstractTestQueryFramework.assertQueryFails:348 

djsstarburst commented 2 years ago

All the Kudu delete and update tests are failing, and the problem has nothing to do with Kudu being a CHANGE_ONLY_UPDATED_COLUMNS connector.

The tests are failing partly because the connector uses the presence of any row ID column as a signal to create a KuduUpdatablePageSource, but more fundamentally because the KuduRecordCursor created to read the data has no row ID column, so the copyPrimaryKey operation fails.

djsstarburst commented 2 years ago

The real reason that the Kudu connector was failing all DELETE and UPDATE tests was that those tests didn’t use the createTableForWrites mechanism, so the tables didn’t have the primary key declaration that Kudu requires. Converting the tests to use createTableForWrites is not simple, because most of the tests use CTAS, and it seems the create must be separated from the insert in order to specify partitioning.

djsstarburst commented 2 years ago

I pushed a commit, later to be squashed, fixing the Raptor failures.

The BaseConnectorTest DELETE and UPDATE tests mostly used CTAS, which doesn't work for connectors like Kudu that need a primary key in each table. In order to use the createTableForWrites mechanism, I separated table creation from the insertions.
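
For illustration, the reshaped setup looks roughly like this, assuming createTableForWrites takes the CREATE TABLE text and returns a connector-adjusted version (Kudu's override adding the primary key); the table name and row count are illustrative, 15000 being tiny TPC-H's orders:

    // Create via the connector-aware helper, then insert separately
    String tableName = "test_delete_" + System.nanoTime(); // any unique suffix
    assertUpdate(createTableForWrites("CREATE TABLE " + tableName + " (orderkey bigint, custkey bigint)"));
    assertUpdate("INSERT INTO " + tableName + " SELECT orderkey, custkey FROM orders", 15000);
    assertUpdate("DELETE FROM " + tableName + " WHERE custkey <= 100");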

With that change, all but two of the Kudu DELETE and UPDATE tests now pass. One failure does an EXPLAIN ANALYZE of a CREATE TABLE AS SELECT, and I don't quite see how to intercede to add the primary key declaration. The second is testUpdateRowConcurrently, which fails with the backtrace below. I haven't debugged it yet.

2022-09-12T13:44:29.266-0500    ERROR   SplitRunner-4-149   io.trino.execution.executor.TaskExecutor    Error processing Split 20220912_184429_00112_hdtqh.1.0.0-4  (start = 2.09955255638553E9, wall = 54 ms, cpu = 0 ms, wait = 1 ms, calls = 2): GENERIC_INTERNAL_ERROR: Error while applying 1 kudu operation(s); First error: DELETE: Row error for primary key=[-128, 0, 0, 0], tablet=null, server=12977068ee544455ad936decc82bdcee, status=Not found: key not found (error 0)
io.trino.spi.TrinoException: Error while applying 1 kudu operation(s); First error: DELETE: Row error for primary key=[-128, 0, 0, 0], tablet=null, server=12977068ee544455ad936decc82bdcee, status=Not found: key not found (error 0)
    at org.apache.kudu.client.KuduOperationApplier.verifyNoErrors(KuduOperationApplier.java:96)
    at org.apache.kudu.client.KuduOperationApplier.close(KuduOperationApplier.java:109)
    at io.trino.plugin.kudu.KuduPageSink.storeMergedRows(KuduPageSink.java:250)
    at io.trino.operator.MergeWriterOperator.addInput(MergeWriterOperator.java:113)
    at io.trino.operator.Driver.processInternal(Driver.java:416)

aletc1 commented 1 year ago

Hi @electrum ,

This is not solved. Test case:

CREATE TABLE axprod.main.test2 (
    order_date TIMESTAMP(6),
    customer VARCHAR
)
WITH (partitioning = ARRAY['month(order_date)'])

update axprod.main.test2 set order_date = cast('2022-01-01 23:50' as timestamp(6)) where customer in (select customer from axprod.main.test2 where 1=2)

Error:

Invalid descendant for DeleteNode or UpdateNode: io.trino.sql.planner.plan.ExchangeNode

I've been tracking this in #12422, but I was redirected to this issue by @findepi as a duplicate. However, the problem remains open.

findepi commented 1 year ago

@aletc1 this change is not released yet, it's coming in Trino 404.