How is this expected to work for connectors that pass through to some other system, like JDBC-based ones? Is MERGE supposed to be translated into separate queries run within a single transaction (assuming transactions are supported and provide linearizability)?
> How is this expected to work for connectors that pass-through to some other system like JDBC based ones?

The io.trino.spi.connector.ConnectorMetadata#applyDelete should work as it does today.
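A minimal sketch of the call sequence meant here, under the assumption that the engine keeps using the applyDelete/executeDelete pushdown path for predicate-only DELETEs; the helper class and method names are invented for illustration and are not Trino engine code:

```java
import java.util.Optional;
import java.util.OptionalLong;

import io.trino.spi.connector.ConnectorMetadata;
import io.trino.spi.connector.ConnectorSession;
import io.trino.spi.connector.ConnectorTableHandle;

// Illustrative helper, not engine code: a pass-through connector (for example a JDBC one)
// can absorb the whole DELETE via applyDelete/executeDelete; only if applyDelete declines
// does the engine fall back to row-by-row processing through the merge machinery.
public final class MetadataDeleteSketch
{
    private MetadataDeleteSketch() {}

    public static OptionalLong tryMetadataDelete(
            ConnectorMetadata metadata,
            ConnectorSession session,
            ConnectorTableHandle table)
    {
        Optional<ConnectorTableHandle> pushedDown = metadata.applyDelete(session, table);
        if (pushedDown.isEmpty()) {
            return OptionalLong.empty(); // fall back to merge-based, row-by-row delete
        }
        // The connector runs the delete remotely (e.g. a single DELETE statement over JDBC)
        // and reports the affected row count.
        return metadata.executeDelete(session, pushedDown.get());
    }
}
```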
It appears to me that for DELETE, there isn't a lot of point in going through the MergeProcessor -- instead it seems to me that DELETE should target the engine's MergeWriter node directly. The input page for MergeProcessor.transformPage() is more complicated than the page for MergeWriter, and the complicated parts are useless if all the rows in the page should be deleted.
I have a WIP implementation of DELETE using MergeWriterNode on this branch. Most of the delete tests pass. Here are the bugs shown by the tests:

- LogicalPlanner never finds the table scan of the target table.
- Rows passed to MergeSink.storeMergedRows() must be ignored if metadata delete pushdown should apply but the pattern match fails. I've added machinery to ignore those rows, but in a few cases the row counts are incorrect.

Of the 24 DELETE tests in BaseIcebergConnectorTest, which inherits the DELETE tests from BaseConnectorTest, 17 pass and 7 fail. Here are the failures:
- testDelete and testDeleteWithSubquery fail due to a DELETE predicate of FALSE, which caused the TableScanNode to get optimized away.
- testExpireSnapshotsPartitionedTable, testMetadataDelete and testMetadataDeleteSimple are all failing because they expect metadata delete to happen, and since it didn't, the count of files or partitions is higher than expected. The MergeWriterNode implementation of DELETE must support real metadata delete for these tests to pass.
- testIfDeletesReturnsNumberOfRemovedRows is failing because a DELETE that deleted no rows returned a count of zero, but the Iceberg-specific test was expecting no count in this case. A comment in the test implied that returning no count is in fact a bug.
- testOptimizedMetadataQueries failed because, apparently, the wrong rows were deleted. This is the only example of a test failure due to the wrong rows being deleted.

Bottom line: most of the failures come down to missing TableScanNodes and to not supporting metadata delete -- whose solution is adding "optimizers" in the query planner to recognize these cases -- plus whatever made testOptimizedMetadataQueries fail.

The explanation for the testOptimizedMetadataQueries failure is also wrapped up in metadata delete. The test sets the session property optimize_metadata_queries, which messes up merge-based delete because of this in MetadataQueryOptimizer:
@Override
public PlanNode optimize(PlanNode plan, Session session, TypeProvider types, SymbolAllocator symbolAllocator, PlanNodeIdAllocator idAllocator, WarningCollector warningCollector, TableStatsProvider tableStatsProvider)
{
if (!SystemSessionProperties.isOptimizeMetadataQueries(session)) {
return plan;
}
return SimplePlanRewriter.rewriteWith(new Optimizer(session, plannerContext, idAllocator), plan, null);
}
So apart from changing the behavior when no rows are deleted, all the other problems will be fixed by new/revised optimizers.
cc @alexjo2144
The problem of missing TableScanNodes was solved by the same fix as with MERGE itself -- the commit from #13838 that doesn't throw an exception in rewriteModifyTableScan when recreating the MergeWriterNode if it doesn't find the TableScanNode.
I added optimizer PushMergeWriterDeleteIntoConnector, whose PATTERN matches TableFinish -> MergeWriter -> Project -> TableScan. If the pattern matches and ConnectorMetadata.applyDelete() returns a new table handle for the table scan's table, the DELETE is pushed down to the connector as a metadata delete. This change allows all of the Iceberg metadata delete tests to pass.
However, there is one failing Iceberg test: testDeleteWithSubquery. I'll look into that next.
testDeleteWithSubquery is failing because optimizer InlineProjections is scrambling the page block order, and MergeWriterOperator.addInput() is expecting a specific block order. Apparently, that's a bug. So MergeWriterOperator must get a new constructor arg that specifies the channel mapping.
I chatted with @martint, who said that I should expect the blocks to be scrambled, and should use LocalExecutionPlanner's mapping of column numbers to channels.
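A minimal sketch of the kind of channel remapping this refers to, assuming only the public Page/Block SPI; the symbol-to-channel map and the helper name are illustrative, not the actual LocalExecutionPlanner code:

```java
import java.util.List;
import java.util.Map;

import io.trino.spi.Page;
import io.trino.spi.block.Block;

// Illustrative helper: reorder a page's blocks so that output channel i carries the column
// named by expectedLayout.get(i), using the source's symbol -> channel mapping. This is the
// kind of permutation an operator needs when it cannot assume a fixed block order.
public final class ChannelRemapper
{
    private ChannelRemapper() {}

    public static Page remap(Page sourcePage, Map<String, Integer> sourceChannels, List<String> expectedLayout)
    {
        Block[] blocks = new Block[expectedLayout.size()];
        for (int outputChannel = 0; outputChannel < expectedLayout.size(); outputChannel++) {
            int sourceChannel = sourceChannels.get(expectedLayout.get(outputChannel));
            blocks[outputChannel] = sourcePage.getBlock(sourceChannel);
        }
        return new Page(sourcePage.getPositionCount(), blocks);
    }
}
```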
This WIP branch implements both DELETE and UPDATE using the MergeProcessor/MergeWriter infrastructure, conditional on static booleans for now until the PR is nearer to being merged. I've run the existing DELETE and UPDATE tests for both Hive and Iceberg. Of course, the existing UPDATE tests do not test changing partition or bucket columns.
Here are the results from my new branch:

- In Hive, all UPDATE tests pass except one, testUpdateFailsInExplicitTrinoTransaction. It is failing because of a non-matching thrown exception message. To get the message right, HiveMetadata.beginInsertOrMerge needs an error message argument.
- testDeleteWithSubquery is failing. The problem and the fix are the subject of the previous comment.
- In Hive, three delete tests are failing. testDeleteFailsInExplicitTrinoTransaction is failing with the same non-matching thrown exception message as testUpdateFailsInExplicitTrinoTransaction, and the fix is the same. testDeleteWithOriginalFiles and testDeleteWithOriginalFilesWithWhereClause are failing with the error message below. I have yet to look into this problem.
Unsupported local exchange partitioning MERGE [insert = hive:HivePartitioningHandle{buckets=1, hiveTypes=[]}]
Here is the task list:

- Fix the non-matching thrown exception message problem.
- Add the machinery to get the channel mapping from LocalExecutionPlanner to fix testDeleteWithSubquery.
- Investigate the two Hive testDeleteWithOriginalFiles DELETE failures.
- We also need to decide on the details of the rollout process, and whether the new implementations of DELETE and UPDATE are used by default.
> In Hive, all UPDATE tests pass except one, testUpdateFailsInExplicitTrinoTransaction. It is failing because of a non-matching thrown exception message. To get the message right, HiveMetadata.beginInsertOrMerge needs an error message argument.

Or the expected message in the test could be updated.
> Fix the non-matching thrown exception message problem.

A couple of Hive tests are failing because the exception message is wrong, e.g., it says "Inserting into…" instead of "Updating…", etc. Apart from the specific test failures, it makes sense to me for the engine to convey the original operation to the connector.

The minimal fix is to add an enum arg to ConnectorMetadata.beginMerge() to say whether the original operation is a DELETE, UPDATE or MERGE. An alternative would be a new API, maybe (recalling that I suck at names) ConnectorMetadata.beginModifyOperation(). For now I'll add the enum to beginMerge to make progress.
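To make the proposal concrete, here is a hedged sketch of what that enum argument could look like. Both the enum and the modified beginMerge signature are hypothetical (and, per the discussion below, the idea was ultimately dropped):

```java
// Hypothetical sketch only: neither RowChangeOperation nor a beginMerge overload taking it
// exists in the SPI. It merely illustrates "tell the connector which statement the user ran".
public enum RowChangeOperation
{
    DELETE,
    UPDATE,
    MERGE;

    // A connector could use this solely to produce an operation-specific error message, e.g.
    // "This connector does not support " + name().toLowerCase() + "s".
}

// The corresponding (hypothetical) SPI change would be something like:
//
//   ConnectorMergeTableHandle beginMerge(
//           ConnectorSession session,
//           ConnectorTableHandle tableHandle,
//           RowChangeOperation originalOperation);
```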
> Add the machinery to get the channel mapping from LocalExecutionPlanner to fix testDeleteWithSubquery.

I added that machinery, but it didn't fix testDeleteWithSubquery.
> Investigate the two Hive testDeleteWithOriginalFiles DELETE failures.

It looks like these failures are happening because MERGE is assuming that the merge target is bucketed with one bucket, which is false for original files.
I don't think the merge machinery should be polluted with such a flag only for the purpose of producing an error message. We could update the error message and make it a little more generic ("Modifying the contents of table ..."), or just leave it as "Updating ..." as a general way of describing an operation that updates the contents of a table (INSERT, DELETE, UPDATE and MERGE generically "update" a table to have different content).
> I don't think the merge machinery should be polluted with such a flag only for the purpose of producing an error message.

> Or the expected message in the test could be updated.

That's fine with me. I like "Updating ...".
> I don't think the merge machinery should be polluted with such a flag only for the purpose of producing an error message.

> Or the expected message in the test could be updated.

Having looked at the CI build, I'm wondering if this is the right call. There are many test failures, all but a few due to the error message change, e.g. "merge" vs. "update" or "delete". And most are due to the exceptions thrown in ConnectorMetadata default methods.

Right now, the default implementations of ConnectorMetadata.get(Delete|Update|Merge)RowId() and ConnectorMetadata.begin(Delete|Update|Merge) throw an exception saying "This connector does not support (deletes|updates|merges)". But the methods for delete and update are going away.

Clearly the tests can be changed and don't really matter. But the larger point is that I think there is genuine value in providing the user with an error message that is not confusing. If the user ran a delete, it is confusing to tell the user the merge operation failed.
> Right now, the default implementations of ConnectorMetadata.get(Delete|Update|Merge)RowId() and ConnectorMetadata.begin(Delete|Update|Merge) throw an exception saying "This connector does not support (deletes|updates|merges)". But the methods for delete and update are going away.

What's the SPI backward compatibility story for this change?
> What's the SPI backward compatibility story for this change?
@electrum theorizes that we don't need backward compatibility because all connectors that support row-by-row delete and update now support merge: Hive, Raptor, Iceberg, Delta Lake and Kudu.
What about connectors that others have written and maintain outside the Trino tree? IMO even if the SPI is unused in Trino it should stay deprecated for some releases before being removed.
And if you still want to remove it, I'd argue it needs to come with a useful guide on how to migrate to the new SPI -- because to me it doesn't seem straightforward.
> I don't think the merge machinery should be polluted with such a flag only for the purpose of producing an error message.

> Or the expected message in the test could be updated.

> Having looked at the CI build, I'm wondering if this is the right call.

I thought about it a bit more, and decided you guys are right, because everywhere the customer can see the error they can also see the SQL. I pushed a commit that changed the MERGE-related ConnectorMetadata methods to throw an exception with this message:

This connector does not support modifying table rows

and changed the tests that expect to fail doing DELETE, UPDATE or MERGE to expect that message.
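For concreteness, a hedged sketch of a default method after that change; the interface name is invented, and only the NOT_SUPPORTED error-code pattern and the message text come from the comment above:

```java
import io.trino.spi.TrinoException;
import io.trino.spi.connector.ColumnHandle;
import io.trino.spi.connector.ConnectorSession;
import io.trino.spi.connector.ConnectorTableHandle;

import static io.trino.spi.StandardErrorCode.NOT_SUPPORTED;

// Illustrative interface, not the real ConnectorMetadata: connectors that do not implement
// row changes now get one generic failure message for DELETE, UPDATE and MERGE alike.
public interface RowChangeMetadataSketch
{
    default ColumnHandle getMergeRowIdColumnHandle(ConnectorSession session, ConnectorTableHandle tableHandle)
    {
        throw new TrinoException(NOT_SUPPORTED, "This connector does not support modifying table rows");
    }
}
```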
I've been whittling away at the errors turned up by the CI build. There was a serious bug, now fixed, in the calculated order of SET expressions when planning UPDATE. I added a test that showed that MERGE did not have the same bug.
I'm down to the point where it makes sense to list specific test failures:
These methods of the fault-tolerant tests are failing because the partition argument of the SQLTask fragment is ill-formed, containing null for the key column. This causes an exception to be raised before it gets to the injected exception:
io.trino.faulttolerant.delta.TestDeltaTaskFailureRecoveryTest.testDelete
io.trino.faulttolerant.delta.TestDeltaTaskFailureRecoveryTest.testDeleteWithSubquery
io.trino.faulttolerant.delta.TestDeltaTaskFailureRecoveryTest.testUpdate
io.trino.faulttolerant.delta.TestDeltaTaskFailureRecoveryTest.testUpdateWithSubquery
> There was a serious bug, now fixed, in the calculated order of SET expressions when planning UPDATE. I added a test that showed that MERGE did not have the same bug.

What is the bug? If it affects the current release, can you please file an issue?
Here is the complete enumeration of the current test failures in the CI build. They haven't been analyzed yet. Almost all are delete failures, most failing because the connector does not support getMergeRowIdColumnHandle. Yet.
./25_test (plugintrino-oracle).txt: Tests run: 482, Failures: 11, Errors: 0, Skipped: 60
./test (plugintrino-mariadb)/8_Maven Tests.txt: Tests run: 754, Failures: 8, Errors: 0, Skipped: 116
./11_test (plugintrino-delta-lake).txt: Tests run: 1130, Failures: 15, Errors: 0, Skipped: 125
./test (testingtrino-faulttolerant-tests, test-fault-tolerant-iceberg)/8_Maven Tests.txt: Tests run: 38, Failures: 6, Errors: 0, Skipped: 0
./test (plugintrino-postgresql)/8_Maven Tests.txt: Tests run: 522, Failures: 6, Errors: 0, Skipped: 57
./test (plugintrino-phoenix5)/8_Maven Tests.txt: Tests run: 394, Failures: 7, Errors: 0, Skipped: 93
./28_test (plugintrino-iceberg).txt: Tests run: 1871, Failures: 10, Errors: 0, Skipped: 11
./test (coretrino-main)/8_Maven Tests.txt: Tests run: 383, Failures: 0, Errors: 3, Skipped: 0
./6_test (testingtrino-faulttolerant-tests, test-fault-toleran.txt: Tests run: 38, Failures: 6, Errors: 0, Skipped: 0
./test (plugintrino-cassandra)/8_Maven Tests.txt: Tests run: 497, Failures: 12, Errors: 0, Skipped: 109
./30_test (clienttrino-jdbc,plugintrino-base-jdbc,plugintrino-.txt: Tests run: 515, Failures: 9, Errors: 0, Skipped: 62
./test (plugintrino-sqlserver)/8_Maven Tests.txt: Tests run: 533, Failures: 8, Errors: 0, Skipped: 60
./20_test (plugintrino-mysql).txt: Tests run: 889, Failures: 10, Errors: 0, Skipped: 148
./test (testingtrino-faulttolerant-tests, test-fault-tolerant-delta)/8_Maven Tests.txt: Tests run: 19, Failures: 4, Errors: 0, Skipped: 0
./test (plugintrino-mysql)/8_Maven Tests.txt: Tests run: 889, Failures: 10, Errors: 0, Skipped: 148
./test (clienttrino-jdbc,plugintrino-base-jdbc,plugintrino-thrift,plugintrino-memory)/8_Maven Tests.txt: Tests run: 515, Failures: 9, Errors: 0, Skipped: 62
./test (plugintrino-delta-lake)/8_Maven Tests.txt: Tests run: 1130, Failures: 15, Errors: 0, Skipped: 125
./22_test (plugintrino-sqlserver).txt: Tests run: 533, Failures: 8, Errors: 0, Skipped: 60
./9_test (plugintrino-cassandra).txt: Tests run: 497, Failures: 12, Errors: 0, Skipped: 109
./7_test (plugintrino-raptor-legacy).txt: Tests run: 1134, Failures: 9, Errors: 0, Skipped: 93
./5_test (testingtrino-faulttolerant-tests, test-fault-toleran.txt: Tests run: 19, Failures: 4, Errors: 0, Skipped: 0
./26_test (plugintrino-kudu).txt: Tests run: 488, Failures: 30, Errors: 0, Skipped: 84
./24_test (plugintrino-mariadb).txt: Tests run: 754, Failures: 8, Errors: 0, Skipped: 116
./29_test (plugintrino-phoenix5).txt: Tests run: 394, Failures: 7, Errors: 0, Skipped: 93
./test (plugintrino-raptor-legacy)/8_Maven Tests.txt: Tests run: 1134, Failures: 9, Errors: 0, Skipped: 93
./21_test (plugintrino-postgresql).txt: Tests run: 522, Failures: 6, Errors: 0, Skipped: 57
./test (plugintrino-iceberg)/8_Maven Tests.txt: Tests run: 1871, Failures: 10, Errors: 0, Skipped: 11
./1_test (coretrino-main).txt: Tests run: 383, Failures: 0, Errors: 3, Skipped: 0
./test (plugintrino-kudu)/8_Maven Tests.txt: Tests run: 488, Failures: 30, Errors: 0, Skipped: 84
./test (plugintrino-oracle)/8_Maven Tests.txt: Tests run: 482, Failures: 11, Errors: 0, Skipped: 60
> What is the bug? If it affects the current release, can you please file an issue?

I should have been clearer -- the "serious bug" was in this PR. There is no similar bug in tip-of-master UPDATE or MERGE. (Though I added another test to this PR to prove to myself that MERGE was fine.)
I've done a bit more analysis of the test failures above. Here is the summary:

- The failures in connectors based on JdbcMetadata are caused because they have SUPPORTS_DELETE, but deletes planned using the merge components fail because getMergeRowIdColumnHandle is not implemented. I'm not completely clear on how it should be implemented.
- TestIcebergMinioParquetConnectorSmokeTest.testDeleteRowsConcurrently is failing, along with the known failure of BaseConnectorTest.testDeleteWithSubquery.
- In the fault-tolerant tests, the partition argument of the SQLTask fragment is ill-formed, containing null for the key column.
- BaseConnectorTest.testDeleteWithSubquery and TestRaptorBucketedConnectorTest.testUpdateRowConcurrently are failing.
- StatementAnalyzer calls getMergeRowIdColumnHandle, which is undefined.

Tasks:

- Define DefaultJdbcMetadata.getMergeRowIdColumnHandle (a sketch of the general shape is below). It appears that will get all the SQL database connectors passing all tests.
- Fix the ill-formed partition arguments in the fault-tolerant tests.
- Fix testDeleteWithSubquery.
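For reference, a hedged sketch of the SPI method the first task is about; the handle record and the "$row_id" name are hypothetical and say nothing about how DefaultJdbcMetadata actually implements it:

```java
import io.trino.spi.connector.ColumnHandle;
import io.trino.spi.connector.ConnectorSession;
import io.trino.spi.connector.ConnectorTableHandle;

// Illustrative only: the engine asks the connector for a column handle that identifies a row,
// so that rows flowing through the merge machinery can be routed back to the right place.
public final class RowIdSketch
{
    private RowIdSketch() {}

    // Made-up handle type; real connectors return their own ColumnHandle implementations.
    public record ExampleRowIdHandle(String name)
            implements ColumnHandle {}

    public static ColumnHandle getMergeRowIdColumnHandle(ConnectorSession session, ConnectorTableHandle tableHandle)
    {
        // For a SQL database this might name the primary key or a synthetic row identifier.
        return new ExampleRowIdHandle("$row_id");
    }
}
```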
After defining DefaultJdbcMetadata.getMergeRowIdColumnHandle, there is only one failure left in all the SQL database test suites, in BaseConnectorTest.testDelete. The test fails because beginMerge is called when running this delete, which deletes no rows. I'm guessing the delete was supposed to be pushed down to the connector but the optimization failed:
assertUpdate("DELETE FROM " + table.getName() + " WHERE orderkey > 5 AND orderkey < 4", 0);
The plan is helpfully logged:
Output[columnNames = [rows]]
│ Layout: [rows:bigint]
│ Estimates:
└─ TableCommit[target = mysql:tpch.test_delete_hzholi6dug tpch.test_delete_hzholi6dug]
│ Layout: [rows:bigint]
│ Estimates:
└─ LocalExchange[partitioning = SINGLE]
│ Layout: [partialrows:bigint, fragment:varbinary]
│ Estimates:
└─ RemoteExchange[type = GATHER]
│ Layout: [partialrows:bigint, fragment:varbinary]
│ Estimates:
└─ MergeWriter[table = mysql:tpch.test_delete_hzholi6dug tpch.test_delete_hzholi6dug]
│ Layout: [partialrows:bigint, fragment:varbinary]
│ Estimates:
└─ LocalExchange[partitioning = SINGLE]
│ Layout: [orderkey:bigint, custkey:bigint, orderstatus:varchar(255), totalprice:double, orderdate:date, orderpriority:varchar(255), clerk:varchar(255), shippriority:integer, comment:varchar(255), operation:tinyint, field:bigint, insert_from_update:tinyint]
│ Estimates:
└─ RemoteExchange[type = REPARTITION]
│ Layout: [orderkey:bigint, custkey:bigint, orderstatus:varchar(255), totalprice:double, orderdate:date, orderpriority:varchar(255), clerk:varchar(255), shippriority:integer, comment:varchar(255), operation:tinyint, field:bigint, insert_from_update:tinyint]
│ Estimates:
└─ Project[]
│ Layout: [orderkey:bigint, custkey:bigint, orderstatus:varchar(255), totalprice:double, orderdate:date, orderpriority:varchar(255), clerk:varchar(255), shippriority:integer, comment:varchar(255), operation:tinyint, field:bigint, insert_from_update:tinyint]
│ Estimates:
│ orderkey := null
│ custkey := null
│ orderstatus := null
│ totalprice := null
│ orderdate := null
│ orderpriority := null
│ clerk := null
│ shippriority := null
│ comment := null
│ operation := (CASE WHEN (("orderkey" > BIGINT '5') AND ("orderkey" < BIGINT '4')) THEN TINYINT '2' ELSE TINYINT '0' END)
│ insert_from_update := TINYINT '0'
└─ LocalExchange[partitioning = ROUND_ROBIN]
│ Layout: [orderkey:bigint, field:bigint]
│ Estimates:
└─ Values[]
Layout: [orderkey:bigint, field:bigint]
Estimates:
@martint and I looked at the failure in BaseConnectorTest.testDelete, and concluded that we needed an optimizer analogous to RemoveEmptyDeleteRuleSet for MergeWriter.
We also looked at the failure of BaseConnectorTest.testDeleteWithSubquery, and concluded that the problem is in LocalExecutionPlanner.visitMergeWriter, because that code is assuming that the source layout and the node's output layout are the same. The two layouts are different.
I pushed a commit that implements the RemoveEmptyDeleteRuleSet analog for MergeWriter, and it fixes the problem shown by BaseConnectorTest.testDelete.

I pushed a second commit that fixes BaseConnectorTest.testDeleteWithSubquery. In the previous code, the pagePreProcessor created by LocalExecutionPlanner.visitMergeWriter was a NOOP, because the source outputs were used for both layouts. The one-line change uses MergeWriterNode.projectedSymbols as the expected layout.
With these commits the failure count is reduced enough that it becomes enumerable. There are 39 test failures, of which 20 are Kudu, which is totally broken for DELETE and UPDATE. Most of the rest of the failures are related to apparently illegal partition schemes. Kudu is the only CHANGE_ONLY_UPDATED_COLUMNS connector, and that may have something to do with those failures.
Here are the test failures, with brief explanations:
// These fault-tolerant tests are failing because a different exception got raised due to illegal
// PartitionSchemas with message "Bucket to partition must be set before a partition function can be created"
./test (testingtrino-faulttolerant-tests, test-fault-tolerant-iceberg)/8_Maven Tests.txt-79492-2022-09-08T21:41:35.0768019Z [ERROR] TestIcebergQueryFailureRecoveryTest>BaseIcebergFailureRecoveryTest.testDelete:101
./test (testingtrino-faulttolerant-tests, test-fault-tolerant-iceberg)/8_Maven Tests.txt-79494-2022-09-08T21:41:35.0768932Z [ERROR] TestIcebergQueryFailureRecoveryTest>BaseIcebergFailureRecoveryTest.testUpdate:187
./test (testingtrino-faulttolerant-tests, test-fault-tolerant-iceberg)/8_Maven Tests.txt-79496-2022-09-08T21:41:35.0770606Z [ERROR] TestIcebergTaskFailureRecoveryTest>BaseIcebergFailureRecoveryTest.testDelete:74->BaseIcebergFailureRecoveryTest.lambda$testDelete$0:74
./6_test (testingtrino-faulttolerant-tests, test-fault-toleran.txt-80504-2022-09-08T21:41:35.0768022Z [ERROR] TestIcebergQueryFailureRecoveryTest>BaseIcebergFailureRecoveryTest.testDelete:101
./6_test (testingtrino-faulttolerant-tests, test-fault-toleran.txt-80506-2022-09-08T21:41:35.0768935Z [ERROR] TestIcebergQueryFailureRecoveryTest>BaseIcebergFailureRecoveryTest.testUpdate:187
./6_test (testingtrino-faulttolerant-tests, test-fault-toleran.txt-80508-2022-09-08T21:41:35.0770611Z [ERROR] TestIcebergTaskFailureRecoveryTest>BaseIcebergFailureRecoveryTest.testDelete:74->BaseIcebergFailureRecoveryTest.lambda$testDelete$0:74
// Getting the wrong error message
./test (plugintrino-iceberg)/8_Maven Tests.txt-73311-2022-09-08T21:36:51.0857188Z [ERROR] TestIcebergAvroConnectorTest>BaseIcebergConnectorTest.testModifyingOldSnapshotIsNotPossible:5265
./28_test (plugintrino-iceberg).txt-74322-2022-09-08T21:36:51.0857200Z [ERROR] TestIcebergAvroConnectorTest>BaseIcebergConnectorTest.testModifyingOldSnapshotIsNotPossible:5265
// PartitionSchemas with message "Bucket to partition must be set before a partition function can be created"
./test (testingtrino-faulttolerant-tests, test-fault-tolerant-delta)/8_Maven Tests.txt-24288-2022-09-08T21:29:50.3089659Z [ERROR] TestDeltaTaskFailureRecoveryTest>BaseDeltaFailureRecoveryTest.testDelete:62->BaseDeltaFailureRecoveryTest.lambda$testDelete$0:62
./5_test (testingtrino-faulttolerant-tests, test-fault-toleran.txt-25299-2022-09-08T21:29:50.3089689Z [ERROR] TestDeltaTaskFailureRecoveryTest>BaseDeltaFailureRecoveryTest.testDelete:62->BaseDeltaFailureRecoveryTest.lambda$testDelete$0:62
// It's expecting a specific set of data files. I'm not sure that's reasonable
./test (plugintrino-delta-lake)/8_Maven Tests.txt-24576-2022-09-08T21:36:40.8853378Z [ERROR] TestDeltaLakeConnectorSmokeTest>BaseDeltaLakeConnectorSmokeTest.testVacuum:1444 expected:<...w4fcue7sc/regionkey=[4/20220908_211930_00531_s7vxm-a3d7c348-1d06-416d-85ec-61cdb2032f70",
./11_test (plugintrino-delta-lake).txt-25587-2022-09-08T21:36:40.8853396Z [ERROR] TestDeltaLakeConnectorSmokeTest>BaseDeltaLakeConnectorSmokeTest.testVacuum:1444 expected:<...w4fcue7sc/regionkey=[4/20220908_211930_00531_s7vxm-a3d7c348-1d06-416d-85ec-61cdb2032f70",
// StatementAnalyzer calling getMergeRowIdColumnHandle results in "This connector does not support modifying table rows"
./test (plugintrino-cassandra)/8_Maven Tests.txt-7423-2022-09-08T21:20:18.8470038Z [ERROR] TestCassandraConnectorTest.testDelete:1261->execute:1421 » QueryFailed This co...
./test (plugintrino-cassandra)/8_Maven Tests.txt-7424-2022-09-08T21:20:18.8471254Z [ERROR] TestCassandraConnectorTest.testDeleteWithComplexPredicate:1288
./9_test (plugintrino-cassandra).txt-8432-2022-09-08T21:20:18.8470049Z [ERROR] TestCassandraConnectorTest.testDelete:1261->execute:1421 » QueryFailed This co...
./9_test (plugintrino-cassandra).txt-8433-2022-09-08T21:20:18.8471261Z [ERROR] TestCassandraConnectorTest.testDeleteWithComplexPredicate:1288
// Message "Table was updated by a different transaction. Please retry the operation."
./7_test (plugintrino-raptor-legacy).txt-23170-2022-09-08T21:28:56.9760423Z [ERROR] io.trino.plugin.raptor.legacy.TestRaptorBucketedConnectorTest.testUpdateRowConcurrently
./7_test (plugintrino-raptor-legacy).txt-23177-2022-09-08T21:28:56.9772731Z [ERROR] io.trino.plugin.raptor.legacy.TestRaptorConnectorTest.testUpdateRowConcurrently
./test (plugintrino-raptor-legacy)/8_Maven Tests.txt-22154-2022-09-08T21:28:56.9760410Z [ERROR] io.trino.plugin.raptor.legacy.TestRaptorBucketedConnectorTest.testUpdateRowConcurrently
// Every Kudu UPDATE and DELETE test fails. Kudu is the only CHANGE_ONLY_UPDATED_COLUMNS connector
./test (plugintrino-kudu)/8_Maven Tests.txt-15811-2022-09-08T21:21:16.6856934Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDelete:3146->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_delete_1pwukwds6h WHERE custkey <= 100
./test (plugintrino-kudu)/8_Maven Tests.txt-15812-2022-09-08T21:21:16.6864109Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDeleteAllDataFromTable:3349 » QueryFailed
./test (plugintrino-kudu)/8_Maven Tests.txt-15813-2022-09-08T21:21:16.6869039Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDeleteWithComplexPredicate:3202->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_delete_complex_qyl5tkswj9 WHERE orderkey % 2 = 0
./test (plugintrino-kudu)/8_Maven Tests.txt-15814-2022-09-08T21:21:16.6870407Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDeleteWithLike:3189->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_with_like_157xxf80y8 WHERE name LIKE '%a%'
./test (plugintrino-kudu)/8_Maven Tests.txt-15815-2022-09-08T21:21:16.6871445Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDeleteWithSemiJoin:3277->AbstractTestQueryFramework.assertUpdate:318->AbstractTestQueryFramework.assertUpdate:323 » QueryFailed
./test (plugintrino-kudu)/8_Maven Tests.txt-15816-2022-09-08T21:21:16.6872430Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDeleteWithSubquery:3221->AbstractTestQueryFramework.assertUpdate:318->AbstractTestQueryFramework.assertUpdate:323 » QueryFailed
./test (plugintrino-kudu)/8_Maven Tests.txt-15817-2022-09-08T21:21:16.6876189Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDeleteWithVarcharPredicate:3310->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_delete_with_varchar_predicate_oxhqby6wpr WHERE orderstatus = 'O'
./test (plugintrino-kudu)/8_Maven Tests.txt-15818-2022-09-08T21:21:16.6879237Z [ERROR] TestKuduConnectorTest.testExplainAnalyzeWithDeleteWithSubquery:338->AbstractTestQueryFramework.assertExplainAnalyze:434->AbstractTestQueryFramework.assertExplainAnalyze:439->AbstractTestQueryFramework.assertExplainAnalyze:448->AbstractTestQueryFramework.computeActual:217 » QueryFailed
./test (plugintrino-kudu)/8_Maven Tests.txt-15819-2022-09-08T21:21:16.6880319Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testRowLevelDelete:3360->AbstractTestQueryFramework.assertUpdate:318->AbstractTestQueryFramework.assertUpdate:323 » QueryFailed
./test (plugintrino-kudu)/8_Maven Tests.txt-15820-2022-09-08T21:21:16.6881183Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testUpdate:3370->AbstractTestQueryFramework.assertQueryFails:348
./26_test (plugintrino-kudu).txt-16824-2022-09-08T21:21:16.6856977Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDelete:3146->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_delete_1pwukwds6h WHERE custkey <= 100
./26_test (plugintrino-kudu).txt-16825-2022-09-08T21:21:16.6864135Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDeleteAllDataFromTable:3349 » QueryFailed
./26_test (plugintrino-kudu).txt-16826-2022-09-08T21:21:16.6869063Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDeleteWithComplexPredicate:3202->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_delete_complex_qyl5tkswj9 WHERE orderkey % 2 = 0
./26_test (plugintrino-kudu).txt-16827-2022-09-08T21:21:16.6870413Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDeleteWithLike:3189->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_with_like_157xxf80y8 WHERE name LIKE '%a%'
./26_test (plugintrino-kudu).txt-16828-2022-09-08T21:21:16.6871451Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDeleteWithSemiJoin:3277->AbstractTestQueryFramework.assertUpdate:318->AbstractTestQueryFramework.assertUpdate:323 » QueryFailed
./26_test (plugintrino-kudu).txt-16829-2022-09-08T21:21:16.6874252Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDeleteWithSubquery:3221->AbstractTestQueryFramework.assertUpdate:318->AbstractTestQueryFramework.assertUpdate:323 » QueryFailed
./26_test (plugintrino-kudu).txt-16830-2022-09-08T21:21:16.6876198Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testDeleteWithVarcharPredicate:3310->AbstractTestQueryFramework.assertUpdate:298->AbstractTestQueryFramework.assertUpdate:303 Execution of 'actual' query failed: DELETE FROM test_delete_with_varchar_predicate_oxhqby6wpr WHERE orderstatus = 'O'
./26_test (plugintrino-kudu).txt-16831-2022-09-08T21:21:16.6879250Z [ERROR] TestKuduConnectorTest.testExplainAnalyzeWithDeleteWithSubquery:338->AbstractTestQueryFramework.assertExplainAnalyze:434->AbstractTestQueryFramework.assertExplainAnalyze:439->AbstractTestQueryFramework.assertExplainAnalyze:448->AbstractTestQueryFramework.computeActual:217 » QueryFailed
./26_test (plugintrino-kudu).txt-16832-2022-09-08T21:21:16.6880477Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testRowLevelDelete:3360->AbstractTestQueryFramework.assertUpdate:318->AbstractTestQueryFramework.assertUpdate:323 » QueryFailed
./26_test (plugintrino-kudu).txt-16833-2022-09-08T21:21:16.6881188Z [ERROR] TestKuduConnectorTest>BaseConnectorTest.testUpdate:3370->AbstractTestQueryFramework.assertQueryFails:348
All the Kudu delete and update tests are failing, and the problem has nothing to do with Kudu being a CHANGE_ONLY_UPDATED_COLUMNS connector.

The tests are failing because the connector uses the presence of any rowId column as a signal to create a KuduUpdatablePageSource; but more fundamentally, the KuduRecordCursor created to read the file has no rowId column, so the copyPrimaryKey operation fails.
The real reason that the Kudu connector was failing all DELETE and UPDATE tests was that those tests didn't use the createTableForWrites mechanism, so the tables didn't have the primary key declaration that Kudu requires. Converting the tests to use createTableForWrites is not simple, because most of the tests use CTAS, and it seems the create must be separated from the insert in order to specify partitioning.
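To show what that separation looks like in practice, here is a hedged sketch of the test change; the execute callback stands in for the test framework's update helper, and the properties in the WITH clauses are illustrative rather than exact connector property names:

```java
import java.util.function.Consumer;

// Illustrative sketch of replacing a single CTAS with CREATE TABLE + INSERT so that
// connector-specific requirements (such as a Kudu primary key and partitioning) can be
// declared on the table before it is populated.
public final class CreateTableForWritesSketch
{
    private CreateTableForWritesSketch() {}

    public static void createAndPopulate(Consumer<String> execute, String tableName)
    {
        // Previously: CREATE TABLE <name> AS SELECT orderkey, custkey FROM orders
        execute.accept("CREATE TABLE " + tableName + " (" +
                "orderkey bigint WITH (primary_key = true), " +
                "custkey bigint) " +
                "WITH (partition_by_hash_columns = ARRAY['orderkey'], partition_by_hash_buckets = 2)");
        execute.accept("INSERT INTO " + tableName + " SELECT orderkey, custkey FROM orders");
    }
}
```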
I pushed a commit, later to be squashed, fixing the Raptor failures.
The BaseConnectorTest DELETE and UPDATE tests mostly used CTAS, but connectors like Kudu need a primary key declared in each table. In order to use the createTableForWrites mechanism, I separated table creation from insertions.

With that change, all but two of the Kudu DELETE and UPDATE tests now pass. One of the failures is doing an EXPLAIN ANALYZE CREATE TABLE AS SELECT, and I don't see quite how to intercede to add the primary key declaration. The second is testUpdateRowConcurrently, which fails with the backtrace below. I haven't debugged it yet.
2022-09-12T13:44:29.266-0500 ERROR SplitRunner-4-149 io.trino.execution.executor.TaskExecutor Error processing Split 20220912_184429_00112_hdtqh.1.0.0-4 (start = 2.09955255638553E9, wall = 54 ms, cpu = 0 ms, wait = 1 ms, calls = 2): GENERIC_INTERNAL_ERROR: Error while applying 1 kudu operation(s); First error: DELETE: Row error for primary key=[-128, 0, 0, 0], tablet=null, server=12977068ee544455ad936decc82bdcee, status=Not found: key not found (error 0)
io.trino.spi.TrinoException: Error while applying 1 kudu operation(s); First error: DELETE: Row error for primary key=[-128, 0, 0, 0], tablet=null, server=12977068ee544455ad936decc82bdcee, status=Not found: key not found (error 0)
at org.apache.kudu.client.KuduOperationApplier.verifyNoErrors(KuduOperationApplier.java:96)
at org.apache.kudu.client.KuduOperationApplier.close(KuduOperationApplier.java:109)
at io.trino.plugin.kudu.KuduPageSink.storeMergedRows(KuduPageSink.java:250)
at io.trino.operator.MergeWriterOperator.addInput(MergeWriterOperator.java:113)
at io.trino.operator.Driver.processInternal(Driver.java:416)
Hi @electrum,
This is not solved. Test case:
CREATE TABLE axprod.main.test2 (
order_date TIMESTAMP(6),
customer VARCHAR
)
WITH (partitioning = ARRAY['month(order_date)'])
update axprod.main.test2 set order_date = cast('2022-01-01 23:50' as timestamp(6)) where customer in (select customer from axprod.main.test2 where 1=2)
Error:
Invalid descendant for DeleteNode or UpdateNode: io.trino.sql.planner.plan.ExchangeNode
I've been tracking this in #12422 but I was redirected to this issue by @findepi as a duplicate. However, the problem remains open.
@aletc1 this change is not released yet, it's coming in Trino 404.
Implementing SQL DELETE and UPDATE using the MERGE machinery has lots of benefits. The strategy is simple: the statement analyzer code for DELETE and UPDATE will be almost unchanged. The query planner will continue to plan DELETE and UPDATE separately from MERGE. However, the DELETE and UPDATE plans will create pages that can be passed to MergeRowProcessor.transformPage.
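To illustrate what "pages that can be passed to transformPage" means, a minimal hedged sketch: it shows only the idea of tagging every row with an operation number (the plan shown earlier uses TINYINT '2' for rows to delete); the class and method names are invented for the example:

```java
import io.trino.spi.Page;
import io.trino.spi.block.Block;
import io.trino.spi.block.BlockBuilder;

import static io.trino.spi.type.TinyintType.TINYINT;

// Illustrative only: build a page whose last channel is an "operation" column marking every
// row with the DELETE operation number, matching the CASE expression in the plan above.
public final class DeleteAsMergePageSketch
{
    private DeleteAsMergePageSketch() {}

    public static Page markAllRowsForDelete(Block rowIdBlock)
    {
        int positions = rowIdBlock.getPositionCount();
        BlockBuilder operation = TINYINT.createBlockBuilder(null, positions);
        for (int i = 0; i < positions; i++) {
            TINYINT.writeLong(operation, 2); // 2 = delete, as in the "operation := ... TINYINT '2'" projection
        }
        return new Page(rowIdBlock, operation.build());
    }
}
```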