A table with primary key only columns with cdc enabled, will results in errors in the connector. My initial guess is primary key only tables are not supported because we rely on the value part of the key value schema (that is generated from non-primary key columns) to indicate a delete by passing null values. In the case of primary key only, it is not clear how deletes will be differentiated from inserts because the value is always null.
To reproduce:
Create a table with PK only
CREATE TABLE source.results (
uuid text,
value int,
PRIMARY KEY ((uuid, value))
)
WITH cdc = TRUE;
Enable CDC connectors
The following error will be logged and no events will be sent to the data topic
2023-04-11T15:35:02,097+0000 [public/cdc-test/origin-results-0] ERROR org.apache.pulsar.functions.instance.JavaInstanceRunnable - [public/cdc-test/origin-results:0] Uncaught exception in Java Instance
java.util.concurrent.CompletionException: com.datastax.oss.driver.api.core.servererrors.SyntaxError: line 1:7 no viable alternative at input 'FROM' (SELECT [FROM]...)
at java.util.concurrent.CompletableFuture.reportJoin(CompletableFuture.java:412) ~[?:?]
at java.util.concurrent.CompletableFuture.join(CompletableFuture.java:2044) ~[?:?]
at com.datastax.oss.pulsar.source.CassandraSource.batchRead(CassandraSource.java:574) ~[?:?]
at com.datastax.oss.pulsar.source.CassandraSource.maybeBatchRead(CassandraSource.java:463) ~[?:?]
at com.datastax.oss.pulsar.source.CassandraSource.read(CassandraSource.java:455) ~[?:?]
The desired behavior
Have the events go through (consider using a metadata in the schema to indicate DELETE vs INSERT)
A table with primary key only columns with cdc enabled, will results in errors in the connector. My initial guess is primary key only tables are not supported because we rely on the value part of the key value schema (that is generated from non-primary key columns) to indicate a delete by passing null values. In the case of primary key only, it is not clear how deletes will be differentiated from inserts because the value is always null.
To reproduce:
The desired behavior