Open · hashhar opened this issue 3 years ago
The reason for the failure is that the DecoderModule (presto-row-decoders) installs a DecoderFactory with the name "dummy", but the EncoderModule doesn't.
Then, in KafkaMetadata, we assign "dummy" to dataFormat if it's missing: https://github.com/prestosql/presto/blob/master/presto-kafka/src/main/java/io/prestosql/plugin/kafka/KafkaMetadata.java#L105.
Now, when the DispatchingRowEncoderFactory tries to find a factory named "dummy", it can't find one, so the precondition check fails at https://github.com/prestosql/presto/blob/master/presto-kafka/src/main/java/io/prestosql/plugin/kafka/encoder/DispatchingRowEncoderFactory.java#L40.
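To make the failure mode concrete, here is a self-contained sketch of the dispatch logic described above. The class and registry here are illustrative, not Presto's actual code: it shows how a missing `key.dataFormat` gets defaulted to "dummy" and then fails the lookup, because nothing is registered under that name on the encoder side.

```java
import static java.lang.String.format;

import java.util.Map;

// Illustrative sketch (not the actual Presto classes): a dispatching factory
// that looks up an encoder factory by data format name. When KafkaMetadata
// defaults a missing dataFormat to "dummy" and no encoder factory was
// registered under that name, the precondition fails as shown here.
public class DispatchSketch
{
    // hypothetical registry; only real formats are registered on the encoder side
    private static final Map<String, String> FACTORIES = Map.of(
            "avro", "AvroRowEncoderFactory",
            "json", "JsonRowEncoderFactory");

    static String create(String dataFormat)
    {
        // mirrors the checkArgument-style precondition in DispatchingRowEncoderFactory
        if (!FACTORIES.containsKey(dataFormat)) {
            throw new IllegalArgumentException(format("unknown data format '%s'", dataFormat));
        }
        return FACTORIES.get(dataFormat);
    }

    public static void main(String[] args)
    {
        String dataFormat = null; // key.dataFormat missing from the table definition
        String effective = (dataFormat == null) ? "dummy" : dataFormat; // KafkaMetadata fallback
        try {
            create(effective);
        }
        catch (IllegalArgumentException e) {
            System.out.println("INSERT fails: " + e.getMessage());
        }
    }
}
```

Note that the decoder side works by accident only because a "dummy" decoder factory happens to be installed; the encoder side has no such entry, which is why only INSERTs break.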
One option I see is to add a DUMMY row encoder too, one that writes null bytes. However, null bytes can mean something different in each data format. (I've tried this and it works, but it can leave people confused about why all the inserted data is NULL.)
A simpler and more correct option would be to improve the exception message in the precondition check at https://github.com/prestosql/presto/blob/master/presto-kafka/src/main/java/io/prestosql/plugin/kafka/encoder/DispatchingRowEncoderFactory.java#L40 and to perform the check earlier (ideally during connector startup).
Creating a table definition without `key.dataFormat` causes INSERTs to fail in the Kafka connector. This can be observed with the following test case (place it into TestKafkaAvroSmokeTest):
Adding the following to the table definition is sufficient to make the tests pass.
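The original snippet isn't reproduced here; purely as an illustration, a table definition that sets `key.dataFormat` explicitly might look like the following (table, topic, and schema path names are hypothetical, and the field layout follows the Kafka connector's table definition file format):

```json
{
    "tableName": "example_table",
    "schemaName": "default",
    "topicName": "example_topic",
    "key": {
        "dataFormat": "avro",
        "dataSchema": "/etc/kafka/example_key.avsc",
        "fields": [
            {
                "name": "kafka_key",
                "type": "BIGINT",
                "mapping": "id"
            }
        ]
    }
}
```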
This means that either the documentation is misleading (see https://prestosql.io/docs/current/connector/kafka.html#table-definition-files) or our implementation doesn't match our expectations.
The offending code is at https://github.com/prestosql/presto/blob/a82b00ba9c64134c14f94e82069a793dc6aa8e33/presto-kafka/src/main/java/io/prestosql/plugin/kafka/KafkaPageSinkProvider.java#L80.
Stack trace of failure:
cc: @findepi