redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com

datalake: split record translator into distinct modes #24220

Closed andrwng closed 1 week ago

andrwng commented 1 week ago

Splits the record_translator into two implementations: one that translates key-value records, and another that expects a schema. The expectation is that users will explicitly choose which to use based on a topic property.

With this change, key-value tables will have a redpanda.value column, while structured tables will only have user-defined columns instead.
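As a rough illustration of the split, here is a minimal Python sketch. It is not the C++ code in this PR: the class names, the dict-based row shape, and the use of JSON as a stand-in for a schema-aware decoder are all assumptions made for the example. Only the `redpanda.value` column name comes from the description above.

```python
import json
from typing import Any, Dict, List


class KeyValueTranslator:
    """Hypothetical key-value mode: the raw value lands in one opaque column."""

    def translate(self, value: bytes) -> Dict[str, Any]:
        return {"redpanda.value": value}


class StructuredTranslator:
    """Hypothetical structured mode: only schema-defined columns are emitted."""

    def __init__(self, schema_fields: List[str]) -> None:
        self.schema_fields = schema_fields

    def translate(self, value: bytes) -> Dict[str, Any]:
        # JSON stands in for a real schema-based decoder (assumption).
        decoded = json.loads(value)
        return {name: decoded[name] for name in self.schema_fields}


record = b'{"user_id": 7, "action": "login"}'
kv_row = KeyValueTranslator().translate(record)
structured_row = StructuredTranslator(["user_id", "action"]).translate(record)
```

In this sketch the key-value row carries the undecoded bytes under `redpanda.value`, while the structured row contains only the schema's fields and no `redpanda.value` column.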

A "default translator" remains, with properties similar to the prior record_translator, but only to smooth the transition until we have a topic config.

NOTE: this commit leaves the invalid record handling as is; that handling previously allowed going between key-value and structured data. Once we have a more explicit toggle, it should be replaced with a dead-letter table or by dropping records on the floor.
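One possible shape for the dead-letter option is sketched below in Python. This is an assumption about a future design, not behavior implemented by this PR: the `translate_batch` function, the JSON decoding, and the in-memory `dead_letters` list are hypothetical stand-ins for whatever the eventual translator and dead-letter table look like.

```python
import json
from typing import Any, Dict, List, Tuple


def translate_batch(
    values: List[bytes], schema_fields: List[str]
) -> Tuple[List[Dict[str, Any]], List[bytes]]:
    """Split a batch into translated rows and records bound for a dead-letter table."""
    rows: List[Dict[str, Any]] = []
    dead_letters: List[bytes] = []
    for value in values:
        try:
            decoded = json.loads(value)  # stand-in for schema-based decoding
            rows.append({name: decoded[name] for name in schema_fields})
        except (ValueError, KeyError, TypeError):
            # Undecodable or schema-mismatched records are set aside
            # instead of falling back to a key-value representation.
            dead_letters.append(value)
    return rows, dead_letters


rows, dead_letters = translate_batch(
    [b'{"user_id": 7}', b"not json", b'{"other": 1}'], ["user_id"]
)
```

Dropping records on the floor would correspond to discarding `dead_letters` rather than persisting it.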

Backports Required

Release Notes

vbotbuildovich commented 1 week ago

/backport v24.3.x