confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io

Reorder messages by timestamp #8440

Open ybyzek opened 2 years ago

ybyzek commented 2 years ago

Is your feature request related to a problem? Please describe.

Use case 1: We receive stock market data from an external source, but the source sometimes delivers it out of order. We need to re-order the records by timestamp within a certain time window (say, 30 seconds) in ksqlDB so that the downstream topics receive the results in the correct order.

Use case 2: Game provider clients can go offline and accumulate messages; when they come back online, the messages are (sometimes) delivered. We need to re-order the messages for proper processing.

Describe the solution you'd like

Built-in function that re-orders records within a given window.
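
To make the request concrete, here is a minimal sketch of the buffering behavior such a function might have. It is plain Python and not ksqlDB or Kafka Streams code; the names `ReorderBuffer`, `reorder`, and `window_ms` are hypothetical, and the highest timestamp seen so far is assumed to stand in for stream time.

```python
import heapq
from itertools import count


class ReorderBuffer:
    """Hypothetical helper: holds records and releases them in timestamp
    order once they are at least `window_ms` older than the highest
    timestamp seen so far."""

    def __init__(self, window_ms):
        self.window_ms = window_ms
        self._heap = []        # min-heap of (timestamp, seq, value)
        self._seq = count()    # tie-breaker: keeps arrival order for equal timestamps
        self._max_ts = None    # highest timestamp observed (assumed stream time)

    def add(self, timestamp_ms, value):
        """Buffer one record and return the records that are now safe to emit."""
        heapq.heappush(self._heap, (timestamp_ms, next(self._seq), value))
        if self._max_ts is None or timestamp_ms > self._max_ts:
            self._max_ts = timestamp_ms
        released = []
        while self._heap and self._heap[0][0] <= self._max_ts - self.window_ms:
            ts, _, v = heapq.heappop(self._heap)
            released.append((ts, v))
        return released

    def flush(self):
        """Emit everything still buffered, in timestamp order."""
        released = []
        while self._heap:
            ts, _, v = heapq.heappop(self._heap)
            released.append((ts, v))
        return released


def reorder(records, window_ms):
    """Run an iterable of (timestamp_ms, value) pairs through the buffer."""
    buf = ReorderBuffer(window_ms)
    out = []
    for ts, value in records:
        out.extend(buf.add(ts, value))
    out.extend(buf.flush())
    return out
```

With a 30-second window, `reorder([(1000, "a"), (3000, "c"), (2000, "b"), (40000, "d"), (35000, "e")], 30000)` yields the records sorted by timestamp. A record that arrives more than `window_ms` behind the highest timestamp is released immediately in this sketch, so it can still appear out of order downstream; how to treat such late records is a design choice the built-in function would have to make.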

Describe alternatives you've considered

Kafka Streams example: https://github.com/confluentinc/kafka-streams-examples/pull/411


ybyzek commented 2 years ago

At a high level, this seems related to ORDER BY (https://github.com/confluentinc/ksql/issues/1572)

mjsax commented 2 years ago

Just to dump a few thoughts:

gphilipp commented 1 year ago

It's a problem that we are currently facing too. For example, the Salesforce source Kafka Connect connector produces messages without a key. If you store those messages in a topic with multiple partitions, they will end up in random partitions, and you may process them out of order.
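
To illustrate why a missing key matters, here is a rough Python sketch of key-based partition assignment. The function `choose_partition` is hypothetical, and `zlib.crc32` is only a stand-in hash for illustration; Kafka's own partitioner uses a different hash. The point is only that records sharing a key map to the same partition, which keeps their relative order, while keyless records have nothing to anchor them.

```python
import zlib


def choose_partition(key, num_partitions):
    """Map a string key to a partition index.

    crc32 is an illustrative stand-in, not the hash Kafka actually uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Records with the same key always land in the same partition,
# so a single consumer sees them in the order they were written.
same_key = {choose_partition("account-42", 6) for _ in range(100)}
```

Here `same_key` contains exactly one partition index. Without a key there is no such guarantee, so related records can be spread across partitions and consumed in a different order than they were produced.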