confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io

Additional pseudo-columns for OFFSET and PARTITION #976

Closed blueedgenick closed 2 years ago

blueedgenick commented 6 years ago

When using KSQL to inspect data or troubleshoot a potential problem with topic data, it would be useful to have access to more metadata about each individual message than is currently supplied. The obviously missing counterparts to the ROWKEY and ROWTIME pseudo-columns we already supply are equivalents for the offset and partition number.

Example use-case for inspecting suspect data: select * from foo where partition=0 and offset between 12 and 15;

rmoff commented 6 years ago

I agree that this would be great metadata to expose.

rmoff commented 6 years ago

Could also be useful for avoiding poisoned messages on a topic?

SELECT * FROM FOO WHERE OFFSET!=42;

shankarsg commented 5 years ago

It would be helpful for finding the exact location of a specific message in a topic. The millisecond difference between ROWTIME and the message timestamp can be handled.

bentdan commented 4 years ago

I know this is an old conversation, but this would be great to have! It feels like something that should be there and is missing. It would also be very nice to be able to select the underlying topic's offset and partition in a query:

SELECT OFFSET(), PARTITION() FROM FOO;

Alternatively, the ability to add partition and offset when creating a stream or table would be another way of providing this functionality:

CREATE TABLE FOO (MYKEY VARCHAR, OFFSET(), PARTITION()) WITH (KAFKA_TOPIC='my-topic', KEY='MYKEY');

big-andy-coates commented 4 years ago

With KLIP-14 merged, this should now be achievable. It will need a KLIP, though.

nikhilkapre commented 3 years ago

Can the offset somehow be included by defining a UDF? If so, is that a good practice to follow? Also, can ROWTIME be used as an accurate alternative here?

mjsax commented 2 years ago

Resolved via https://github.com/confluentinc/ksql/blob/master/design-proposals/klip-50-partition-and-offset-in-ksqldb.md
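
Note for readers arriving from search: the use cases raised earlier in this thread can probably be expressed with the pseudo-columns that KLIP-50 describes. The sketch below assumes those pseudo-columns are named ROWPARTITION and ROWOFFSET and that your ksqlDB version includes them; the stream name foo is taken from the original example and is otherwise hypothetical. Check the linked KLIP and the documentation for your release before relying on it.

```sql
-- Assumption: the stream foo already exists and the server supports
-- the ROWPARTITION and ROWOFFSET pseudo-columns described in KLIP-50.

-- Read from the beginning of the topic so older records are visible.
SET 'auto.offset.reset' = 'earliest';

-- Inspect suspect records in partition 0, offsets 12 through 15.
SELECT ROWPARTITION, ROWOFFSET, *
  FROM foo
  WHERE ROWPARTITION = 0 AND ROWOFFSET BETWEEN 12 AND 15
  EMIT CHANGES;

-- Skip a single poisoned record at partition 0, offset 42.
SELECT *
  FROM foo
  WHERE NOT (ROWPARTITION = 0 AND ROWOFFSET = 42)
  EMIT CHANGES;
```

Offsets are only unique within a partition, so the second query filters on both columns rather than on the offset alone.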