confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io
Other
127 stars 1.04k forks source link

Limit and Offset and in PULL queries #7790

Open daBlesr opened 3 years ago

daBlesr commented 3 years ago

Is your feature request related to a problem? Please describe. I have a table with the latest info on devices in ksqlDB, and I want to get this info displayed in a paginated list on a web page. Pagination of pull queries does not work if there is no option to provide limits and offsets for table scans.

Describe the solution you'd like Be able to add LIMIT and OFFSET to pull queries.

Describe alternatives you've considered Not adding limit and offset gives me all of the devices, which are too many. Push queries do not really work, because I want to see information on non-responsive devices as well.

Additional context

SELECT * FROM QUERYABLE_DEVICES WHERE VERSION = 'x.y.z' LIMIT 20 OFFSET 30;
agavra commented 3 years ago

Thanks for the ticket @daBlesr - Can you describe what you meant by OFFSET? When there are multiple partitions, the semantics of specifying an OFFSET isn't clear to me.

Linking the LIMIT specific ticket: #6962

daBlesr commented 3 years ago

I am not quite at the level yet to understand how Pull queries are dealing with multiple partitions, but I'd figure that if the pull query gives results ordered by some consistent order (it does on my machine), the offset is just the n'th row in that consistent resultset.

agavra commented 3 years ago

Consistent ordering is an interesting point for us to consider. I believe at the moment there are no guarantees (esepcially when multiple partitions are there) but that might be something that we'd want to implement so pagination can be more efficient.

Also I had originally misinterpreted your question as asking to expose Kafka's offset in pull queries, but after reading it again it's clear you meant SQL's concept of OFFSET! Ooops 😬

\cc @AlanConfluent @vvcephei

nkonev commented 2 years ago

https://github.com/confluentinc/ksql/pull/8333

lteacher commented 2 years ago

HI, thanks for linking the PR above. However when attempting to use this feature in ksqldb I get the following error still: "Pull queries don't support LIMIT clauses. See https://cnfl.io/queries for more info. Add EMIT CHANGES if you intended to issue a push query."

Unfortunately I had issues trying to marry up versions and releases with the source here vs the actual client / server combinations etc. For instance the ksqldb client / server combination running for me is 0.23.1

Am I missing something and this feature is supposed to be working somehow etc? Iam using the /query-stream endpoint usually but the same message occurs in the command line client.

@cprasad1

cprasad1 commented 2 years ago

@lteacher LIMIT support for pull queries will be rolled out with the 0.24 release :)

bmmptlgc commented 11 months ago

I have a use case for Offset. I am publishing poison messages to a dead letter compacted topic. I created a KTable from that topic to represent the state of any given poison message (pending/republished). A backend service exposes that table to a UI, offering some filtering capabilities and the ability to republish messages to their original topics.

The mechanism is working. When a poison message is republished, the corresponding record on the KTable is updated to republished and will no longer show up on the ui.

We could have thousands of messages than need to be looked at and eventually republished. We need the ability to implement paging on the ui and the backend service. Right now I can limit the number of records. Say I limit to 30 records, but there are an additional 50 that satisfy the filter. If I could use Offset, I could ask the KTable for the next 30 records, starting at the offset 30, to get the second page and then the next 30, starting at the offset 60 (which would return the last 20 records, the last page).

Taymindis commented 19 hours ago

This issue has been fixed by KSQLDB-JFG with limit and token