confluentinc / ksql

The database purpose-built for stream processing applications.
https://ksqldb.io

Pull queries: Support columns other than ROWKEY in WHERE clause #4217

Closed vpapavas closed 3 years ago

vpapavas commented 4 years ago

Is your feature request related to a problem? Please describe.

Consider a table created with a GROUP BY clause containing multiple columns, like so:

CREATE TABLE pageviews_table AS
  SELECT viewtime, userid, pageid, COUNT(*) AS TOTAL
  FROM pageviews_original
  GROUP BY viewtime, userid, pageid;

Now, if we want to issue a pull query to select a specific row, we have to provide the ROWKEY in the WHERE clause as a concatenated string consisting of all the grouping keys like so:

SELECT viewtime, userid, pageid 
FROM pageviews_table 
WHERE ROWKEY = '1557183930687|+|User_9|+|Page_34';
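For illustration, a client currently has to assemble that key itself. The following is a minimal Python sketch, assuming the `|+|` delimiter seen in the example above is what the server uses for multi-column GROUP BY keys; the helper names `build_rowkey` and `build_pull_query` are hypothetical and not part of any ksql client API.

```python
# Delimiter taken from the ROWKEY example in this issue; treat it as an
# assumption, since it is an internal detail that may differ between versions.
KEY_DELIMITER = "|+|"


def build_rowkey(*group_by_values):
    """Join the GROUP BY column values, in order, into one composite key string."""
    return KEY_DELIMITER.join(str(value) for value in group_by_values)


def build_pull_query(rowkey):
    """Build the pull query text for pageviews_table using the composite ROWKEY."""
    # Double any single quotes so the key is a valid SQL string literal.
    escaped = rowkey.replace("'", "''")
    return (
        "SELECT viewtime, userid, pageid "
        "FROM pageviews_table "
        f"WHERE ROWKEY = '{escaped}';"
    )


rowkey = build_rowkey(1557183930687, "User_9", "Page_34")
print(rowkey)
print(build_pull_query(rowkey))
```

The caller has to know both the delimiter and the exact order of the grouping columns, which is the coupling this request aims to remove.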

Describe the solution you'd like

Support pull queries that allow multiple key columns in the WHERE clause, like so:

SELECT viewtime, userid, pageid 
FROM pageviews_table 
WHERE viewtime = 1557183930687 AND userid = 'User_9' AND pageid = 'Page_34';

apurvam commented 4 years ago

related: #3584

vickyshah129 commented 4 years ago

We have a use case for this in our application, where Kafka is the hub for all information flowing between services. Our application is a real-time monitoring system that stores data from different collectors, and this data is immutable. We currently save all of the data in Cassandra as well. In my view, Cassandra is overhead here because it is just a replica of data that is already in Kafka. We should be able to query our data with ksql based on columns other than ROWKEY. For instance, I want to fetch all the data for a specific month, day, or year. I believe this feature is vital for cases where the data must not be modified and we only want to go back in time to fetch different data.

agavra commented 3 years ago

I believe this has been implemented - @AlanConfluent can you confirm this?

AlanConfluent commented 3 years ago

That's right. Between #6814 and #6939, we should be able to support columns other than ROWKEY. We can close this once the latter is merged.