risingwavelabs / risingwave

Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.
https://go.risingwave.com/slack
Apache License 2.0
7.09k stars 585 forks source link

Improving usability of subscription #18208

Open hzxa21 opened 3 months ago

hzxa21 commented 3 months ago

Feel free to post more ideas under this issue.

hzxa21 commented 3 months ago

When user session is active (i.e. client-FE connection is alive), automatically retry querying log store on retryable errors, including cluster recovery, query stream timeout, transient network error between FE and CN.

Recently we found that if user declare a cursor but doesn't fetch it frequently, it may cause the query stream remain valid but unpolled for a long time, which may causes the storage epoch being pinned for a long time. We may extend the above idea to actively shutdown idle query stream and re-create one from the previous pos in log store when the cursor is fetched again.

lmatz commented 2 months ago

https://docs.risingwave.com/docs/current/subscription/#persisting-the-consumption-progress requires users to persist in the progress by themselves. We can let RW handle this internally by exposing an "ack" function to users.

In terms of what RW does underneath the "ack",

cur.execute("INSERT INTO subscription_progress (sub_name, progress)", (sub_name, progress, progress))
cur.execute("FLUSH")

I wonder if there could be cases (e.g. barriers already pile up in the system?) where "flush" takes a non-negligible period of time, which makes "exactly-once delivery" and getting the latest results at low latency via subscribe not achievable at the same time.