risingwavelabs / risingwave

Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.
https://go.risingwave.com/slack
Apache License 2.0

Discussion: making streaming parallelism an alterable property of a materialized view or a sink #9504

Open lmatz opened 1 year ago

lmatz commented 1 year ago

So that users can adjust the parallelism with an ALTER statement in SQL.

This is a coarse-grained adjustment, less flexible than what risectl can do, e.g. risectl meta reschedule "<fragment>-[<parallel unit>]+[<parallel unit>]".
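For readers unfamiliar with that plan string, here is a minimal Python sketch of how the shape quoted above could be parsed. It is an illustration only: the regular expression, the `FragmentPlan` type, and `parse_plan` are hypothetical names, and the sketch assumes integer ids separated by commas, which the comment above does not spell out. It is not taken from risectl's source.

```python
import re
from dataclasses import dataclass, field

# Hypothetical pattern for "<fragment>-[<parallel unit>]+[<parallel unit>]".
# Assumption: ids are integers and each bracketed list is comma-separated.
PLAN_RE = re.compile(
    r"^(\d+)(?:-\[(\d+(?:,\d+)*)\])?(?:\+\[(\d+(?:,\d+)*)\])?$"
)


@dataclass
class FragmentPlan:
    fragment_id: int
    removed: list[int] = field(default_factory=list)  # parallel units to drop
    added: list[int] = field(default_factory=list)    # parallel units to add


def _to_ids(group):
    """Turn '1,2,3' into [1, 2, 3]; a missing group becomes an empty list."""
    return [int(x) for x in group.split(",")] if group else []


def parse_plan(text: str) -> FragmentPlan:
    """Parse one per-fragment plan string into its removed/added unit lists."""
    match = PLAN_RE.match(text.strip())
    if match is None:
        raise ValueError(f"malformed plan: {text!r}")
    fragment, removed, added = match.groups()
    return FragmentPlan(int(fragment), _to_ids(removed), _to_ids(added))


# Example: fragment 100 gives up units 1, 2, 3 and gains units 4, 5.
plan = parse_plan("100-[1,2,3]+[4,5]")
```

The point of the sketch is the granularity: every fragment gets its own list of removed and added parallel units, which is exactly the level of detail an ALTER statement would hide.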

But it gives users a way to do so without relying on RW Cloud, risectl, or the dashboard, and all in SQL. I guess a better, finer-grained strategy may take some time to become reliable.

shanicky commented 1 year ago

FYI: Citus uses this method to rebalance shards: https://docs.citusdata.com/en/v9.1/develop/api_udf.html#rebalance-table-shards. So does Azure Cosmos DB for PostgreSQL: https://learn.microsoft.com/en-us/azure/cosmos-db/postgresql/howto-scale-rebalance

BugenZhao commented 1 year ago

Altering the parallelism through the SQL interface LGTM. A related piece of work might be #8803.

However, IMO, ALTER PARALLELISM should be treated as a wrapper around the high-level scaling interface proposed by @shanicky, which hides some details of the low-level parallel-unit interface. That is to say, the underlying scheduling state should still be the "parallel-unit matrix", where different fragments may have different parallelisms. So I'm afraid it's not possible to simply have a physical property named "parallelism" for streaming jobs that users can easily understand or fully manage themselves.

lmatz commented 1 year ago

I don't totally get it. "set streaming_parallelism to X" is already exposed to and adjusted by users, and it's a good, simple abstraction.

We can still let the ALTER statement follow this abstraction, even if it sacrifices some degree of freedom. (Am I interpreting the "parallel-unit matrix", where different fragments may have different parallelisms, correctly?)

BugenZhao commented 1 year ago

> We can still let alter statement follow this abstraction

Agreed. My idea is that "making streaming parallelism an alterable property" looks good if it's only a user-facing abstraction. For the internal implementation, we'll still maintain the complicated "matrix" as the state. 🫡
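A rough Python sketch of that split, purely for illustration: the internal state is a per-fragment set of parallel units (the "matrix"), and the user-facing "parallelism" is a thin wrapper over it. All names here (`StreamingJob`, `alter_parallelism`, `parallelism`) are hypothetical and not RisingWave APIs. The sketch also assumes every fragment can take the same set of units, which ignores constraints such as singleton fragments.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StreamingJob:
    # Internal state: fragment id -> ids of the parallel units it runs on.
    fragments: dict[int, set[int]] = field(default_factory=dict)

    def parallelism(self) -> Optional[int]:
        """Single user-facing number, or None when fragments disagree."""
        sizes = {len(units) for units in self.fragments.values()}
        return sizes.pop() if len(sizes) == 1 else None

    def alter_parallelism(self, target: int, available: list[int]) -> None:
        """Coarse-grained change: place every fragment on the same first
        `target` units of `available` (a simplifying assumption)."""
        if target < 1 or target > len(available):
            raise ValueError("target must be between 1 and len(available)")
        chosen = set(available[:target])
        for fragment_id in self.fragments:
            # Copy so fragments do not share one mutable set.
            self.fragments[fragment_id] = set(chosen)


# Two fragments with different parallelisms: no single number describes the job.
job = StreamingJob({1: {0, 1, 2, 3}, 2: {0}})
before = job.parallelism()

# The coarse-grained wrapper makes the matrix uniform again.
job.alter_parallelism(2, available=[0, 1, 2, 3])
after = job.parallelism()
```

Under these assumptions the wrapper can always write a uniform state, but reading back a single "parallelism" only works while the matrix happens to be uniform, which is why the matrix stays the source of truth.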

fuyufjh commented 1 year ago

I think this has been done with risectl commands?
