risingwavelabs / risingwave

Best-in-class stream processing, analytics, and management. Perform continuous analytics, or build event-driven applications, real-time ETL pipelines, and feature stores in minutes. Unified streaming and batch. PostgreSQL compatible.

bug: source splits can be unevenly assigned to workers when there are too many actors #14333

Open fuyufjh opened 9 months ago

fuyufjh commented 9 months ago

> When the memory usage of the three CNs is uneven, it will OOM.

Yeah. I'd like to focus on this abnormality first. It started with reglngvty-20231228-150237, while reglngvty-20231227-150231 looked normal.

The number of actors per node is even... 🤔

The source split assignment has not been even since nightly-20231228:

```
count(source_partition_input_bytes{namespace=~"$namespace",risingwave_name=~"$instance",risingwave_component=~"$component",pod=~"$pod"}) by (pod)
```

reglngvty-20231228-150237 (nightly-20231228)

image

reglngvty-20231227-150231 (nightly-20231227)

image

Code diff: https://github.com/risingwavelabs/risingwave/compare/4695ad1239b1c160228dd7bf6f473634f57c9834...aa9dcac98985f9650595ef4b67e43d2601e41a44

Any ideas? cc. @shanicky

Originally posted by @fuyufjh in https://github.com/risingwavelabs/risingwave/issues/14324#issuecomment-1874984343

xxchan commented 9 months ago

Is it possible that this was caused by https://github.com/risingwavelabs/risingwave/pull/14170, especially the last commit?

xxchan commented 9 months ago

It seems the assignment was previously kind of random (?), but now all splits are assigned to one node.

xxchan commented 9 months ago

How many source actors will be created in this case? If there are > 24 actors on a node, I think https://github.com/risingwavelabs/risingwave/pull/14170 will lead to the problem... 🤔

xxchan commented 9 months ago

I get it. We have 3 sources, and each one has 8 partitions.

For each source, each compute node has 8 source actors because it has 8 cores, so there are 24 actors in total. Previously the splits were randomly assigned, so sometimes the distribution was relatively even (7-8-9) and sometimes uneven (3-8-13), which can OOM.

Now I added a comparison by actor id to make the assignment deterministic, so all 8 splits will be assigned to the 8 actors with the lowest actor ids (which are on the same node). This is the same for all 3 sources, so one node gets assigned all 24 splits.
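
For illustration, here is a minimal, self-contained sketch (not the actual SourceManager code; the actor-id-to-node layout is an assumption) of why an actor-id-ordered, deterministic assignment concentrates all 8 splits of a source on one node:

```rust
use std::collections::BTreeMap;

/// Hypothetical deterministic policy: sort actors by id, then hand out splits
/// in order. With more actors than splits, only the lowest-id actors get any.
fn assign_splits(mut actor_ids: Vec<u32>, splits: &[u32]) -> BTreeMap<u32, Vec<u32>> {
    actor_ids.sort_unstable();
    let mut assignment: BTreeMap<u32, Vec<u32>> = BTreeMap::new();
    for (i, split) in splits.iter().enumerate() {
        let actor = actor_ids[i % actor_ids.len()];
        assignment.entry(actor).or_default().push(*split);
    }
    assignment
}

fn main() {
    // 3 nodes * 8 actors; assume actor ids 1..=8 live on node 1, 9..=16 on
    // node 2, 17..=24 on node 3.
    let actors: Vec<u32> = (1..=24).collect();
    let splits: Vec<u32> = (0..8).collect(); // 8 partitions of one source
    // Only actors 1..=8 (all on node 1) receive a split; repeating this for
    // all 3 sources piles 24 splits onto that single node.
    println!("{:?}", assign_splits(actors, &splits));
}
```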

The previous behavior is also not ideal. How can we improve it? 🤔 Basically the problem is that SourceManager is currently not aware of the cluster.
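
As a rough sketch of one possible direction (not an agreed design; the `WorkerActor` grouping and round-robin policy here are assumptions for illustration), a worker-aware assignment could spread splits across workers first, then across actors within each worker:

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, Copy)]
struct WorkerActor {
    worker_id: u32,
    actor_id: u32,
}

/// Round-robin over workers, then over actors inside the chosen worker,
/// so 8 splits over 3 workers end up 3-3-2 instead of 8-0-0.
fn assign_splits_worker_aware(actors: &[WorkerActor], splits: &[u32]) -> BTreeMap<u32, Vec<u32>> {
    // Group actor ids by worker, keeping a deterministic order inside each group.
    let mut by_worker: BTreeMap<u32, Vec<u32>> = BTreeMap::new();
    for a in actors {
        by_worker.entry(a.worker_id).or_default().push(a.actor_id);
    }
    for ids in by_worker.values_mut() {
        ids.sort_unstable();
    }

    let workers: Vec<u32> = by_worker.keys().copied().collect();
    let mut cursors: BTreeMap<u32, usize> = BTreeMap::new();
    let mut assignment: BTreeMap<u32, Vec<u32>> = BTreeMap::new();
    for (i, split) in splits.iter().enumerate() {
        let worker = workers[i % workers.len()];
        let actors_of_worker = &by_worker[&worker];
        let cursor = cursors.entry(worker).or_insert(0);
        let actor = actors_of_worker[*cursor % actors_of_worker.len()];
        *cursor += 1;
        assignment.entry(actor).or_default().push(*split);
    }
    assignment
}

fn main() {
    // Same assumed layout as above: 3 workers with 8 actors each.
    let actors: Vec<WorkerActor> = (1..=24)
        .map(|actor_id| WorkerActor { worker_id: (actor_id - 1) / 8, actor_id })
        .collect();
    let splits: Vec<u32> = (0..8).collect();
    println!("{:?}", assign_splits_worker_aware(&actors, &splits));
}
```

With 3 workers and 8 splits per source, this spreads each source's splits 3-3-2 across the workers instead of 8-0-0.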

lmatz commented 9 months ago

https://buildkite.com/risingwave-test/longevity-test/builds/883#018ccab2-3a2b-4d12-aace-6757affb4abe

SCR-20240103-qoz

SCR-20240103-qo5

- 1 topic, 1 unified source, parallelism 3
- Create an MV for each logical source, 3 MVs in total
- 8*25 Nexmark MVs are built on top of these 3 base MVs

xxchan commented 9 months ago

Wait, if streaming_parallelism is 3, my reasoning above doesn't seem correct. 🤡

xxchan commented 9 months ago

But we do have 8 actors on a node; is there something wrong?

image
xxchan commented 9 months ago

I guess that in the benchmark script, the parallelism only affects the query MVs, not the 3 source MVs, so my reasoning still applies.

shanicky commented 9 months ago

> But we do have 8 actors on a node; is there something wrong?

image

Only actors that have been assigned splits are displayed here. If you check the actor panel, you will find that out of 96 actors, only the first 8 were assigned splits. In other words, only the first CN was assigned splits.

shanicky commented 7 months ago

Shall we close this issue as fixed? cc @xxchan

xxchan commented 7 months ago

I'm not sure. We haven't implemented rack-aware scheduling, so the problem can still happen. Do you think it's not a large concern and we won't implement it in the near future? 🤔