apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

[multistage] [bug] Sort operator issue when # of pinot servers is greater 1 #9717

Open 61yao opened 1 year ago

61yao commented 1 year ago

The StagePlan is sent to all available intermediate-stage servers. When the number of intermediate-stage servers is greater than 1, only the first server receives data from the leaf servers.

The other servers spin in their operators, waiting on a mailbox that never receives data.

Correctness: if any server other than the first returns first, we will just return an error.

Wasted CPU: all servers except one spin the CPU waiting for something they don't need. With the non-blocking fix (https://github.com/apache/pinot/issues/9615#issuecomment-1282911839), this will become a memory “leak” problem.

Hot spot: the first server gets far more load than the other servers.
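
To make the symptoms above concrete, here is a toy model of the dispatch pattern. It is not Pinot code: `simulate`, the server names, and the block counts are all made up for illustration, and it assumes only what is described above (the plan goes to every intermediate server, the leaf data goes to the first one).

```python
def simulate(stage_servers, leaf_blocks):
    """Toy model: count how many leaf blocks each intermediate server receives."""
    # Assumption: the stage plan is dispatched to every intermediate server,
    # so each one opens a mailbox and starts waiting.
    received = {server: 0 for server in stage_servers}

    # Assumption: the leaf stage sends all of its blocks to the first server only.
    receiver = stage_servers[0]
    received[receiver] += len(leaf_blocks)

    # Servers that hold a plan but never get data are the ones left spinning.
    starved = [server for server, count in received.items() if count == 0]
    return received, starved


servers = ["server-0", "server-1", "server-2"]  # hypothetical names
received, starved = simulate(servers, leaf_blocks=["b0", "b1", "b2", "b3"])
print(received)
print(starved)
```

In this model `server-0` takes every block (the hot spot), while `server-1` and `server-2` hold a plan and wait on mailboxes that stay empty (the wasted CPU, or retained memory once waiting is non-blocking).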

Should be related to https://github.com/apache/pinot/issues/9611

61yao commented 1 year ago

I double-checked the code. There may or may not be a correctness issue; I am not sure, and we should test it. My concern is more about the singleton exchange. It turns out that is left join, not sort. The hot-spot issue seems bad because SortOperator sends its result to the same node. @walterddr