apache / pinot

Apache Pinot - A realtime distributed OLAP datastore
https://pinot.apache.org/
Apache License 2.0

[multistage] decouple plan/runtime API abstracts. #10657

Open walterddr opened 1 year ago

walterddr commented 1 year ago

Background

Currently, multiple abstractions are reused across different components in the planner and the runtime, which causes several problems.

These are related to the partition strategy, worker assignment, and mailbox/pipeline-breaker efforts.

Proposed changes

Several new abstractions are being introduced to replace the current ones.

  1. Step 1a: replace VirtualServer. VirtualServer is currently a ServerInstance + VirtualID. It will be replaced with Worker, which represents a unit of parallelism. A Worker: (1) is globally indexed per stage; (2) is mapped to a single ServerInstance stored in StageMetadata; (3) contains partition or segment info, which will be put into a new abstraction called WorkerMetadata.

With this, VirtualServer is completely removed, and ServerInstance (which is not useful at runtime) is decoupled from VirtualID/workerID (which is used at runtime).
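To make the Step 1a shape concrete, here is a minimal sketch of how the three pieces could relate. Everything in it is an assumption for illustration: the class names carry a `Sketch` suffix precisely because they are not the actual Pinot classes, and the fields (host/port, a list of segment names) are placeholders for whatever the real POJOs end up holding.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: every class name and field here is hypothetical,
// not the actual Pinot API.

/** Physical server; relevant to planning and dispatch, not to runtime execution. */
final class ServerInstanceSketch {
  final String host;
  final int port;

  ServerInstanceSketch(String host, int port) {
    this.host = host;
    this.port = port;
  }
}

/** Partition or segment info owned by one worker. */
final class WorkerMetadataSketch {
  final int workerId;
  final List<String> segments;

  WorkerMetadataSketch(int workerId, List<String> segments) {
    this.workerId = workerId;
    this.segments = List.copyOf(segments); // defensive, immutable copy
  }
}

/** Per-stage view: workers are indexed 0..n-1 within the stage. */
final class StageMetadataSketch {
  // Both lists are indexed by worker ID.
  private final List<ServerInstanceSketch> servers = new ArrayList<>();
  private final List<WorkerMetadataSketch> workers = new ArrayList<>();

  /** Registers a worker on a server and returns its stage-wide worker ID. */
  int addWorker(ServerInstanceSketch server, List<String> segments) {
    int workerId = workers.size();
    servers.add(server);
    workers.add(new WorkerMetadataSketch(workerId, segments));
    return workerId;
  }

  /** Each worker maps to exactly one server; a server may host many workers. */
  ServerInstanceSketch serverOf(int workerId) {
    return servers.get(workerId);
  }

  WorkerMetadataSketch metadataOf(int workerId) {
    return workers.get(workerId);
  }

  int workerCount() {
    return workers.size();
  }
}
```

In this sketch, registering two workers on the same `ServerInstanceSketch` yields worker IDs 0 and 1 within the stage, so runtime code can address work by worker ID alone while the server mapping stays in the stage metadata.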

API/Class abstraction definitions

Here is a global view of what we need in terms of primitives:

[Diagram: Primitives]

The yellow objects with POJO definitions are the ones we plan to introduce, specifically:

Broker

Server

CC @Jackie-Jiang @xiangfu0 @ankitsultana @somandal @siddharthteotia

siddharthteotia commented 1 year ago

Nice. Will review

FYI @vvivekiyer

walterddr commented 1 year ago

Steps 1a and 1b are done via (#10665, #10673) and (#10681, #10669).

walterddr commented 1 year ago

Collecting some feedback from folks; please feel free to comment below as well: