CentaurusInfra / global-resource-service

Apache License 2.0

Make resource allocation (total number of machines) exactly match the requested number. #82

Open q131172019 opened 2 years ago

q131172019 commented 2 years ago

In a test with 500K nodes / 2 regions / 20 schedulers / 25K nodes per scheduler, each of the first 19 schedulers is successfully allocated slightly more than the 25K machines it requested because of allocation overhead, so fewer than 25K nodes remain. As a result, the 20th scheduler cannot be allocated its requested 25K machines and registration fails with "Not enough hosts":

I0711 18:53:40.867457 18611 installer.go:48] handle client registration
E0711 18:53:40.869402 18611 distributor.go:66] Error allocate resource for client. Error Not enough hosts
I0711 18:53:40.869424 18611 installer.go:79] error register client. error Not enough hosts

yb01 commented 2 years ago

Thanks for filing this issue to track the problem. It happens because the distributor allocates machines in slices, so a client does not get exactly the number of machines it requested. It gets slightly more, by up to the size of the slices being allocated to the client, which can be ~30 or so.
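
The effect can be reproduced with a minimal sketch. This is not the actual distributor code: the `pool` type, the `allocate` method, and the fixed slice size of 30 are all assumptions made for illustration (real slices may vary in size). It only shows how rounding each request up to whole slices can leave the last client short.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotEnoughHosts mirrors the message seen in the log; the name is hypothetical.
var errNotEnoughHosts = errors.New("Not enough hosts")

// pool is a hypothetical stand-in for the distributor's free capacity.
type pool struct {
	free      int // hosts not yet allocated
	sliceSize int // assumed fixed slice size; real slices may differ
}

// allocate hands out whole slices until the request is covered, so the
// grant is the request rounded up to a multiple of sliceSize.
func (p *pool) allocate(requested int) (int, error) {
	slices := (requested + p.sliceSize - 1) / p.sliceSize // ceiling division
	granted := slices * p.sliceSize
	if granted > p.free {
		return 0, errNotEnoughHosts
	}
	p.free -= granted
	return granted, nil
}

func main() {
	// 500K hosts, 20 schedulers, 25K requested per scheduler.
	p := &pool{free: 500000, sliceSize: 30}
	for i := 1; i <= 20; i++ {
		granted, err := p.allocate(25000)
		if err != nil {
			fmt.Printf("scheduler %d: %v (free: %d)\n", i, err, p.free)
			continue
		}
		fmt.Printf("scheduler %d: granted %d (free: %d)\n", i, granted, p.free)
	}
}
```

Under these assumptions each of the first 19 schedulers is granted 25,020 hosts, which leaves 24,620 free, so the 20th request for 25,000 fails.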

yb01 commented 2 years ago

  1. First, assume no new nodes or new RPs are added to a region.
  2. Then, add a new RP to a region.
  3. Assume no node changes within an RP.

yb01 commented 2 years ago

Evaluate the cost of the first step for 930.