Open gavin-norman-sociomantic opened 7 years ago
Some of the relevant papers:
http://research.ijcaonline.org/volume111/number5/pxc3901190.pdf https://pdfs.semanticscholar.org/efef/8122249982d7990c0dde29eb9f147498cb01.pdf http://infocom2004.ieee-infocom.org/Papers/46_4.PDF http://airccse.org/journal/ijdps/papers/4313ijdps02.pdf http://ijssst.info/Vol-03/No-1&2/Kabalan.pdf
With a plain round-robin algorithm, if one node is responding slowly, pending requests can stack up in the client. A common pattern, in this situation, would be for the client to detect this and suspend its input (a stream from a DMQ, for example).
One technique which could improve this situation is to have each node selected by the round-robin algorithm based on some weighting or priority. When a connection responds slowly, its weighting is reduced, meaning that less requests will be sent to it. (If all connections respond slowly, all of their weightings will go down, meaning that the balance of requests won't change.)
How exactly the client should measure the responsiveness of each connection is an open question.
This is not as simple as it seems, however. We also need to take into account the side-effects of such user-facing load-balancing. One consequence of more requests being sent to certain nodes is that those nodes may end up storing more data than others, which is not ideal and leads to further imbalance (e.g. data returned to query requests). This will require more thought.