Least-busy invocation policy

oberstet commented 8 years ago

Implement a least-busy invocation policy: a call incoming to the router is forwarded to the callee with the least outstanding invocations.

--- Want to back this issue? **[Post a bounty on it!](https://www.bountysource.com/issues/30148702-least-busy-invocation-policy?utm_campaign=plugin&utm_content=tracker%2F462544&utm_medium=issues&utm_source=github)** We accept bounties via [Bountysource](https://www.bountysource.com/?utm_campaign=plugin&utm_content=tracker%2F462544&utm_medium=issues&utm_source=github).

oberstet commented 8 years ago

When there are multiple callees with a least number of outstanding invocations, a tie breaker is needed. This could be

random
round-robin

If this is to be exposed, we could have:

leastbusy_random
leastbusy_roundrobin

oberstet commented 8 years ago

Regarding mode naming: we probably want to have yet another mode: "quickest reaction time".

Above "least busy" is for maximizing throughput / for situations where the numbers of calls is large. But imagine callees of different speed, a user might want to direct calls to the "quickest callees" first ..

oberstet commented 8 years ago

After thinking about a little, I think it makes more sense to stick with a single leastbusy policy and leave the tie-breaking mechanism as an implementation detail.

markope commented 8 years ago

+1 to your last comment

meejah commented 8 years ago

If we tracked the lastest response-time for each observer, this could also provide useful metrics. We could the store observers in some sorted structure (e.g. heapq sorted by latest-response-time) and just select the front of the list every time.

The resulting behavior would basically just pound the fastest component until it started to get slow ... which I guess is kind of the point? (That's just my hand-wave-y guess though :) )

Keeping the observers sorted might be more pain than it's worth though; O(n) isn't that bad if n is usually only "a few"...

oberstet commented 8 years ago

@meejah

1) Rgd O(n) .. yeah, agreed. Linear search is likely faster for small n. Sharing registrations between 1000s of callees is an unlikely use case. A hundred callees should be possible though.

2) Tracking response times: definitely valuable in general!

3) heapq: Yeah, I was also thinking about some priority queue .. things is, if the value sorted by would be the number of outstanding invocations, that obviously changes.

The Py docs have this to say: https://docs.python.org/2/library/heapq.html#priority-queue-implementation-notes

"If the priority of a task changes, how do you move it to a new position in the heap?"

That's exactly my question;) Where is the answer? Does heapq support changing prios without removing elements?

oberstet commented 8 years ago

I think both invocation policies would make sense:

leastbusy => direct invocation to callee with the least number of outstanding invocations
quickestresponse => direct invocation to callee with "quickest response"

With the latter, the question is: what is the quickest response. Only the last one? Average over last N?

mfojtak commented 8 years ago

There are so many metrics that might be taken into account in load balancing scenarios. It is also very application specific. Therefore I suggest the following: Custom Invocation Policy - the decision of which callee to pick would be delegated to a registered procedure that accepts a list of callees(session id's) and returns one of them. Such a procedure would implement a user defined decision algorithm. For example - in my application, all nodes register a procedure - cpuUsage. Now I would like to query this value in my algorithm and base my decision on it. This would not be possible with "hardcoded" invocation policies. You just can't make them generic enough.

sehoffmann commented 8 years ago

Definitely agree with @mfojtak. I actually wanted to suggest something similiar on my own. I guess the best solution would be to have crossbar come with some basic "stock" algorithms which provide you with a simple solution and the option to customize the behaviour for more sophisticated load balancing.

It's just important IMO, that such code for customization actually runs on the router itself. Doing an extra roundtrip per CALL does'nt make too much sense for this scenario.

One last thing: I already mentioned this a bit in the issue regarding INVOKE-ALL. An extra loadbalancing feature is IMO not even neccessary if we have INVOKE-ALL with calle black/whitelisting. It can actually be implemented with that feature alone. For this you just run an embedded component on your router which exposes the interface to be called by peers. If we get called, we actually issue an INVOKE-ALL call with only single peer whitelisted, determined by our loadbalancing method. So INVOKE-ALL would actually just be used in order to be able to register multiple endpoints on the same URI.

Depending on how INVOKE-ALL gets implemented, we could maybe even call e.g two peers (or all) at the same time and just use whatever result we get back first. Now that's what I call a real quickestresponse, even if extremly inefficient. But hey, if you are in a hurry? ¯_(ツ)_/¯

oberstet commented 8 years ago

For this feature, the load-balancing decision will be made by the router in a generic way.

The router has two metrics about a callee it can observe from the outside: the number of outstanding invocations, and the response times of invocations.

In the future, we might add the option to have a callee return additional load information in the return from an invocation. This might then be used by the router to adjust the load balancing.

There is no callee black-/whitelisting, and even if there was, it wouldn't make sense to implement load-balancing on top of that. That would put balancing logic into the caller - and the caller should not be aware of that or coupled to this aspect.

The invoke-all policy needs to resolve another question: are we going to do any kind of "aggregation" of results in the router (practically, that could only mean creating a list of individual results) or are we using progressive results to forward to the caller?

sehoffmann commented 8 years ago

Unrelated sidenote regarding INVOKE-ALL: I can just repeat myself: I strongly opt for calle black/whitelisting for INVOKE-ALL. It's useful and consistent in design with subscriber black/whitelisting.

oberstet commented 8 years ago

callee black/whitelisting being useful: for what?

sehoffmann commented 8 years ago

I've already posted that in the original issue regarding INVOKE-ALL but I don't mind repeating it:

What we would probably use it for would be something which you could call telemetric. For example, lets imagine that we are interessted in the cpu usage of a group of peers (e.g backend servers).

So, INVOKE-ALL is clearly really convenient for that. Every server registers a backend.get_cpu_usage endpoint and if we want to query it, to e.g display it on a website, we just call backend.get_cpu_usage and aggregate all results. But what if we have many many servers (lets say 100+) and our website has(/wants) to use pagination? Querying the cpu usage of every server makes no sense at all and just produces unneccessary traffic if we only display 20 per page anyway. Likewise, it doesn't make sense to query/call every server, if we are only interessted in the cpu usage of a single server, to e.g display it on a details page. That's where calle whitelisting comes in handy.

mfojtak commented 8 years ago

I am still convinced that a load ballancing algorithm should be part of application business logic. Even if you implement dozen of various invocation policies, none of them might fit your application needs.

Actually, I am also not convinced that it is a good idea to have multi registrations at all. It brings a little bit of chaos and inconvenience into the otherwise brilliant Wamp protocol.

What my application does is that my "wokers" follow a ceratin uri pattern. My load ballancing procedure then uses those registrations and query them for cpu usage. It then picks one of the "workers" (or registrations) and makes the call to the least busy one. It gives me a flexibility to implement my own load ballancing algorithm.

If your application also utilizes meta functionality (define/describe) then it's even better and cleaner. Your load ballancing "head" procedure might query router to get descriptions of all procedures. The procedures that "implement" the fuctionality that I want to load ballance would be taken into account by the load ballancing algorithm.

Btw., in the multi registration scenario - how does the caller makes a call to the concrete callee? It is not possible I guess, which brings an inconsistency into the protocol in my opinion.

sehoffmann commented 8 years ago

@mfojtak Regarding INVOKE-ALL: This has actually arisen from an architectual issue: https://github.com/crossbario/crossbar/issues/479 , besides that there are also many usecases for this on application level as well IMO. As I already stated above, it should definitely be possible to select the desired callees, that's why I am advocating callee black/whitelisting, which could be designed in a similiar fashion to subscriber black/whitelisting, so much.

I definitely agree with you on the point that the application should be able to control loadbalancing on its own, howsoever that's realized. I completely share your opinion regarding that. Although, to have some basic load-balancing out of the box, is definitely nice as well. I don't think that these two features are conflicting with eachother.

mfojtak commented 8 years ago

https://github.com/Paranaix I agree that the invocation policies don't conflict with anything. If the crossbar team is eager on investing an effort into it then why not.

My recommendation states though - don't use multi registrations. Why? Because you register procedures that are not addressable. You cannot call them. What does it do with those things like procedure.on_registered? What was registered? What's the uri? Define/describe - how can I use it if I don't have unique uri? And I believe that if I went more into the detail I would find more inconsistencies caused by multi registrations. Uri should be unique in my opinion.

Btw., how is your black/white listing working? So if I have multi registration - I blacklist all callees but one which I want to call? I blacklist list them for all possible callers? That probably wouldn't work. But I guess that I am missing details of your proposal. Shed some light if you can, please.

sehoffmann commented 8 years ago

@mfojtak To clarify: With non-conflicting i was actually refering to custom loadbalancing vs out-of-the-box loadbalancing.

I currently fail to see your concerns tbh. Atleast how I imagine INVOKE-ALL it's basically pub-sub with additional return values. As far as I know however, there hasn't been made any progress on this yet (not completely sure about this) so this all has yet to be specced out. Regarding how I imagine how you would use it: You would call it like any other endpoint as well but instead of a single result you would get back an aggregated list/map of results. Atleast that would be one possibility. The other option would be to progessively receive results from multiple callees. Apparently both are useful.

Black/Whitelisting would be identical to pub-sub, so e.g: session.call_all(u"some.uri", someArg, options=CallAllOptions(eligble=[12412, 34172])) would only invoke callees 12412 and 34172; and session.call_all(u"some.uri", someArg, options=CallAllOptions(exclude=[12412, 34172])) would call every registered endpoint except those two.

An outstanding question is whether to design this as a seperate feature (e.g as indicated above) or to just extend the functionality of REGISTER and CALL.

mfojtak commented 8 years ago

@Paranaix Regarding the description of your proposal

Instead of session.call_all(u"some.uri", someArg, options=CallAllOptions(eligble=[12412, 34172]))

Isn't it better to do:

foreach(callee c in [12412, 34172])
{
      c.call("some.uri", args);
}

The problem is that you cannot make a call to a concrete callee having it's session id. The wamp call operation accepts only method uri. Which is not unique and therefore, the concrete callee is not callable. Solution is to refine the wamp call operation and allow the caller to specify session id via Options element. So instead of:

[48, 7814135, {}, "com.myapp.ping"]

It would be: [48, 7814135, {"sessionId": 123456}, "com.myapp.ping"]

That would give users a chance to call a concrete callee in multi-registration scenario. Which at the end of a day implicates possibility for custom invocation aka load ballancer aka head procedure, whatever you want to call it.

But I would still insist on revision of the multi-registration concept in general. It looks to me at this point that multi registrations violate the elegance of wamp protocol and bring unnecessary complexity and problems. You may want to add a new handy feature in wamp and you will find out that bringing it in is much more complicated because you have to take multi-reg into account.

I am still failing to find at least one example use case for multi registrations. You can still achieve whatever you want if you keep you registration uris unique.

Multi registrations look more pain than gain to me at this point.

sehoffmann commented 8 years ago

You realize that there is no difference between what you suggested and what i proposed except that I would transmit a list and you a single value? Also notice, that the code i wrote above is consistent/identical to how pub-sub whitelisting is currently implemented. Even more importantly, your code doesn't even work (good luck trying to call a method on an integer). I don't really see the point of this.

mfojtak commented 8 years ago

The foreach statement is just a pseudo code - not meant to be treated literally. The point of this is that you suggest a batch when the single atomic call cannot be made yet. I am proposing a real world enhancement on how to make a call at least to one callee in multi reg scenario. How does your construct translate into the protocol? Also, what is the point of the exclude thing. If I don't want to call a procedure, I just don't make that call. Besides, there is a batch transport mechanism in wamp already.

Anyway, we are getting slightly off topic.

It looks to me that multi-registrations is a crossbar specific thing. I believe that load ballancing, failover or hot standby scenarios can be implemented without using multi-reg.

Only wanted to suggest that implementing custom invocation policy mechanism once might be more useful than keep implementing dozens of predefined ones.

sehoffmann commented 8 years ago

No sorry, I still don't see your point. What's preventing you from only whitelisting a single peer?

Yes, you don't need this for loadbalancing. I was just suggesting that it would be possible to realize loadbalancing with it though. It's a general-purpose solution to a specific problem. By no mean, I want to advocate this as the holy solution for everything, I just wanted to state that it is a possibility. In the end it's a question on how much time you want to invest in what features.

oberstet commented 8 years ago

I think there might be some confusion;)

First, one can't implement what Crossbar.io provides with shared registrations using client-side means - even if there would be a way to issue a call to a specific callee session. One reason is that only the router has a complete view of the whole system - that is all callees that registered a procedure. Another one is: putting LB logic into the client is wrong IMO. It introduces strong coupling.

I also think what @Paranaix probably has in mind is about symmetry: having black-/whitelisting for RPC too, as there is for PubSub.

A prerequisite for this to be meaningful is the "all" call policy. Since if there is only exactly one callee being invoked for a given call (like it's the case with shared registrations), there is little point in black-/whitelisting this single callee. On the other hand, if a call is forwarded to "all" callees, then black-/whitelisting would be meaningful. However, I am not convinced it would be useful too. But I might change my mind about that;) But:

The "all" call policy in turn is a piece in a bigger picture: "sharded registrations / calls". Note the difference between "shared" (a single call is routed to exactly one callee - the policy determines which one) versus "sharded" (a single call is routed to multiple callees in parallel - the "all" policy, or a single call is routed to the callee that maintains the data partition / shard that should be effected by the call).

All of this is only slightly related to the issue at hand: having a "least busy" invocation policy for "shared registrations". This feature is a straight forward extension of what we already have.

And the need arose from a concrete application that is using shared registrations. That same application triggered https://github.com/crossbario/crossbar/issues/621 as well. And that is actually the more pressing issue.

crossbario / crossbar

Least-busy invocation policy #606