Azure / azure-relay

☁️ Azure Relay service issue tracking and samples
https://azure.microsoft.com/services/service-bus
MIT License

Scale #10

Open jtaubensee opened 7 years ago

jtaubensee commented 7 years ago

@dstuckims - Can you provide some bullet points as an overview of what should be covered?

jtaubensee commented 7 years ago

Maybe this is more of a load balancing article... We should mention that you can have up to 25 listeners, and here is how load balancing works:

When a rendezvous needs to occur (a new client connection is created/opened):

  1. Get a local copy of the list of all the known load-balanced listeners for the address requested by the sender (this comes from a cache which is updated every 500ms).
  2. If the list of listeners is empty and we have not yet forced a refresh, force a refresh (at most once) of the list of known load-balanced listeners for the endpoint.
  3. If the list of listeners is still empty, return an exception to the sender and stop.
  4. Pick a random index into the list of potential listeners.
  5. Try to rendezvous with the selected listener.
  6. If that rendezvous succeeds, stop.
  7. If the rendezvous attempt with the selected listener doesn’t succeed within 10 seconds, remove the selected listener from the list of listeners to try.
  8. If more than 60 seconds have passed, return an exception to the sender.
  9. Go to step 2.

Ultimately it will try the list of known listeners twice before giving up. The next listener to attempt is picked using a random index into the list.
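
To make the steps above easier to discuss, here is a minimal Python sketch of the loop as described. It is not the service's actual code: `rendezvous`, `RendezvousError`, and the three callables (`get_cached_listeners`, `refresh_listeners`, `try_listener`) are hypothetical names invented for illustration. Only the 10 second and 60 second limits, the single forced refresh, and the random pick come from the description.

```python
import random
import time
from typing import Callable, List, Optional


class RendezvousError(Exception):
    """Hypothetical error returned to the sender when no listener accepts."""


def rendezvous(
    get_cached_listeners: Callable[[], List[str]],   # hypothetical: reads the cached list
    refresh_listeners: Callable[[], List[str]],      # hypothetical: forces a refresh
    try_listener: Callable[[str, float], bool],      # hypothetical: True if rendezvous succeeds
    rng: Optional[random.Random] = None,
    clock: Callable[[], float] = time.monotonic,
    attempt_timeout: float = 10.0,                   # per-listener limit from step 7
    overall_timeout: float = 60.0,                   # overall limit from step 8
) -> str:
    """Return the listener that accepted, or raise RendezvousError."""
    rng = rng or random.Random()
    start = clock()
    listeners = list(get_cached_listeners())         # step 1: local copy
    refreshed = False
    while True:
        if not listeners and not refreshed:          # step 2: refresh at most once
            listeners = list(refresh_listeners())
            refreshed = True
        if not listeners:                            # step 3: nothing left to try
            raise RendezvousError("no listeners available")
        index = rng.randrange(len(listeners))        # step 4: random index
        candidate = listeners[index]
        if try_listener(candidate, attempt_timeout): # steps 5 and 6
            return candidate
        listeners.pop(index)                         # step 7: drop the failed listener
        if clock() - start > overall_timeout:        # step 8: overall deadline
            raise RendezvousError("rendezvous timed out")
        # step 9: loop back to step 2
```

In this sketch the "tries the list twice" behavior falls out of the single forced refresh: once every cached listener has failed and been removed, the list is refilled once, and a second empty list ends the attempt with an exception.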

@sethmanheim - Do you think that this can be distilled into an article?

sethmanheim commented 7 years ago

@jtaubensee Possibly, but this list is pretty high level. This kind of sounds like our internal implementation, is that really what we want to document? From a user perspective, what does this mean? Are there other load balancing considerations? Is it worth including sample code?

Also, when you say "this is how load balancing works," is this for hybrid connections, WCF Relay, or anything with relays?

What happens when it "gives up" (last paragraph)? :-) An exception?

sethmanheim commented 7 years ago

https://azure.microsoft.com/en-us/blog/now-available-relay-load-balancing-for-windows-azure-service-bus/

dlstucki commented 7 years ago

The rendezvous algorithm is the same for WCF Relays and HybridConnections. The key takeaway is that each listener is picked randomly. This gives a fairly even distribution across all listeners.
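
As a rough illustration of the "fairly even distribution" point, the following Python sketch simulates uniform random picks across 25 listeners. It assumes every listener is healthy and accepts on the first attempt, which is a simplification; `simulate_distribution` is a hypothetical helper and says nothing about how the service is implemented.

```python
import random
from collections import Counter


def simulate_distribution(listener_count: int, connections: int, seed: int = 0) -> Counter:
    """Count how many simulated connections land on each listener index."""
    rng = random.Random(seed)              # seeded so the run is repeatable
    counts: Counter = Counter()
    for _ in range(connections):
        # Uniform random pick, mirroring "pick a random index into the list".
        counts[rng.randrange(listener_count)] += 1
    return counts


counts = simulate_distribution(listener_count=25, connections=100_000)
expected = 100_000 / 25                    # an even share per listener
worst = max(abs(counts[i] - expected) / expected for i in range(25))
print(f"largest deviation from an even share: {worst:.1%}")
```

With many connections, each listener's share should stay close to the even share, though with only a handful of connections the split can look quite uneven.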

When it "gives up" there are several different exceptions:

jtaubensee commented 7 years ago

After a little more thought, I'm struggling to see the need for this article. We could even add the load balancing part as an FAQ item. Any objections to holding off on this one?

sethmanheim commented 7 years ago

Note that we have this, too, but it's buried inside Messaging info: https://docs.microsoft.com/en-us/azure/service-bus-messaging/service-bus-architecture#processing-of-incoming-relay-requests.

jfggdl commented 4 years ago

This is an ask to add documentation around load balancing among listeners for Hybrid Connections and WCF Relay according to the description above from David. The only place known where we have something around this topic is at https://docs.microsoft.com/en-us/azure/service-bus-relay/relay-hybrid-connections-protocol#listen-message.