Closed by benlangfeld 10 years ago
So I'll start by outlining my understanding of how Voxeo PRISM clusters are intended to handle this:
uri
attributes of <join>
commands, in addition to the outer IQ#to
indicating the target of the call.
The inter-node bridging seems simple enough. It does require consideration of the security implications of the implicit join - these calls would need to be authenticated appropriately between the nodes. I think we can realistically include that in this spec (although at a reasonably high level) and expect nodes and deployments to implement it correctly.
I don't think we can expect nodes to implement call migration; this is significantly more complex and I'm not sure there's any standardisation of it generally. It has overlaps with failover should a node die with live calls. Additionally, I am somewhat uncomfortable about the ramifications of exposing semi-explicit control of load-balancing to untrusted clients; it's conceivable that a client could sufficiently weight calls onto individual nodes beyond the control of normal load-balancing to take an individual node down, causing major cascading failure.
I'm left in favour of specifying the semantics of inter-node joins at a high level. Thoughts?
I had to solve a similar problem at my last job by using inter-node joins. It works fine. We can also reduce the need for this by using dial w/ nested join. Then, the gateway could inspect the request and put both calls on the same node.
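To illustrate the dial-with-nested-join idea, here is a minimal Python sketch of how a gateway might inspect such a request. It assumes the `urn:xmpp:rayo:1` namespace and a `call-uri` attribute on `<join>`; the function name `colocation_target` and the `call_locations` lookup table are hypothetical, and the addresses are made up.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Optional

RAYO_NS = "urn:xmpp:rayo:1"  # assumed Rayo namespace


def colocation_target(dial_xml: str, call_locations: Dict[str, str]) -> Optional[str]:
    """Return the node hosting the call named in a nested <join>, if any.

    call_locations is a hypothetical gateway table: call URI -> node name.
    """
    dial = ET.fromstring(dial_xml)
    join = dial.find(f"{{{RAYO_NS}}}join")
    if join is None:
        return None  # plain dial, nothing to co-locate with
    # 'call-uri' is assumed to identify the call to be joined.
    return call_locations.get(join.get("call-uri"))


dial = (
    "<dial xmlns='urn:xmpp:rayo:1' to='tel:+15550100' from='tel:+15550101'>"
    "<join call-uri='xmpp:call1@node-a.example.com'/>"
    "</dial>"
)
locations = {"xmpp:call1@node-a.example.com": "node-a"}
target = colocation_target(dial, locations)
```

A `None` result just means the gateway falls back to its normal load-balancing.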
My only other thought was that there could be some kind of "call group" hint that you assign to outbound calls and that is assigned to inbound calls. Calls in the same group could then be given preference for the same node (though not required).
I'd rather steer away from the "call group" / node hint idea since it marks a clear explicit DoS vector. I'd also insist on making any implicit grouping by nested joins strictly an optional (MAY) optimisation, since it'll require some more advanced load-balancing/monitoring to avoid DoS attacks.
I'll write this up today.
Written up here. Critique please, @mmcguinn / @crienzo? Have I missed anything out here? Have I been too vague? Perhaps we need to wait until this has an implementation to be sure?
I disagree about the possible DoS vector. A gateway implementation would not be required to use any hints, especially if it decided there were too many calls on that server. It just seems unfortunate that most screened follow-me calls will be joined on different servers.
I don't think it's in the scope of this document to require a secure channel between nodes. Are you attempting to define how two different Rayo node implementations can join calls?
I'm happy for nested joins to be used as a hint for co-location on a node and included that in the text. I guess we could expand that to a general hint with the same semantics, and as you say make it optional for the gateway to comply. I'll work that in.
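As a rough sketch of what "optional for the gateway to comply" could look like, the Python below honours a co-location hint only while the hinted node is under a load threshold, and otherwise picks the least-loaded node. The `Node` class, `pick_node` and the 0.8 threshold are all hypothetical; nothing here is taken from the spec text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    active_calls: int
    max_calls: int

    @property
    def load(self) -> float:
        return self.active_calls / self.max_calls


def pick_node(nodes: List[Node], hint: Optional[str] = None,
              hint_threshold: float = 0.8) -> Node:
    """Choose a node for a new call, treating the hint as advisory only."""
    candidates = [n for n in nodes if n.active_calls < n.max_calls]
    if not candidates:
        raise RuntimeError("no node has spare capacity")
    if hint is not None:
        for n in candidates:
            # Honour the hint only while the node is comfortably loaded,
            # so clients cannot pile calls onto a single node.
            if n.name == hint and n.load < hint_threshold:
                return n
    # Normal load-balancing: the least-loaded node wins.
    return min(candidates, key=lambda n: n.load)


nodes = [Node("a", 5, 10), Node("b", 2, 10)]
hinted = pick_node(nodes, hint="a").name
unhinted = pick_node(nodes).name
```

Because the hint never overrides the threshold, a client weighting calls towards one node gets no more influence than normal load-balancing allows.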
As for the secure channel between nodes, I mean purely that nodes should ensure that only calls from trusted nodes are allowed to behave as these bridge proxies which automatically join a real call. Implementations are free to utilise any auth they like, be it simple firewalls, SIP Digest, etc. I'll make this clearer in the text, since I suspect you believed I meant encryption.
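For example, the simplest firewall-style variant of this check might look like the Python sketch below. The network range, function names and return values are hypothetical, and a real deployment could equally use SIP Digest or something else.

```python
import ipaddress
from typing import Set

# Hypothetical allow-list of networks that cluster nodes live on.
TRUSTED_NODE_NETWORKS = [ipaddress.ip_network("10.0.0.0/24")]


def is_trusted_node(source_ip: str) -> bool:
    """True if the bridging call arrives from a known cluster node."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in TRUSTED_NODE_NETWORKS)


def handle_bridge_call(source_ip: str, target_call_id: str,
                       live_calls: Set[str]) -> str:
    """Decide whether an incoming call may act as a bridge proxy."""
    if not is_trusted_node(source_ip):
        return "reject"  # untrusted sources never trigger the implicit join
    if target_call_id not in live_calls:
        return "reject"  # nothing on this node to join to
    return "join"


live = {"call-1"}
from_peer = handle_bridge_call("10.0.0.7", "call-1", live)
from_outside = handle_bridge_call("203.0.113.9", "call-1", live)
```

The point is only that the implicit join is gated on the source being a trusted node, not that the channel is encrypted.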
I've addressed both of those comments. This is starting to look better. Rendered version at http://ci.mojolingo.com/job/Rayo%20Spec/283/artifact/extensions/rayo-clustering.html
Looks good to me. Michael, any comments?
Looks good to me as well, that covers the only real issue I came up with going over flows.
On Tue, Apr 22, 2014 at 4:06 PM, Chris Rienzo notifications@github.com wrote:
Looks good to me. Michael, any comments?
Reply to this email directly or view it on GitHub: https://github.com/rayo/xmpp/pull/93#issuecomment-41088914
From @mmcguinn: