reTHINK-project / core-framework

The main goal of WP3 is to provide the reTHINK core framework comprised by the runtime environment where Hyperties are executed and the messaging nodes used to support messages exchange between Hyperties.
Apache License 2.0

Messaging Node should be tolerant to unstable connections #15

Open pchainho opened 9 years ago

pchainho commented 9 years ago

When connections to the Messaging Node are resumed after a short disconnection, or when the client IP address changes, e.g. due to an access network handover (e.g. Wi-Fi to LTE), there should be no impact on the client-side service.

rjflp commented 9 years ago

This requirement should be further detailed. What is the goal?

When the client's IP address changes, unless an IP mobility scheme is used, any TCP connection will break and a new one will have to be established. How long this takes depends on whether the client is aware of the IP change or has to wait for a timeout. After a timeout, the client can establish a new connection and continue, provided there is a way for it to identify itself.

Is this all there is to this requirement? Being able to create a new connection and continue from where we left off? Or is there something else?

pchainho commented 9 years ago

There is something else: at the end the idea is to keep the same (Messaging Channel) session even if a new TCP connection needs to be established.

This feature is also useful to support session mobility between network interfaces and also between devices

rjflp commented 9 years ago

So this is basically being able to "login" using an ID or token or something else, in order to continue from where we left off. It is the same whether we are moving to a new network or a new device.

But the problem remains: how to find out that the IP changed without having to wait for a timeout, which may take almost a minute.

pchainho commented 9 years ago

I would say login would not be needed, all Messages would be just sent with a valid token associated to the session.

I guess this can be done using a heartbeat mechanism or similar at the session level, keeping these procedures out of the service logic. Let's see what solutions are out there...
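To make the idea concrete, here is a minimal sketch of a session that is bound to a token rather than to a TCP connection, with a session-level heartbeat used to notice a dead path sooner than a transport timeout would. The class name `SessionRegistry`, its parameters and the heartbeat values are assumptions for illustration only; nothing here is taken from the reTHINK code base.

```python
import secrets
import time


class SessionRegistry:
    """Hypothetical registry: sessions are keyed by token, not by client address."""

    def __init__(self, heartbeat_interval=5.0, missed_allowed=2, clock=time.monotonic):
        self.heartbeat_interval = heartbeat_interval  # seconds between heartbeats
        self.missed_allowed = missed_allowed          # heartbeats that may be lost
        self.clock = clock                            # injectable for testing
        self.sessions = {}                            # token -> session record

    def open_session(self, addr):
        """Create a session and return the token the client must attach to messages."""
        token = secrets.token_urlsafe(16)
        self.sessions[token] = {"addr": addr, "last_seen": self.clock()}
        return token

    def on_message(self, token, addr):
        """Handle any message or heartbeat; returns False for an unknown token."""
        session = self.sessions.get(token)
        if session is None:
            return False
        # Rebind the session to whatever address delivered this message,
        # so a new TCP connection or a new IP continues the same session.
        session["addr"] = addr
        session["last_seen"] = self.clock()
        return True

    def is_connection_suspect(self, token):
        """True once the peer has been silent for longer than the allowed window."""
        silence = self.clock() - self.sessions[token]["last_seen"]
        return silence > self.heartbeat_interval * self.missed_allowed
```

With a 5-second heartbeat and two tolerated misses, a broken path would be flagged after roughly 10 seconds of silence. Whether rebinding the address without further checks is acceptable is exactly the security question raised later in this thread.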

rebecca-copeland commented 9 years ago

Hi There, Messaging/signalling servers do not need user logins, of course. I think that 'unstable connections' means something else - it means that the line is unstable, i.e. the Internet connection can be severed during the session - since it is carried over the open Internet that is only 'best effort'. The signalling server always has to stay stateful for the session, in order to maintain consistent dialogue. For this requirement to be fulfilled, there must be some functionality that enables fast recovery of sessions when the 'line' goes down, so the IP addresses, sequence numbering of the dialogue, previous ACK/NACK responses and so on should be dynamically stored... My Best  Rebecca Copeland
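As a rough illustration of the state such fast recovery would need, the sketch below keeps a per-session sequence counter and the messages that have not yet been acknowledged, and replays them when the session is resumed. `SessionState` and its methods are hypothetical names, and the cumulative-ACK scheme is an assumption rather than something specified in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical per-session dialogue state kept by the signalling server."""

    next_seq: int = 0
    unacked: dict = field(default_factory=dict)  # seq -> payload awaiting ACK

    def send(self, payload):
        """Assign the next sequence number and remember the message until ACKed."""
        seq = self.next_seq
        self.unacked[seq] = payload
        self.next_seq += 1
        return seq

    def ack(self, seq):
        """Cumulative ACK: forget every message up to and including seq."""
        for acked in [s for s in self.unacked if s <= seq]:
            del self.unacked[acked]

    def resume(self, last_seq_seen_by_peer):
        """After a reconnect, return the messages the peer still needs, in order."""
        self.ack(last_seq_seen_by_peer)
        return sorted(self.unacked.items())
```

On reconnect the peer reports the last sequence number it received (or -1 if none), and only the missing messages are sent again.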


rebecca-copeland commented 9 years ago

Telco signalling should be able to note that there is no response during the session signalling dialogue and time out sooner. This should force a refresh function, provided that the signalling status is 'remembered'. The Internet connection can be wobbly for many reasons, not necessarily a change of IP address. If a new IP address is required, it is not safe to do this without re-registering. In Telco signalling (and SIP), handsets re-register periodically anyway, even if everything is fine.

This re-registering action is one reason why I questioned the ability of the IdP to do this reliably - and why no CSP would wish to depend on a third party! I raised this issue from the very first architecture network diagram.

We also need to remember that this is NOT a Telco network, and there is no cell handover! This is actually 'fixed wireless' communication - if the device moves out of range, the session is no longer valid. Some Telecom press call it: 'WiFi Calling'. The clever bit is to do it well enough despite using the Internet best effort network!... Skype have been doing a pretty good job of that (so it is feasible), but their signalling is proprietary and undisclosed.

... To my mind, this is one of the biggest challenges of reTHINK.. it is the signalling server - not signalling NODE, and not signalling GATEWAY (protocol-on-the-fly)... but the signalling software that occurs via the endpoint and the network to the other endpoint...

I also have some concerns about saying that this is performed entirely between peers - because you need the re-registration. This is another big challenge for reTHINK!

In our reTHINK solution, we improve on the media flow via the media gateway (TURN) routing, which is already limiting the routing to go only where such gateways are installed... but the signalling should remain over the open Internet, at least before initiating media flow, in order to retain the main advantage - global reach. This is why I question the idea of a 'messaging layer' made of specialised messaging nodes! This should be the last resort - first try to do this on the open Internet, by means of fast recovery.

Signalling DURING the session (fast recovery) can run on the TURN network, along the same path of the media - this may provide some resilience. I am not sure exactly what such TURN gateways are capable of... This means that there is no messaging layer, but there is session initiation signalling on open Internet (for global reach) by the 'signalling server' on the CSP platform, and then mid-session signalling co-located on the media layer (for in-session resilience). Note that passing control to the endpoint to refresh signalling is asking for trouble - it is very easy to subvert. Personally, I would advise against it.

What do you think?

My Best

Rebecca


rebecca-copeland commented 9 years ago

Hi Paulo, I am not sure how you preserve a 'messaging channel' if the Internet connection is unstable. How would you realign it with a new IP address, without all the checks of a new service request?

Handing over a Telco session due to mobility is notoriously difficult. IMS has a special service - VCC (Voice/Video Call Continuity). Admittedly, it is harder to retain session settings (same QoS, charging...) as well as avoiding any blips in the media, when two carriers are involved and the media has to be changed over to flow through different media gateways. On the Internet, if a single CSP is used for both ends, it is a lot easier - just a refresh, reaffirming the session particulars. If two CSPs are involved, there has to be a special signalling dialogue for a restart of an on-going session.

Some years ago we demonstrated handing over of a session between devices, e.g. video-streaming onto the TV. However, this also has some serious security issues - e.g. hijacking the session by a hacker to a device that belongs to someone else - great for illegal watching of real-time football, for example. The demand for it, compared with the cost of doing it securely, was questioned.

If a new IP address is needed due to the endpoint moving, it may be impossible (and unwise) to continue without rechecking the session parameters, including the authentication. In Telco networks, the media will continue to flow even if the signalling is severed, and I guess it is the same on the Internet. If the IP address is the same, you do not need to disturb the media. However, most TURN GWs are also NAT/PAT, and they will 'object' to an unannounced change of IP address!

So, I think that if the device moves out of range, a new IP address has to be allocated and a new session has to be established - Fixed Wireless style, not fully mobile. This has some implications for a file download, to avoid having to start from the beginning (which most FTP vendors have already resolved). It will also cause a gap/stutter in speech and video streaming, but this is unavoidable.

I am guessing that SKYPE overcomes some of these problems by re-analysing the session buffers and sending some packets (possibly re-sequencing them too), despite losses and delays. This is the kind of 'slur' in speech that we experience when the Internet connection is unstable. It is part of a mind-set of 'good-enough' versus 'guaranteed QoS'. BTW - clever buffering techniques are a good area for innovative (and openly published) research - or for software differentiation for CSPs.

My Best  Rebecca Copeland


pchainho commented 9 years ago

This requirement is about the "messaging channel session" (we have to find a good term for this) ie the session that is associated with the channel that is established between the end-user device and the messaging server. Something similar to SIP Registration procedure but that should be as much as possible independent from the IP/TCP procedures.

In my perspective, as long as the security token associated with the Messaging Channel is valid, messages can be exchanged independently of what happens with connectivity. But opinions from partners with more expertise in security are needed here. I could imagine that stronger security policies may imply token revalidation in certain connectivity situations, like an IP address change, but I would prefer to keep it flexible and set by security policies from the Service Provider managing the Messaging Server (e.g. the CSP). E.g. it should be possible to use identities from an external IdP but apply authorisation policies set by the CSP.

In my opinion we should be able to handle communication handover between different access networks in non-manageable networks eg between WIFI and LTE.

I agree that the decision to "refresh signaling" (revalidation of token) should be enforced by the messaging server according to applicable authorisation policies.
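A minimal sketch of how a messaging server might apply such a policy to each incoming message follows. All names (`Policy`, `Session`, `check_message`, the two policy fields) are invented for illustration, and the specific rules (token age, revalidation on IP change) are assumptions about what a CSP might configure, not an agreed design.

```python
import hmac
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ACCEPT = "accept"
    REVALIDATE = "revalidate"  # server asks the client to refresh its token
    REJECT = "reject"


@dataclass
class Policy:
    """Hypothetical CSP-defined authorisation policy."""
    revalidate_on_ip_change: bool = False
    max_token_age_s: float = 3600.0


@dataclass
class Session:
    token: str
    issued_at: float
    last_ip: str


def check_message(session, token, source_ip, now, policy):
    """Decide, on the server side, what to do with a message carrying `token`."""
    # Constant-time comparison to avoid leaking token contents through timing.
    if not hmac.compare_digest(token, session.token):
        return Decision.REJECT
    if now - session.issued_at > policy.max_token_age_s:
        return Decision.REVALIDATE
    if source_ip != session.last_ip:
        if policy.revalidate_on_ip_change:
            return Decision.REVALIDATE
        session.last_ip = source_ip  # lenient policy: follow the client silently
    return Decision.ACCEPT
```

Keeping the decision in one server-side function means the client never chooses when a refresh is needed, which matches the point that revalidation should be enforced by the messaging server.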

I'm just afraid that we might be talking about two different sessions:

So far I've been talking about the first one (messaging channel session)

Peer communication session is another topic that probably deserves another requirement, but I would say the same principles should apply, with the big difference that we may have several CSPs involved in a single communication session with different and, perhaps, conflicting policies. In such situations, and as a principle, I would suggest that the policies of the communication (or conversation) host prevail (i.e. the communication peer that is providing the messaging server).

The usage of "specialised network services" (with TURN servers according to ORANGE) to support peer communication is not mandatory and, in my perspective, they are separate from the "messaging layer" and only used for each communication session between peers (calls). In case communication policies imply full control of the peer communication (not just because of QoS, but e.g. because the payment model is pre-paid or because of lawful interception), it might be necessary to route the media through the "specialised network services".

I suspect the change of IP address during peer communication session will mandate new negotiation between peers with or without the usage of specialised net services. However I think the need to re-authenticate will depend on applicable authorisation policies (security experts your opinion is needed here!).

antonroman commented 9 years ago

Hello,

sorry in advance for this comment as it is not directly related to the issue.

Perhaps we should consider the use of QUIC as the transport protocol, as it fits this requirement pretty well. QUIC relies on a 64-bit Connection ID instead of IP address/port tuples to identify the connection at transport level. Connections can be kept open even if the client IP changes. This feature was added to the protocol with mobile environments in mind. Additionally, QUIC enforces encryption, and the number of RTTs needed to create/resume a connection is minimized.

QUIC protocol is a "transport layer" over UDP and it is currently being used by Chrome to interact with Google Apps (I don't know if it's currently used by more services). HTTP/2 is going to become a definitive RFC in a few months and it's used in production by many relevant Internet companies. HTTP/2 "works" much better over QUIC so it is also very likely to be adopted by the IETF in short-term.

So including the use of QUIC as a requirement or a recommendation could help to support mobility scenarios more reliably. The transport layer connectivity it provides is more suitable than TCP for wireless connections (longer RTTs, packet loss and changes at IP level). Having said that, using QUIC would add technical overhead to the implementations.

When a media session is ongoing, a change of IP requires an SDP re-negotiation, so we can't leverage QUIC features for media. However, QUIC would be helpful in all the scenarios at signaling level.
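The difference between the two ways of identifying a connection can be shown with a toy model. This is not QUIC itself: the classes below are hypothetical, and they leave out encryption, handshakes and the checks a real implementation performs before trusting a new client address. They only show why a lookup keyed by connection ID survives an IP change while a lookup keyed by address does not.

```python
class TupleKeyedServer:
    """Toy model of TCP-style demultiplexing: state is found by (ip, port)."""

    def __init__(self):
        self.connections = {}  # (ip, port) -> state

    def accept(self, addr):
        self.connections[addr] = {"messages": 0}

    def receive(self, addr):
        """Returns the connection state, or None if the address is unknown."""
        state = self.connections.get(addr)
        if state is not None:
            state["messages"] += 1
        return state


class ConnectionIdKeyedServer:
    """Toy model of connection-ID demultiplexing: state is found by ID."""

    def __init__(self):
        self.connections = {}  # connection id -> state

    def accept(self, conn_id, addr):
        self.connections[conn_id] = {"addr": addr, "messages": 0}

    def receive(self, conn_id, addr):
        """Returns the connection state, updating the peer address if it moved."""
        state = self.connections.get(conn_id)
        if state is not None:
            state["addr"] = addr
            state["messages"] += 1
        return state


# A client moves from a Wi-Fi address to an LTE address mid-session.
wifi, lte = ("192.168.1.20", 40000), ("10.44.0.9", 51000)

tcp_like = TupleKeyedServer()
tcp_like.accept(wifi)
lost = tcp_like.receive(lte)              # None: unknown address, must reconnect

quic_like = ConnectionIdKeyedServer()
quic_like.accept(0x1234ABCD, wifi)
kept = quic_like.receive(0x1234ABCD, lte)  # same state, address updated
```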

BR

References:

rebecca-copeland commented 9 years ago

Hi,

This sounds good, though it is more suited to browsing (with short bursts of media) than to - say - media streaming or conversation. But it is encouraging that mobility issues are being looked at.

My Best

Rebecca


pchainho commented 9 years ago

Great!! @antonroman could you contribute with a SOTA about QUIC including an analysis on how reTHINK can consider it?

rebecca-copeland commented 9 years ago

My thoughts in line below.  My Best  Rebecca Copeland

  From: pchainho <notifications@github.com>

To: reTHINK-project/core-framework core-framework@noreply.github.com Cc: Rebecca Copeland rebecca.copeland@coreviewpoint.com Sent: Monday, 18 May 2015, 11:20 Subject: Re: [core-framework] Messaging Node should be tolerant to unstable connections (#15)

> This requirement is about the "messaging channel session" (we have to find a good term for this) ie the session that is associated with the channel that is established between the end-user device and the messaging server. Something similar to SIP Registration procedure but that should be as much as possible independent from the IP/TCP procedures.

What makes it a 'channel' that is independent of IP/TCP? SIP runs over IP, but I do not think that it creates channels... It maintains a dialogue, so that responses are sequenced correctly. Is this going to be a new SIP?

> In my perspective, as soon as the security token associated with the Messaging Channel is valid, messages can be exchanged independently of what happens with connectivity. But opinions from partners more experts in security is needed here. I could imagine, that stronger security policies may imply token revalidation in certain connectivity situations like IP address change but I would prefer to have it flexible and set by security policies from the Service Provides managing the Messaging Server (eg the CSP). Eg It should be possible to use Identities from external IdP and but to use authorisation policies set by the CSP.

Stronger authentication means different procedures, different credentials and keys, mutual authentication (server to user, not just user to server), etc. Stronger security beyond authentication may involve different levels of encryption. Security requirements should consider both parties, not just the handling CSP, so this needs to be negotiated at session initiation... but this requires CSP-CSP signalling, which you want to avoid.

> In my opinion we should be able to handle communication handover between different access networks in non-manageable networks eg between WIFI and LTE.

That will be interesting. You mean a peer-to-peer 'Internet offload' that occurs independently of what MNOs are already doing - or in spite of the carrier's decision? That will create havoc! Also, what decision making process is there to decide WHEN to do it, what triggers it? ...and if this is still peer-to-peer, how do you propose to handle the charging implications on-the-fly?

> I agree that the decision to "refresh signaling" (revalidation of token) should be enforced by the messaging server according to applicable authorisation policies.

There are two different processes here. One is 'capping' that is handled at the application level, triggered by usage monitoring, and another is 'refreshing signalling', which is NOT handled by the application layer, but in the signalling layer. Refreshing is about confirming that the on-going session is still running between the correct endpoints etc., especially for long streaming sessions, and is also used for mid-session CDRs... well - this is in the world of IMS and SIP...

> I'm just afraid that we might be talking about two different sessions:


antonroman commented 9 years ago

@pchainho I added an analysis of QUIC to the SOTA folder: https://github.com/reTHINK-project/core-framework/blob/master/docs/sota/quic.md

Regarding the rest of topics discussed in the issue, I include a list with some proposals and things to be considered:

  1. we could use the term "messaging service registration"(*) to refer to the mentioned "messaging channel session". Through this registration the Messaging Service should be aware that an End User Service is connected to it, and there will be a mechanism to send bi-directional asynchronous notifications. This session is not meant to create a multimedia/data communication; it is a way to authenticate both sides and let the messaging server know where to send notifications.
  2. we could use the term "communication session" to refer to the set of signaling messages and the media or data streams exchanged by peers connected to Messaging services.
    1. within each "messaging service registration" an "HTTP session cookie"-like mechanism could be used to authenticate all the subsequent requests. This would make sessions independent from lower connectivity levels. Risks associated with cookies must be considered: http://resources.infosecinstitute.com/risk-associated-cookies/
  3. The mechanism above requires the use of encryption (TLS or QUIC); otherwise, if someone gets the cookie, the End-User device could be easily impersonated.
  4. When the End-user device authenticates itself during the registration it should rely on a third-party IdP. WebRTC includes a mechanism to provide authentication based on Third-Party Identity Assertions of the SDPs. This matches the Signaling on the fly paradigm pretty well, since we'll always send an SDP when we contact the messaging service of the End-user being called. The IdP WebRTC API is still not available in the browsers AFAIK, so we are likely not to be able to use it in this project. On the other hand, this alternative is not valid for authenticating the End-User services against the messaging node (we don't have an SDP), but we can find an equivalent approach. I would use OpenID or a solution based on OAuth 2.0. This will mean that we have to use HTTP as the application protocol, but I guess that this is something that we have already assumed, haven't we?
  5. All the messaging service nodes should have a certificate trusted by all the End User and Network Services. We could consider other mechanisms to provide authentication; in fact I would be happy to learn about other alternatives.
  6. What happens when an IP changes in mobility scenarios?: A. At media level: DTLS requires a re-negotiation. The "security link" between the signaling and the DTLS-SRTP media session is a fingerprint in the SDP. We can consider that the media session is always going to be secure, and we can consider it authenticated as well if the signaling channel which transmits the SDP is secure and authenticated. In short, the media is confidential by default and its authentication depends on the signaling.

    B. At signaling level: after an IP change TLS will require a new complete TLS handshake. The server side authentication is provided by the server certificate. The client authentication can be provided at session layer by a client certificate (leveraging mutual authentication of TLS) or at application level by re-using the "messaging service registration" cookie obtained in the initial registration.

  7. Using QUIC would make it easier to handle IP changes at signaling level and it would reduce latency in registrations and signaling messages. The downside is that it will require either increasing the development hours or reducing the complexity of the use cases to be implemented. Maybe we could implement some tiny example using QUIC to check if it offers real advantages over TCP. On the other hand, it would add an additional innovative touch to the project.
  8. Regarding the security policies to be used in a "communication session" we should define two categories: mandatory and preferred. A. if any participant does not support a mandatory policy then the communication cannot be started. B. if any participant supports a preferred policy then it must use it; if not, the call can be started anyway. What I would not do is force the policies of the messaging server of the called participant, as they can be less restrictive than the policies of the calling party.

(*) I would use the term subscription instead of registration if the entity which subscribes can only receive notifications (e.g. mobile devices send subscriptions to the Push Notification services of Android and Apple), but in this case I understand that the entity which subscribes may also use that channel to send messages to the messaging server.
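The mandatory/preferred split proposed in item 8 can be sketched as a small negotiation function. The names (`Participant`, `negotiate`) and the example policy labels used in any call to it are hypothetical. One interpretation is also assumed here: a preferred policy is applied only when every participant supports it, which is one possible reading of rule B.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """Hypothetical view of one party's security policies."""
    name: str
    supported: frozenset
    mandatory: frozenset = frozenset()
    preferred: frozenset = frozenset()


def negotiate(participants):
    """Return the set of policies to apply, or None if the session cannot start."""
    if not participants:
        raise ValueError("at least one participant is required")
    mandatory = frozenset().union(*(p.mandatory for p in participants))
    preferred = frozenset().union(*(p.preferred for p in participants))
    common = frozenset.intersection(*(p.supported for p in participants))
    # Rule A: every mandatory policy must be supported by every participant.
    if mandatory - common:
        return None
    # Rule B (as interpreted above): preferred policies are used when all
    # participants support them, and silently skipped otherwise.
    return mandatory | (preferred & common)
```

Because the mandatory sets of all parties are merged, the stricter side always wins, so a less restrictive messaging server on the called side cannot weaken the caller's requirements.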

rebecca-copeland commented 9 years ago

Anton et al, We should stick with common terms where possible. In communication circles:

please note that we should distinguish between the web apps processes and the communication services.

These terms should be added to the architecture document (my task), but please - help to improve on the above definitions. My Best  Rebecca Copeland


AnastasiusGavras commented 9 years ago

Hello Rebecca, all

On the definition of terms, I was wondering if a separation is needed for “customer” vs. “subscriber”

The customer is associated with the contract (signatory to it), including agreement on fees. The subscriber is associated with credentials, preferences, etc.

In the retail and consumer markets, the customer is often the subscriber. However this is not the case when the customer is, for example, a company and the subscribers are the employees of the company, that are entitled to use the service.

Best

Tasos


ingofriese commented 9 years ago

One thing should be said: we are using somewhat "old" Telco-world wording when we talk about subscription (on the web it is registration) and registration (in the web world it is rather login). We can do this for a reason... but to me web wording would be more appropriate, because we simply reuse web IdM technologies. Thoughts?

rebecca-copeland commented 9 years ago

Hi Ingo,

The issue of Telecom words versus Internet words applies to the whole project. The same wording choice is also when talking about signalling versus messaging, for example... On one hand, we are dealing with real-time communication - not what the Internet was originally built for, and on the other - we deal with users coming to it from web applications, not what communication software expects (it is the other way around in Telecom).

What I suggested was using terms from SIP, which has its roots in Internet Voice - not Telecom, but has been used by now to describe all the necessary terms in packet based communications... and it is now being deployed as the main communication protocol in both fixed and mobile systems - so not 'old'.

The language you propose does not cover what is needed for full real-time communications. For example: If we use the word login (not register), how would you describe re-registration (for mobility) - re-login? 'Login' still means something else - the user's act of registering to the network service - which on the Internet is a user action of clicking and entering password, but in Telecom, registering is an automatic process when the device is switched on, and re-registering is also automatic... We are merging two worlds not only in words, but also in concepts. You may invent new words - indeed, this project has done so already - but personally I prefer to stay with accepted terms, if possible.

This said, the reTHINK documentation should make this clear.  So? Which terminology? Decisions please...

My Best  Rebecca Copeland


pchainho commented 9 years ago

Hi

Can we move this terminology discussion to a WP2 issue, pls?

In the meantime I'm working on a definition of terms specific to WP3 and also on the description of some procedures, taking into account some input from this thread.

cheers