eclipse-californium / californium

CoAP/DTLS Java Implementation
https://www.eclipse.org/californium/

Do not identify observation by token only. #173

Closed sbernard31 closed 6 years ago

sbernard31 commented 7 years ago

Currently, an Observation is identified by its token only in the observation store.

We should identify it by token + a peer identifier.

I recommend that the peer identifier be:

As there is no real consensus on this, I suppose we should find a way to allow users to bring their own behavior.

sophokles73 commented 7 years ago

Ok, here's a proposal:

we introduce a new interface EndpointIdentifier which we use in place of InetSocketAddress everywhere throughout the system. We can then have multiple implementations of it, e.g. one based on InetSocketAddress when using UDP only and another one based on the DTLS client identity or session ID. We might even evolve the CorrelationContext into this concept.
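A rough sketch of that idea (the interface and class names here are placeholders, not existing Californium API):

import java.net.InetSocketAddress;

// Transport-neutral identity of a peer endpoint (sketch only)
interface EndpointIdentifier {
}

// Plain UDP: the peer is identified by its socket address
class UdpEndpointIdentifier implements EndpointIdentifier {
    private final InetSocketAddress address;
    UdpEndpointIdentifier(InetSocketAddress address) { this.address = address; }
    InetSocketAddress getAddress() { return address; }
    // equals()/hashCode() would be based on the address
}

// DTLS: the peer is identified by its session id (or client identity)
class DtlsEndpointIdentifier implements EndpointIdentifier {
    private final byte[] sessionId;
    DtlsEndpointIdentifier(byte[] sessionId) { this.sessionId = sessionId; }
    // equals()/hashCode() would be based on the session id
}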

WDYT?

sbernard31 commented 7 years ago

It sounds good.

sophokles73 commented 7 years ago

Just to be clear, this basically means replacing the InetAddress/port combination everywhere throughout Californium/Scandium with a source and destination EndpointIdentifier. E.g. the Message class would no longer have getSourceAddress() or getSourcePort() methods but only a getSourceEndpointIdentifier() method, which would not necessarily contain an IP address and port at all, e.g. if the transport were SMS or NB-IoT, where addressing is not IP based AFAIK ...

Still sounds like a good idea? My feeling is that we would need to change around 30 percent of the classes in the project ... don't get me wrong, I believe this is the right thing to do but it will take some time to actually do it ...

sbernard31 commented 7 years ago

I thought you just planned to change the way we identify observations, not all messages. I understand the need to add a sourceEndpointIdentifier, but I'm not sure it is a good idea to replace data like sourceAddress/sourcePort. That seems to be valuable data.

sophokles73 commented 7 years ago

That data would still be available, but only as part of the EndpointIdentifier, i.e. you could do something like UdpEndpointIdentifier.getAddress() to retrieve the peer's IP address/port.

boaks commented 7 years ago

Though currently the observe is the trigger, I could live with reducing the scope to that for a first step. But in general I think it should later be used consistently.

sbernard31 commented 7 years ago

Let's think about the API.

Currently the API looks like this:

// create request
Request request = new Request(Code.GET);
request.setObserve();
request.setURI("my/path");
request.setDestination(destinationAddress);

// listen for new notifications
endpoint.addNotificationListener(new NotificationListener() {
    @Override
    public void onNotification(Request request, Response response) {
        System.out.println("Get new notification :" + response);
    }
});

// send request
endpoint.sendRequest(request);

// receive response which acknowledge observation relation
Response response = request.waitForResponse();

// do some stuff ...

// cancel observation
endpoint.cancelObservation(request.getToken());

What would it look like with this new kind of observation identifier (token + endpoint identifier)?

// create request
Request request = new Request(Code.GET);
request.setObserve();
request.setURI("my/path");
request.setDestination(destinationAddress);

// listen for new notifications
endpoint.addNotificationListener(new NotificationListener() {
    @Override
    public void onNotification(Request request, Response response) {
        System.out.println("Get new notification :" + response);
    }
});

// send request
endpoint.sendRequest(request);

// receive response which acknowledge observation relation
Response response = request.waitForResponse();

// do some stuff ...

// cancel observation:
// this one could be used when there is no authentication or if you keep a reference to the response
endpoint.cancelObservation(response.getSourceEndpointIdentifier(), request.getToken());
// or
endpoint.cancelObservation(new PSKIdentifier(pskIdentity), request.getToken());
// or
endpoint.cancelObservation(new RPKIdentifier(publicKey), request.getToken());
// or
endpoint.cancelObservation(new X509CommonNameIdentifier(commonName), request.getToken());
// or
endpoint.cancelObservation(new SocketIdentifier(peerAddress), request.getToken());

I think all of this is also linked to issue #104. With the same idea we can imagine something like this:

// create request
Request request = new Request(Code.GET);
request.setURI("my/path");
request.setDestination(destinationAddress);

// send request
endpoint.sendRequest(new PSKIdentifier(pskIdentity), request);

or

// create request
Request request = new Request(Code.GET);
request.setURI("my/path");
request.setDestinationIdentifier(new PSKIdentifier(pskIdentity));
request.setDestination(destinationAddress);

// send request
endpoint.sendRequest(request); 

Then ideally the application layer should be notified if the request is not delivered because the identity does not match.

request.addMessageObserver(new MessageObserverAdapter() {
    @Override
    public void onResponse(Response response) {
    }

    @Override
    public void onDropped(String reason) {   
         // or maybe an onDropped(Exception reason) signature
    }
});

WDYT ?

boaks commented 7 years ago

So you would add the "EndpointIdentifier" to the response?

This indicates to me that you don't have the EndpointIdentifier when you create the request. But when creating the request, we need to reserve a token, right? If we want <token + EndpointIdentifier> to be unique, how can we generate a token if we don't have the EndpointIdentifier?

sophokles73 commented 7 years ago

Based on @sbernard31's comment regarding keeping the scope of this issue limited, I am currently only using CorrelationContext as an additional qualifier for identifying an endpoint. For that, I have added two properties to Message (sourceEndpoint and destinationEndpoint). When we send a request out to a peer, the Connector sets the destinationEndpoint property to the CorrelationContext instance created by the connector. For the UDPConnector I have added a UdpCorrelationContext which simply wraps the peer's InetSocketAddress. When we receive a message, the Inbox copies the CorrelationContext passed in via the RawData into the Message received. In ObservationStore I have added the CorrelationContext to the method signatures, e.g. ObservationStore.get(CorrelationContext, byte[]) instead of just ObservationStore.get(byte[]). This way we can use arbitrary types of CorrelationContext for scoping the tokens to e.g. just the peer's address (using UdpCorrelationContext) or a DTLS session (using DtlsCorrelationContext).
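For illustration, the changed store signatures would look roughly like this (a sketch derived from the description above, not the final API):

// Sketch: tokens are scoped by the peer's CorrelationContext
interface ObservationStore {
    void add(CorrelationContext context, Observation observation);
    Observation get(CorrelationContext context, byte[] token);
    void remove(CorrelationContext context, byte[] token);
}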

boaks commented 7 years ago

I still don't understand how we can create a unique <token + EndpointIdentifier> pair if we don't know the EndpointIdentifier. Or is it intended that tokens are unique by themselves?

sophokles73 commented 7 years ago

No, the tokens are only unique within the context of the peer endpoint. This is one of the problems we want to address in this issue. I have pushed a branch with an idea of how we might approach this. Open for comments ...

sbernard31 commented 7 years ago

Currently, the TokenProvider creates tokens that are unique across all peer endpoints. If we want to make them unique per peer only, @boaks points out a real problem with the current way we do it, I mean reserving and then releasing tokens.

But if we finally aim for tokens unique per peer, I think using random generation is enough (without reservation). How many exchanges per peer at the same time? 1, 5, 10, maybe 50? The chances of collision are really low: 50 out of 2^64 = 1.844674407×10¹⁹ possibilities.

sbernard31 commented 7 years ago

I looked at the branch; the whole CorrelationContext is used to identify the peer? If I want to use less data to identify my observation, do I need to create my own ObservationStore which will clean the CorrelationContext (I mean just keeping the keys which interest me)?

If I understand correctly, this API seems a bit less intuitive than what I propose above, but it also requires less code modification. So maybe it's OK.

boaks commented 7 years ago

Or, instead of removing keys, add an additional matcher method to CorrelationContextMatcher.

sophokles73 commented 7 years ago

Currently, the TokenProvider creates tokens that are unique across all peer endpoints. If we want to make them unique per peer only, @boaks points out a real problem with the current way we do it, I mean reserving and then releasing tokens.

I seem to have been mistaken regarding the tokens :-) We seem to have agreed some time ago that the token space would be large enough to be shared by all peers. If we only consider the problem of making sure that a notification received over DTLS can be matched with the corresponding entry in the ObservationStore on a single node then this should actually work.

However, considering the case we introduced the ObservationStore for in the first place, i.e. fail-over between nodes, the problem still seems to exist.

FMPOV we could/should defer setting the token (and maybe the MID as well) to when we have established the destination CorrelationContext. This, however, poses another problem: if the CorrelationContext can be determined at the Scandium level only (i.e. because the client wants to send an initial request) then the token will already have been set by the Matcher because the RawData structure contains the encoded message (not the Message instance) :-(. Maybe we should get rid of the Connector interface as well and turn the connector implementations into Layers which are part of the CoapStack. That would probably also allow us to get rid of the multiple Matchers because the job of encoding and decoding would be done by the (already transport specific) connectors ...

sophokles73 commented 7 years ago

Having taken a look again at InMemoryRandomTokenProvider, it seems I have not been mistaken after all :-) We are using Exchange.KeyToken instances as keys within the token map and these contain both an InetSocketAddress (of the peer) as well as the token itself. I am actually wondering whether we can't keep it like this. All we want to make sure is that tokens are not re-used with a peer before they have been released, right? With standard requests originating locally, there should be no problem. If we receive a response then the token is released, if we don't get a response (e.g. because the peer's IP address has changed) the token is released due to the request timing out. The interesting case therefore is with observations where the server's IP address changes over time. We still want to be able to release the token once the observation has been canceled. For that, we would need to keep track of the original IP address of the server which was used when the observe relation has been established in the first place. We could simply add it to the CorrelationContext that we register the Observation under in the ObservationStore. When the client later cancels the observe relation it will probably use a CorrelationContext containing the DTLS session id as the primary identifying property of the peer. However, once the Observation has been found within the store using the session ID, we can use the also contained IP address and port to release the token. Or am I mistaken?

sophokles73 commented 7 years ago

@sbernard31,

If I understand correctly, this API seems a bit less intuitive than what I propose above, but it also requires less code modification.

It is actually quite the same if you keep in mind that CorrelationContext is just an interface and you could e.g. have PskIdentityBasedCorrelationContext or X509BasedCorrelationContext.

In most cases you wouldn't even care about the particular type/instance because the CorrelationContext instance is provided by the connector when a message is received and is propagated up the stack by means of the sourceEndpoint property of Message.

sophokles73 commented 7 years ago

We still want to be able to release the token once the observation has been canceled. For that, we would need to keep track of the original IP address of the server which was used when the observe relation has been established in the first place. We could simply add it to the CorrelationContext that we register the Observation under in the ObservationStore. When the client later cancels the observe relation it will probably use a CorrelationContext containing the DTLS session id as the primary identifying property of the peer. However, once the Observation has been found within the store using the session ID, we can use the also contained IP address and port to release the token. Or am I mistaken?

We are actually doing this already because we are keeping the original Request as part of the Observation as well which contains the original peer address. And we are also using this address to release the token already.

sbernard31 commented 7 years ago

Ouch, it's hard to answer all of this.

If we receive a response then the token is released, if we don't get a response (e.g. because the peer's IP address has changed) the token is released due to the request timing out.

Not exactly, as timeout is just for ACK, not for response.

The interesting case therefore is with observations where the server's IP address changes over time. We still want to be able to release the token once the observation has been canceled.

Ok, but the idea was to avoid token collisions, not just to be able to release tokens ^^. If we use address+token, you can have the same device using 2 different addresses and the same token. The collision is not detected... As I explained above, this kind of collision is extremely unlikely, so from my point of view we can just start without this reserve/release mechanism for the TokenProvider.

For that, we would need to keep track of the original IP address of the server which was used when the observe relation has been established in the first place. We could simply add it to the CorrelationContext that we register the Observation under in the ObservationStore.

It sounds really strange to me oO.

It is actually quite the same if you keep in mind that CorrelationContext is just an interface and you could e.g. have PskIdentityBasedCorrelationContext or X509BasedCorrelationContext.

Not exactly. Currently there is a lot of information in the CorrelationContext. Intuitively, when you implement your ObservationStore you will use the whole context as the endpoint key (as you described in the javadoc), but most of the time we don't want to do that. So in the observation store, you need to extract from this context what you want to use as the key. Not so good in terms of separation of concerns.

What I have in mind is that EndpointIdentifier and CorrelationContext are two different concepts. The endpoint identifier is a CoAP concept; CorrelationContext is more a lower-layer concept (I would even rename it to ConnectionContext or something like that). Californium would use this CorrelationContext to extract an endpoint identifier (on receiving a message) or translate an endpoint identifier into a CorrelationContext (on sending a message). We could define an interface for the class responsible for this translation, so anybody can use whatever they want from the context as the identifier.

I'm not sure this is clear enough ...
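Maybe a minimal sketch helps (interface and method names are placeholders only):

// Sketch: pluggable translation between the transport-level context
// and the CoAP-level endpoint identifier.
interface EndpointIdentityResolver {
    // on receiving a message: derive the peer identifier from the context
    EndpointIdentifier extractIdentifier(CorrelationContext context);
    // on sending a message: express the expected peer identity as a
    // context the lower layer can match against
    CorrelationContext toContext(EndpointIdentifier identifier);
}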

sophokles73 commented 7 years ago

If we receive a response then the token is released, if we don't get a response (e.g. because the peer's IP address has changed) the token is released due to the request timing out.

Not exactly, as timeout is just for ACK, not for response.

You're right, this is exactly what @joemag1 pointed out in #73 (which we never fixed BTW).

sophokles73 commented 7 years ago

Ok, but the idea was to avoid token collisions, not just to be able to release tokens ^^. If we use address+token, you can have the same device using 2 different addresses and the same token. The collision is not detected... As I explained above, this kind of collision is extremely unlikely, so from my point of view we can just start without this reserve/release mechanism for the TokenProvider.

What kind of collision are we actually talking about and what is the scenario where we are facing a problem? Please keep in mind that we are identifying observe relations by means of the CorrelationContext, e.g. using the DTLS session ID in conjunction with the token. I do not really see where we run into problems with ambiguously using tokens multiple times for the same target address. Can you come up with an example scenario?

sbernard31 commented 7 years ago

Already explained, but I can try to do better. The purpose of the reserve/release API of the TokenProvider is to avoid token collisions at token generation. You said we can keep the TokenProvider API as it is, but currently the token provider only ensures there is no collision for the same address. With NAT this is a problem.

E.g.: Server SA sends an observe request to device DA (token AAA) at IP 10.0.0.1. This means we cannot have another request to 10.0.0.1 with token AAA. Now, because of NAT, DA gets a new IP (10.0.0.2). Server SA sends another observe request on another resource to DA; the token provider proposes AAA, which is OK as 10.0.0.1 and 10.0.0.2 are not the same IP.

I repeat: I don't consider this collision a real problem, as it will probably never happen. My conclusion is that if we succeed in identifying observations by endpointIdentifier+token, we no longer need to reserve/release tokens; generating them randomly is enough.

sbernard31 commented 7 years ago

I created a branch with what I have in mind (just for discussion).

boaks commented 7 years ago

My summary:

If everyone agrees on the "uniqueness of an 8 byte random token", then we should just make the TokenProvider optional.

If not: The ip-address change may add more complexity, e.g. see https://github.com/thomas-fossati/draft-tls-cid/issues/9

So assume:

Now S2 is mapped to A1 and we want to add an additional observe. I think, if we now want to use the (new) destination of the new observe request, we must take care not to create conflicts with the traffic to A1 so far (and no, no real details, but I forecast there will be issues :-) ).

So in my opinion, there are two consequences:

sophokles73 commented 7 years ago

Server SA sends an observe request to device DA (token AAA) at IP 10.0.0.1. This means we cannot have another request to 10.0.0.1 with token AAA. Now, because of NAT, DA gets a new IP (10.0.0.2). Server SA sends another observe request on another resource to DA; the token provider proposes AAA, which is OK as 10.0.0.1 and 10.0.0.2 are not the same IP.

Understood, this is a very good example. So, when using plain UDP (with peers behind NAT firewalls) we may run into such a situation and there is nothing we can do about it, right? In the example given, this means that notifications from 10.0.0.2 using token AAA will be processed, and notifications arriving from any peer having IP address 10.0.0.1 using token AAA will also be processed. We may want to introduce a lastSeen property on the reserved tokens so that we can determine if (and when) a token is no longer in use (e.g. after EXCHANGE_LIFETIME) so that it can be re-used.
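As a sketch (hypothetical names, not existing code), the bookkeeping for such a lastSeen property could look like:

// Sketch: track when a reserved token was last used, so it can be
// reclaimed after EXCHANGE_LIFETIME of inactivity.
class ReservedToken {
    final byte[] token;
    volatile long lastSeen = System.nanoTime(); // updated on each matching message

    ReservedToken(byte[] token) {
        this.token = token;
    }

    boolean isExpired(long exchangeLifetimeNanos) {
        return System.nanoTime() - lastSeen > exchangeLifetimeNanos;
    }
}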

So let's focus on the case where we communicate over TCP or DTLS. The only interesting case seems to be when there is no DTLS session established yet with the peer when an observe request is sent out. As I have already stated before

FMPOV we could/should defer setting the token (and maybe the MID as well) to when we have established the destination CorrelationContext. This, however, poses another problem: if the CorrelationContext can be determined at the Scandium level only (i.e. because the client wants to send an initial request) then the token will already have been set by the Matcher because the RawData structure contains the encoded message (not the Message instance) :-(. Maybe we should get rid of the Connector interface as well and turn the connector implementations into Layers which are part of the CoapStack. That would probably also allow us to get rid of the multiple Matchers because the job of encoding and decoding would be done by the (already transport specific) connectors ...

sbernard31 commented 7 years ago

1) @sophokles73, @boaks: about the TokenProvider: if an observation is identified by identity+token, do you think there is a real risk of collision with just a randomly generated token?

2) @sophokles73, I agree with @boaks: it doesn't make sense to send a request to an unknown peer using a secure connection. What do you think?

3) @sophokles73, about lastSeen: do we really want to support this use case? I mean using an unsecured connection in an "IP changing" context like a NAT environment. I think we can just say we don't support this.

4) @boaks, @sophokles73, did you have a look at my branch? I will continue in that direction; I hope this is not the wrong way.

boaks commented 7 years ago

@sbernard31

I looked at it. But I think we must first make the basic decision whether an "8 byte random" token is unique enough. If we assume that, then I would really vote for making a simpler TokenProvider ("optional" was the wrong term). Then your branch would become somewhat "obsolete".

If we want to exclude the nasty effects of token collisions, then we should agree that the identity must be passed in along with the request. If we agree on that, I think your branch will change, but not too much. In general it points in the right direction.

(So the pain from the token scope of RFC7252 (intended for short-term use) combined with the "long term usage" of RFC7641 grows :-) .)

sbernard31 commented 7 years ago

About the TokenProvider, it's hard to get your opinion :). Does it mean you think this is a problem or not? Or maybe this means you don't know? Personally, I already explained why I think this is not a problem: knowing that each peer will have its own "token scope" and knowing that we will probably not have that many simultaneous requests, I think we can consider 8 bytes unique enough. (2^64 = 1.844674407×10¹⁹ per peer)

boaks commented 7 years ago

About the TokenProvider, it's hard to get your opinion :). Does it mean you think this is a problem or not? Or maybe this means you don't know?

I "don't know" :-). But analysing a token collision would be hard.

sophokles73 commented 7 years ago

I think the token space is large enough to keep the risk of running out of tokens during runtime at a very low level. I cannot quite follow the argument that there is no risk simply because of the size of the token space. That seems to assume that the random function is distributed perfectly equally (which it probably isn't). I therefore do think that we need to prevent token collisions.

Given the sheer number of available tokens we could, however, just keep one instance of TokenProvider which keeps track of all tokens for all peers.

However, I think we need to make sure that we are not leaking tokens too easily, i.e. that all tokens are released eventually, either explicitly (if the exchange is complete or the observe relation is canceled) or implicitly because the token hasn't been used for a long period. This doesn't affect the need to qualify the tokens with the EndpointIdentifier in the ObservationStore, though, FMPOV.

sbernard31 commented 7 years ago

That seems to assume that the random function is distributed perfectly equally (which it probably isn't)

See https://docs.oracle.com/javase/7/docs/api/java/util/Random.html, nextLong(): "Returns the next pseudorandom, uniformly distributed long value from this random number generator's sequence."
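For illustration, reservation-free random token generation could then be as simple as this (a sketch, class name hypothetical):

import java.security.SecureRandom;

// Sketch: rely on randomness alone, no reserve/release bookkeeping.
class RandomTokenProvider {
    private static final int TOKEN_LENGTH = 8; // maximum CoAP token length (RFC 7252)
    private final SecureRandom random = new SecureRandom();

    byte[] createToken() {
        byte[] token = new byte[TOKEN_LENGTH];
        random.nextBytes(token);
        return token;
    }
}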

sbernard31 commented 7 years ago

Some math to help the decision: if we want to know the collision probability, this is like the birthday paradox.

We have 2^64 possibilities (the number of days in our "year" ^^). At Sierra, we have something like a maximum of 5 simultaneous requests, but let's suppose we have 100 requests (our number of people).

What is the chance of a collision? I was unable to find software able to evaluate the exact formula with such large numbers (even Wolfram). Using the approximation, the chance of collision is about 2.17×10^-16 for 100 requests.

(6.7×10^-19 for 5 requests)
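For reference, the usual closed-form birthday approximation (assuming n uniformly random 8-byte tokens drawn from N = 2^64 possibilities) is:

p(n) ≈ 1 − e^(−n(n−1)/(2N)) ≈ n(n−1)/(2N)

For n = 100 this gives a value on the order of 10^-16, consistent with the figures above.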

sbernard31 commented 7 years ago

I would like to move forward on this. @sophokles73, @boaks, do you still think that collision is a problem?

boaks commented 7 years ago

From my side, I can live with both:

a.) either rely on the uniqueness of a random token
b.) or generate a globally unique token by keeping a list.

If we go for a.) then I'm not sure why we should introduce EndpointIdentifier. Maybe just because I currently don't see that it is used for more than maintaining the "used tokens". Or do I overlook something? Or do you plan to use the EndpointIdentifier in a future extension? Because currently I also wonder how we can process a notify from a CoAP server with a changed address.

But I'm not sure how we would map a notification from a CoAP server with a changed IP address/port.

sbernard31 commented 7 years ago

You're right, if we go for ~a)~ b) we don't need the EndpointIdentifier for observations. But for clustering we will need a shared TokenProvider.

If we go for ~b)~ a) we don't need a shared TokenProvider and don't need all the reserve/release token stuff, but we need to identify observations by EndpointIdentifier+Token.

In any case I think we will need something like EndpointIdentifier for #104, if we want to be sure we send a request to the right peer.

About mapping an observation after a changed IP address/port: with ~a)~ b) we use the token as it is unique. With ~b)~ a) we use the token + some DTLS data (session ID or identity, ...) for coaps, and we don't support it for coap. That seems not to be a problem, as it does not make much sense to use unsecured CoAP behind a NAT.

boaks commented 7 years ago

I'm not sure if you have exchanged a) and b).

Because a) relies on "a random token is unique", why should a cluster then have special handling for token generation?

sbernard31 commented 7 years ago

Oops, I have exchanged a) and b). I was confused because you asked about the need for EndpointIdentifier with random generation. If we don't use EndpointIdentifier (so identify observations by token only) and we just use random generation, the chance of collision is higher. E.g. for 1,000,000 devices with 100 simultaneous requests each, the chance of collision is around 0.00027.

boaks commented 7 years ago

Just to mention: I can still imagine offering an additional API for an upper layer to adjust the address of an "identified endpoint". This may be useful for LWM2M, to adjust the address after a registration update. Even if this "breaks RFC7252 request/response rules", I think it would not be too bad if it's only done when the upper layer actively adjusts the address.

sophokles73 commented 7 years ago

FMPOV the problem is not that there will be some collisions, and according to @sbernard31's reference to the birthday problem the probability will be quite low. The interesting question for me is: what happens if there is a collision? Are we able to detect a collision and act accordingly? What is a proper reaction in case of a collision? As long as we do not have answers to these questions, AFAIC the low probability of occurrence is not a sufficient argument for not having tokens being qualified by the endpoint identifier.

As I have already laid out before, I guess we can make this work for both cases: if client code already has a grip on the target's EndpointIdentifier, then it can set it on the request. If not, then we can set the EndpointIdentifier when sending out the message at the connector layer.

sbernard31 commented 7 years ago

the low probability of occurrence is not a sufficient argument for not having tokens being qualified by the endpoint identifier.

This is not what I advocate. To be clear, I think:

sophokles73 commented 7 years ago

So we seem to agree that we want to qualify each token with the EndpointIdentifier, right?

sbernard31 commented 7 years ago

Yes, but if we do that, how do we deal with the token provider?

1. Remove reserve/release of tokens: random generation only. I vote for this one because it will simplify the code and the risk of collision is extremely low; we can even imagine a generator which does not work randomly and ensures there is no collision at all (like sequential generation or time-based token generation; a sketch follows after this list).

2. Reserve/release tokens per endpoint identifier. @boaks wisely points out that there is a chicken-and-egg problem here.

3. Reserve/release tokens across all peers, so 2 peers cannot use the same token. In this case I'm not sure we need to qualify the token by endpointIdentifier, as the token will be unique across all peers. The only reason I see to introduce the endpointIdentifier in that case is just to give users the choice to use random generation in their TokenProvider implementation. So more flexibility for users but more complexity in the Californium code base (as we would have to call release token correctly).
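A minimal sketch of such a collision-free generator (hypothetical class, counter-based):

import java.util.concurrent.atomic.AtomicLong;

// Sketch: collision-free tokens without reserve/release, by encoding a
// monotonically increasing counter into the 8 token bytes.
class SequentialTokenProvider {
    private final AtomicLong counter = new AtomicLong();

    byte[] createToken() {
        long value = counter.incrementAndGet();
        byte[] token = new byte[8];
        for (int i = 7; i >= 0; i--) { // big-endian encoding of the counter
            token[i] = (byte) (value & 0xFF);
            value >>>= 8;
        }
        return token;
    }
}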

sophokles73 commented 7 years ago

So, let me try to get this straight. We need to find answers to two questions:

  1. How do we prevent token collision in normal request/response interactions?
  2. How can we reliably associate a notification with an observe relation (even across client nodes and in the context of the server having its IP/port being changed)?

My understanding is that for the second question we agree to use the EndpointIdentifier in conjunction with the token used for the observe relation, right? So the idea would be that whenever we put an observe request into the ObservationStore, we register it under a composite key (endpoint identifier + token). For the endpoint identifier we would use e.g. the DTLS session id to make it independent from the (potentially changing) IP address/port.
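A sketch of such a composite key (names hypothetical):

import java.util.Arrays;
import java.util.Objects;

// Sketch: composite key scoping a token to a peer endpoint.
class ObservationKey {
    final EndpointIdentifier endpoint; // e.g. based on the DTLS session id
    final byte[] token;

    ObservationKey(EndpointIdentifier endpoint, byte[] token) {
        this.endpoint = endpoint;
        this.token = token;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ObservationKey)) {
            return false;
        }
        ObservationKey other = (ObservationKey) o;
        return endpoint.equals(other.endpoint) && Arrays.equals(token, other.token);
    }

    @Override
    public int hashCode() {
        return Objects.hash(endpoint, Arrays.hashCode(token));
    }
}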

So, now we are discussing how to deal with the first problem. My understanding here is that you, @sbernard31, propose to rely on the fact that collisions are very unlikely even if we simply create the tokens randomly (using the system's PRNG). Together with the fact that in the DTLS case we are using additional properties (session ID, cipher) to correlate a response with a request, I tend to agree with you that this is acceptable.

boaks commented 6 years ago

To me it seems that we still have an open/ongoing discussion.

And the current implementation also seems to require some more work (createUnusedToken creates a token unique per address/port, while the ObservationStore assumes token uniqueness across address/port; I think I remember that @sbernard31 already mentioned this as well).

@vikram919 points in https://github.com/eclipse/californium/issues/219#issuecomment-344191369 to the special token usage in multicast (which requires token uniqueness across address/port).

So I guess after 2.0.0-M6 we will have to "finalize this again" :-).

boaks commented 6 years ago

Closed with PR #521. Requests/observes and responses/notifies are mapped with unique tokens. Security is then provided accordingly by the endpoint context and endpoint context matcher.