eclipse-californium / californium

CoAP/DTLS Java Implementation
https://www.eclipse.org/californium/
Other
729 stars 365 forks source link

Californium initiates a DTLS handshake when running on server side #2235

Closed bggov closed 7 months ago

bggov commented 7 months ago

We're running a LWM2M server based on Leshan and Californium. Many of the clients connected to the server are using secured connection trough DTLS, but on some of the clients, maybe because of some network issues, the client suddenly deregisters from the server and immediately terminates the DTLS session. After that the server tries to send a successful deregistration response to the client, but as the client has already closed the DTLS session, this leads Californium to try to do a new handshake with the client.

The client is built based on RFC 9147 document, which describes that the client initiates the handshake sending a DTLS ClientHello request to the server, but not the server. That's why the client doesn't handle the server handshake request, but this looks annoying on the client's side.

Currently I didn't found a proper way to control Californium whether it's has to initiate a DTLS handshake or not. That's why when Californium is running on a server side and DTLS session was closed by the client, I would like to ask you:

Please see also the attached log snippet below:

2024-03-13 05:32:42.365 TRACE 22307 --- [DTLS-Receiver-491-0.0.0.0/0.0.0.0:5784] o.e.californium.scandium.DTLSConnector   : Received 1 DTLS records from 10.10.10.10:4059 using a 16490 byte datagram buffer
2024-03-13 05:32:42.365 TRACE 22307 --- [DTLS-Receiver-491-0.0.0.0/0.0.0.0:5784] o.e.californium.scandium.DTLSConnector   : connection available for /10.10.10.10:4059,null
2024-03-13 05:32:42.365 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#195] o.e.californium.scandium.DTLSConnector   : Received DTLS record of type [Application Data (23)], length: 35, [epoche:1,rseqn:10]
2024-03-13 05:32:42.365 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#195] o.e.c.s.dtls.DtlsAeadConnectionState     : decrypt: 19 bytes
2024-03-13 05:32:42.366 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#195] o.e.c.s.dtls.DtlsAeadConnectionState     : nonce: 2BCE90B7000100000000000A
2024-03-13 05:32:42.366 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#195] o.e.c.s.dtls.DtlsAeadConnectionState     : adata: 000100000000000A17FEFD0013
2024-03-13 05:32:42.366 DEBUG 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#195] o.e.c.scandium.dtls.DTLSContext          : Updated receive window with sequence number [10]: new upper boundary [10], new bit vector [11111111111]
2024-03-13 05:32:42.366 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#195] o.e.californium.scandium.DTLSConnector   : Received APPLICATION_DATA for DTLS(10.10.10.10:4059,ID:54494D453A)
2024-03-13 05:32:42.366 DEBUG 22307 --- [CoapServer(main)#600] o.e.c.c.n.d.SweepDeduplicator            : add exchange for KeyMID[10.10.10.10:4059-2919]
2024-03-13 05:32:42.366 TRACE 22307 --- [CoapServer(main)#600] o.e.l.s.c.registration.RegisterResource  : DELETE received : NON-DELETE MID= 2919, Token=C8, OptionSet={"Uri-Path":["rd","ha9fR21fra"]}, <empty data>
2024-03-13 05:32:42.382  INFO 22307 --- [CoapServer(main)#600] d.DeviceTenantMapperInventoryServiceImpl : Searching for tenant mapper managed object with endpoint: test:client:1234
2024-03-13 05:32:42.382  INFO 22307 --- [CoapServer(main)#600] d.DeviceTenantMapperInventoryServiceImpl : Searching for tenant mapper managed object with external id ID [type=c8y_lwm2m_dri_id, value=test:client:1234]
2024-03-13 05:32:42.389 TRACE 22307 --- [DTLS-Receiver-495-0.0.0.0/0.0.0.0:5784] o.e.californium.scandium.DTLSConnector   : Received 1 DTLS records from 10.10.10.10:4059 using a 16490 byte datagram buffer
2024-03-13 05:32:42.389 TRACE 22307 --- [DTLS-Receiver-495-0.0.0.0/0.0.0.0:5784] o.e.californium.scandium.DTLSConnector   : connection available for /10.10.10.10:4059,null
2024-03-13 05:32:42.389 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#193] o.e.californium.scandium.DTLSConnector   : Received DTLS record of type [Alert (21)], length: 18, [epoche:1,rseqn:11]
2024-03-13 05:32:42.389 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#193] o.e.c.s.dtls.DtlsAeadConnectionState     : decrypt: 2 bytes
2024-03-13 05:32:42.389 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#193] o.e.c.s.dtls.DtlsAeadConnectionState     : nonce: 2BCE90B7000100000000000B
2024-03-13 05:32:42.389 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#193] o.e.c.s.dtls.DtlsAeadConnectionState     : adata: 000100000000000B15FEFD0002
2024-03-13 05:32:42.389 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#193] o.e.californium.scandium.DTLSConnector   : Processing WARNING ALERT from [10.10.10.10:4059]: CLOSE_NOTIFY
2024-03-13 05:32:42.389 DEBUG 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#193] o.e.c.scandium.dtls.DTLSContext          : Updated receive window with sequence number [11]: new upper boundary [11], new bit vector [111111111111]
2024-03-13 05:32:42.389 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#193] o.e.californium.scandium.DTLSConnector   : send ALERT Alert Protocol
Level: WARNING
Description: CLOSE_NOTIFY
 for peer 10.10.10.10:4059.
2024-03-13 05:32:42.389 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#193] o.e.c.s.dtls.DtlsAeadConnectionState     : encrypt: 2 bytes
2024-03-13 05:32:42.389 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#193] o.e.c.s.dtls.DtlsAeadConnectionState     : nonce: 97BF2856000100000000000A
2024-03-13 05:32:42.389 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#193] o.e.c.s.dtls.DtlsAeadConnectionState     : adata: 000100000000000A15FEFD0002
2024-03-13 05:32:42.389 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#193] o.e.c.s.dtls.DtlsAeadConnectionState     : ==> 18 bytes
2024-03-13 05:32:42.428 DEBUG 22307 --- [CoapServer(main)#600] o.e.l.s.r.RegistrationHandler            : Deregistered client: Registration [registrationDate=Wed Mar 13 05:20:19 UTC 2024, identity=Identity /10.10.10.10:4059[x509=test:client:1234], lifeTimeInSec=432000, smsNumber=null, lwM2mVersion=1.1, bindingMode=[U], queueMode=false, endpoint=test:client:1234, id=ha9fR21fra, objectLinks=[</6/0>, </4>;ver=1.0, </4/0>, </1/0>, </3/0>], additionalRegistrationAttributes={}, rootPath=/, supportedContentFormats=[], supportedObjects={}, availableInstances=[], lastUpdate=Wed Mar 13 05:20:19 UTC 2024, applicationData={}]
2024-03-13 05:32:42.428 DEBUG 22307 --- [CoapServer(main)#600] o.e.c.c.network.stack.ReliabilityLayer   : Exchange[R446608, 10.10.10.10:4059] send response null-2.02   MID=   -1, Token=null, OptionSet={}, <empty data>
2024-03-13 05:32:42.428 TRACE 22307 --- [CoapServer(main)#600] o.e.c.c.network.stack.ReliabilityLayer   : Exchange[R446608, 10.10.10.10:4059] set response type to NON (request was NON)
2024-03-13 05:32:42.428 TRACE 22307 --- [CoapServer(main)#600] o.e.californium.core.network.Exchange    : Exchange[R446608, 10.10.10.10:4059, complete]!

java.lang.Throwable: Exchange[R446608, 10.10.10.10:4059, complete]
    at org.eclipse.californium.core.network.Exchange.setComplete(Exchange.java:1041)
    at org.eclipse.californium.core.network.UdpMatcher.sendResponse(UdpMatcher.java:191)
    at org.eclipse.californium.core.network.CoapEndpoint$OutboxImpl.sendResponse(CoapEndpoint.java:933)
    at org.eclipse.californium.core.network.stack.BaseCoapStack$StackBottomAdapter.sendResponse(BaseCoapStack.java:236)
    at org.eclipse.californium.core.network.stack.ReliabilityLayer.sendResponse(ReliabilityLayer.java:150)
    at org.eclipse.californium.core.network.stack.BlockwiseLayer.sendResponse(BlockwiseLayer.java:760)
    at org.eclipse.californium.core.network.stack.ObserveLayer.sendResponse(ObserveLayer.java:140)
    at org.eclipse.californium.core.network.stack.AbstractLayer.sendResponse(AbstractLayer.java:74)
    at org.eclipse.californium.core.network.stack.ExchangeCleanupLayer.sendResponse(ExchangeCleanupLayer.java:95)
    at org.eclipse.californium.core.network.stack.BaseCoapStack$StackTopAdapter.sendResponse(BaseCoapStack.java:194)
    at org.eclipse.californium.core.network.stack.BaseCoapStack.sendResponse(BaseCoapStack.java:101)
    at org.eclipse.californium.core.network.CoapEndpoint.sendResponse(CoapEndpoint.java:720)
    at org.eclipse.californium.core.network.Exchange.sendResponse(Exchange.java:503)
    at org.eclipse.californium.core.server.resources.CoapExchange.respond(CoapExchange.java:539)
    at org.eclipse.californium.core.server.resources.CoapExchange.respond(CoapExchange.java:355)
    at org.eclipse.leshan.server.californium.registration.RegisterResource.handleDeregister(RegisterResource.java:265)
    at org.eclipse.leshan.server.californium.registration.RegisterResource.handleDELETE(RegisterResource.java:122)
    at org.eclipse.californium.core.CoapResource.handleRequest(CoapResource.java:267)
    at org.eclipse.leshan.core.californium.LwM2mCoapResource.handleRequest(LwM2mCoapResource.java:51)
    at org.eclipse.californium.core.server.ServerMessageDeliverer.deliverRequest(ServerMessageDeliverer.java:133)
    at org.eclipse.californium.core.network.stack.BaseCoapStack$StackTopAdapter.receiveRequest(BaseCoapStack.java:204)
    at org.eclipse.californium.core.network.stack.AbstractLayer.receiveRequest(AbstractLayer.java:84)
    at org.eclipse.californium.core.network.stack.AbstractLayer.receiveRequest(AbstractLayer.java:84)
    at org.eclipse.californium.core.network.stack.BlockwiseLayer.receiveRequest(BlockwiseLayer.java:556)
    at org.eclipse.californium.core.network.stack.ReliabilityLayer.receiveRequest(ReliabilityLayer.java:296)
    at org.eclipse.californium.core.network.stack.AbstractLayer.receiveRequest(AbstractLayer.java:84)
    at org.eclipse.californium.core.network.stack.BaseCoapStack.receiveRequest(BaseCoapStack.java:126)
    at org.eclipse.californium.core.network.CoapEndpoint$1.receiveRequest(CoapEndpoint.java:307)
    at org.eclipse.californium.core.network.UdpMatcher$2.run(UdpMatcher.java:289)
    at org.eclipse.californium.elements.util.SerialExecutor$1.run(SerialExecutor.java:293)
    at org.eclipse.californium.core.network.CoapEndpoint$6.run(CoapEndpoint.java:1364)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:829)

2024-03-13 05:32:42.428 DEBUG 22307 --- [CoapServer(main)#600] o.e.californium.core.network.Exchange    : Remote Exchange[R446608, 10.10.10.10:4059, complete] completed NON-2.02   MID=39795, Token=C8, OptionSet={}, <empty data>!
2024-03-13 05:32:42.428 TRACE 22307 --- [CoapServer(main)#600] o.e.californium.scandium.DTLSConnector   : connection available for /10.10.10.10:4059,null
2024-03-13 05:32:42.428 TRACE 22307 --- [CoapServer(main)#600] o.eclipse.californium.core.coap.Message  : Message transfer completed NON-2.02   MID=39795, Token=C8, OptionSet={}, <empty data>
2024-03-13 05:32:42.429 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#200] o.e.californium.scandium.DTLSConnector   : Sending application layer message to [DTLS(10.10.10.10:4059,ID:54494D453A)]
2024-03-13 05:32:42.429 DEBUG 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#200] o.e.c.scandium.dtls.ClientHandshaker     : handshake started dtls-con: CID=D423B9, 10.10.10.10:4059, is alive
2024-03-13 05:32:42.429 DEBUG 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#200] o.e.c.scandium.dtls.Connection           : Handshake with [10.10.10.10:4059] has been started
2024-03-13 05:32:42.429 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#200] o.e.c.scandium.dtls.DTLSFlight           : Prepare flight 1, using max. datagram size 1436, max. fragment size 16384 [mhm=false, mr=true]
2024-03-13 05:32:42.429 DEBUG 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#200] o.e.c.scandium.dtls.DTLSFlight           : Add CLIENT_HELLO (1) message of 128 bytes for [10.10.10.10:4059]
2024-03-13 05:32:42.429 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#200] o.e.c.scandium.dtls.DTLSFlight           : Effective max. datagram size 1436, max. message size 1423
2024-03-13 05:32:42.429 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#200] o.e.c.scandium.dtls.DTLSFlight           : Sending record of 141 bytes to peer [10.10.10.10:4059]:
==[ DTLS Record ]==============================================
Content Type: Handshake (22)
Peer address: null
Version: 254, 253
Epoch: 0
Sequence Number: 0
Length: 128 bytes
Fragment:
    Handshake Message
    Type: CLIENT_HELLO (1)
    Message Sequence No: 0
    Length: 116 bytes
        Version: 254, 253
    <SOME DATA>
===============================================================

2024-03-13 05:32:42.429 DEBUG 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#200] o.e.c.scandium.dtls.DTLSFlight           : Sending datagram of 141 bytes to peer [10.10.10.10:4059]
2024-03-13 05:32:42.429 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#200] o.e.c.scandium.dtls.ClientHandshaker     : Sending flight of 1 message(s) to peer [10.10.10.10:4059] using 1 datagram(s) of max. 1436 bytes and 10000 ms timeout.
2024-03-13 05:32:42.429 TRACE 22307 --- [DTLS-Worker-0.0.0.0/0.0.0.0:5784#200] o.e.c.scandium.dtls.DTLSFlight           : handshake flight to peer 10.10.10.10:4059, retransmission 10000 ms.

Information ahead to speed up the processing of issues:

boaks commented 7 months ago

Which peer acts as DTLS client or server (or CoAP client or server), may not match the expectations, if a "role exchange" is used. Californium controls the supported DTLS roles of an peer with DtlsConfig.DTLS_ROLE and supports with DtlsRole: CLIENT_ONLY, SERVER_ONLY and BOTH classic single roles and role exchanges as well. From the past (years ago) I know, that in LwM2M there was discussions, if such an role exchange makes sense or not. AFAIK, leshan is configured to use that "DTLS role exchange" but for more details you may better ask the Leshan project.

bggov commented 7 months ago

Hello @boaks, thank you the quick response and for the advise. Setting the SERVER_ONLY property works for our case

sbernard31 commented 7 months ago

The client is built based on RFC 9147 document, which describes that the client initiates the handshake sending a DTLS ClientHello request to the server, but not the server. That's why the client doesn't handle the server handshake request, but this looks annoying on the client's side.

I think there is a lot of confusion here.

The DTLS specification defines wording :

But this is only DTLS wording.

In you case when you say client in a general way, but I guess you mean LWM2M client. And a LWM2M client doesn't necessarily act as a client for all lower layer.

E.g. :

For DTLS this is less clear, the LWM2M specification v1.0.x says : "The client-server roles of DTLS, which indicate who initiates the DTLS handshake, are independent from the client-server relationship of LwM2M."

LWM2M v1.1.x says : "In LwM2M version 1.1 the LwM2M Client is always the TLS/DTLS client." (But it seems it is only about LWM2M client / LWM2M Bootstrap server relation)

So just to says that RFC 9147 doesn't say anything about who should initiate a handshake at CoAP or LWM2M level.

For a LWM2M client, only LWM2M specification should define this and AFAIK this is still unclear which DTLS role should have a LWM2M client or server. (more discussion about that at : https://github.com/OpenMobileAlliance/OMA_LwM2M_for_Developers/issues/410)

boaks commented 7 months ago

I think there is a lot of confusion here.

That may even start with RFC 9147, while Californium/Scandium implements RFC6347.

Anyway, if someone has doubts about the benefit from initiating handshakes from the LwM2M server (maybe, because the LwM2M client is not able to act as DTLS server), then it may be switched off, if the upper layer allows to do so.

So the main question is, if leshan (the upper layer) supports to run the LwM2M server as DTLS SERVER_ONLY or not. But that's a question for the leshan project and therefore I recommended to ask the leshan project for more details.

sbernard31 commented 7 months ago

That may even start with RFC 9147, while Californium/Scandium implements RFC6347.

Right.

Anyway, if someone has doubts about the benefit from initiating handshakes from the LwM2M server (maybe, because the LwM2M client is not able to act as DTLS server), then it may be switched off, if the upper layer allows to do so.

I agree that role exchange is not always useful. (e.g. In Leshan it is deactivated for Bootstrap Server) And I would be happy to have a better solution than this for https://github.com/OpenMobileAlliance/OMA_LwM2M_for_Developers/issues/410.

So the main question is, if leshan (the upper layer) supports to run the LwM2M server as DTLS SERVER_ONLY or not. But that's a question for the leshan project and therefore I recommended to ask the leshan project for more details.

Short answer : yes it does. You can switch DTLS_ROLE globally when using Californium Endpoint Provider. And you can even do that request by request.

bggov commented 7 months ago

I can also confirm that Leshan support setting DTLS_ROLE or any other Californium properties settings. We're using our own implementation of the Leshan server and it also any Californium properties so we can configure both Leshan and Californium.

In our case the client (I mean LWM2M client, but not Leshan LWM2M) is implemented in a way so it operates only in a client mode. That's why the understanding from client side regarding RFC 9147 is that the client is supposed to send ClientHello handshak requests, but not to handle the same requests sent from the other side. In this case the understanding for the LWM2M server is that it operates from a DTLS perspective only as a server. That's why having the the possibility to set the LWM2M server as a DTLS SERVER_ONLY is an extraordinary feature, and having the possibility to set this trough the Leshan is really great! Thank you again @boaks and @sbernard31 for the support and the followup :+1:

boaks commented 7 months ago

operates only in a client mode.

Just to be precise: any LwM2M client implements also a CoAP-server ;-).

Though UDP doesn't require to stick to a specific client or server role (in opposite to TCP), a implementation may decide to support such a DTLS role exchange or not. The left question is then more, if the ip-layer will transport the ip-message to the other peer and in many network setups, this is only valid for a short period without message exchanges. That's the pain, and therefore the DTLS role exchange isn't used too frequently. Anyway, the most other DTLS implementation are derived from a TLS implementation and therefore they stick to the roles as they are used from TLS and so didn't consider to implement a role exchange. Both, the ip-layer and the TLS-implementation limits then cause, that it is not common.

sbernard31 commented 7 months ago

Thank you again @boaks and @sbernard31 for the support and the followup 👍

You're welcome :slightly_smiling_face:

Both, the ip-layer and the TLS-implementation limits then cause, that it is not common.

Agree with that.