eclipse-leshan / leshan

Java Library for LWM2M
https://www.eclipse.org/leshan/
BSD 3-Clause "New" or "Revised" License

OSCORE fallback use cases #1257

Open sbernard31 opened 2 years ago

sbernard31 commented 2 years ago

Using OSCORE, I found a pretty simple use case where the client will not be able to connect to the server anymore.

@rikard-sics any idea about this, and how we could handle it?

rikard-sics commented 2 years ago

Spontaneous reaction: This should be solved by making sure the Client uses the Appendix B.2 procedure defined in the OSCORE RFC here: https://datatracker.ietf.org/doc/html/rfc8613#appendix-B.2

It will re-establish a new OSCORE Security Context, meaning most importantly new Sender/Recipient Keys and a fresh replay window on the server. By doing that, the Client restarting from Sender Sequence Number 0 is neither Key & Nonce reuse nor a replay. (Note that the Appendix B.2 procedure changes the ID Context but keeps the same Sender/Recipient IDs.)
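To illustrate why a new ID Context yields new keys, here is a minimal sketch of the OSCORE key derivation (HKDF-SHA-256 over an `info` array that includes the ID Context, as described in RFC 8613 Section 3.2). It is not Californium code: the class and method names are made up for this example, the CBOR encoding only covers short values, and the second ID Context is an arbitrary value.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

class OscoreKeyDerivationSketch {

    /** Writes a CBOR byte string; this sketch only handles lengths below 24. */
    static void writeBstr(ByteArrayOutputStream out, byte[] value) {
        if (value.length >= 24) {
            throw new IllegalArgumentException("sketch only handles short byte strings");
        }
        out.write(0x40 | value.length);
        out.write(value, 0, value.length);
    }

    /** Builds the HKDF info array [id, id_context, alg_aead, type, L] as CBOR. */
    static byte[] info(byte[] id, byte[] idContext, int algAead, String type, int length) {
        if (algAead < 0 || algAead > 23 || length < 0 || length > 23) {
            throw new IllegalArgumentException("sketch only handles small unsigned integers");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(0x85); // array of 5 items
        writeBstr(out, id);
        if (idContext == null) {
            out.write(0xf6); // CBOR null when no ID Context is used
        } else {
            writeBstr(out, idContext);
        }
        out.write(algAead); // small unsigned integer
        byte[] typeBytes = type.getBytes(StandardCharsets.UTF_8);
        out.write(0x60 | typeBytes.length); // text string header
        out.write(typeBytes, 0, typeBytes.length);
        out.write(length); // small unsigned integer
        return out.toByteArray();
    }

    static byte[] hmac(byte[] key, byte[] data) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(data);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Derives a 16-byte key for AES-CCM-16-64-128 (COSE algorithm 10). */
    static byte[] deriveKey(byte[] masterSecret, byte[] masterSalt, byte[] id, byte[] idContext) {
        // HKDF-Extract; an empty salt is replaced by a string of zeros.
        byte[] salt = masterSalt.length == 0 ? new byte[32] : masterSalt;
        byte[] prk = hmac(salt, masterSecret);
        // HKDF-Expand; one block is enough for 16 bytes of output.
        byte[] info = info(id, idContext, 10, "Key", 16);
        byte[] block = Arrays.copyOf(info, info.length + 1);
        block[info.length] = 0x01;
        return Arrays.copyOf(hmac(prk, block), 16);
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] secret = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16};
        byte[] salt = {(byte) 0x9e, 0x7c, (byte) 0xa9, 0x22, 0x23, 0x78, 0x63, 0x40};
        byte[] senderId = new byte[0];
        byte[] newIdContext = {0x37, (byte) 0xcb, (byte) 0xf3, 0x21}; // arbitrary example value

        System.out.println("no ID Context : " + hex(deriveKey(secret, salt, senderId, null)));
        System.out.println("new ID Context: " + hex(deriveKey(secret, salt, senderId, newIdContext)));
    }
}
```

Same Master Secret, Master Salt and Sender ID, but a different ID Context, give a different Sender Key, which is why sequence number 0 can be used again after the procedure.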

So basically if the client loses mutable state information in the context (like the Sender Sequence Number), it must use this procedure the first time it resumes communication with the server. I believe in the current code the Client will at least use this procedure when it registers to the server.
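As a rough illustration of why losing the Sender Sequence Number is a problem, the sketch below builds the AEAD nonce the way RFC 8613 Section 5.2 describes it for a 13-byte nonce (Sender ID, Partial IV and Common IV combined). The class name and the example Common IV are assumptions for this example; this is not Californium's implementation.

```java
import java.util.Arrays;

class OscoreNonceSketch {

    static final int NONCE_LENGTH = 13; // nonce length of AES-CCM-16-64-128

    /** Builds the AEAD nonce from Sender ID, Sender Sequence Number and Common IV. */
    static byte[] nonce(byte[] senderId, long senderSeq, byte[] commonIv) {
        if (senderId.length > NONCE_LENGTH - 6 || commonIv.length != NONCE_LENGTH) {
            throw new IllegalArgumentException("unsupported Sender ID or Common IV length");
        }
        if (senderSeq < 0 || senderSeq >= (1L << 40)) {
            throw new IllegalArgumentException("Partial IV must fit in 5 bytes");
        }
        byte[] nonce = new byte[NONCE_LENGTH];
        // Byte 0: length of the Sender ID.
        nonce[0] = (byte) senderId.length;
        // Bytes 1..7: Sender ID, left-padded with zeros.
        int idOffset = 1 + (NONCE_LENGTH - 6 - senderId.length);
        System.arraycopy(senderId, 0, nonce, idOffset, senderId.length);
        // Last 5 bytes: Partial IV (the sequence number), left-padded with zeros.
        for (int i = 0; i < 5; i++) {
            nonce[NONCE_LENGTH - 1 - i] = (byte) (senderSeq >>> (8 * i));
        }
        // XOR with the Common IV.
        for (int i = 0; i < NONCE_LENGTH; i++) {
            nonce[i] ^= commonIv[i];
        }
        return nonce;
    }

    public static void main(String[] args) {
        byte[] commonIv = {0x46, 0x22, (byte) 0xd4, (byte) 0xdd, 0x6d, (byte) 0x94, 0x41,
                0x68, (byte) 0xee, (byte) 0xfb, 0x54, (byte) 0x98, 0x7c};
        byte[] senderId = new byte[0];

        byte[] beforeRestart = nonce(senderId, 0, commonIv);
        byte[] afterRestart = nonce(senderId, 0, commonIv); // counter lost, starts at 0 again
        System.out.println("nonce reused: " + Arrays.equals(beforeRestart, afterRestart));
    }
}
```

With an unchanged Security Context, a client that restarts at sequence number 0 produces a nonce it has already used with the same key; the server's replay window also rejects it.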

That being said I need to investigate this case to understand if the Client is trying to use Appendix B.2 in this case. If not, using it should solve this. If the client is trying to use it already, we need to investigate things further.

sbernard31 commented 2 years ago

> This should be solved by making sure the Client uses the Appendix B.2 procedure defined in the OSCORE RFC here: https://datatracker.ietf.org/doc/html/rfc8613#appendix-B.2

I don't know what should be done in the code to make sure that Appendix B.2 is used. :thinking:

> I believe in the current code the Client will at least use this procedure when it registers to the server.

I think it makes sense. But since you said "Note that the Appendix B.2 procedure changes the ID Context", what is the purpose of setting the Context ID in the OSCORE object (21) at bootstrap time? :thinking:

sbernard31 commented 2 years ago

@rikard-sics I would really appreciate if we can find a fix (or at least workaround) for this. :pray:

sbernard31 commented 2 years ago

I tried to investigate this.

> I don't know what should be done in the code to make sure that Appendix B.2 is used.

Looking at the Californium (cf) tests, it seems the Californium API for this may be:

// Enable context re-derivation functionality (in general)
ctx.setContextRederivationEnabled(true);
// Explicitly initiate the context re-derivation procedure
ctx.setContextRederivationPhase(PHASE.CLIENT_INITIATE);

So I tried to add this to InMemoryOscoreContextDB.deriveContext :

private static OSCoreCtx deriveContext(OscoreParameters oscoreParameters) {
    try {
        OSCoreCtx osCoreCtx = new OSCoreCtx(oscoreParameters.getMasterSecret(), true,
                oscoreParameters.getAeadAlgorithm(), oscoreParameters.getSenderId(),
                oscoreParameters.getRecipientId(), oscoreParameters.getHmacAlgorithm(), 32,
                oscoreParameters.getMasterSalt(), null, 1000);
        osCoreCtx.setContextRederivationEnabled(true);
        // 👇 I try to add this line below 👇
        osCoreCtx.setContextRederivationPhase(PHASE.CLIENT_INITIATE);
        return osCoreCtx;
    } catch (OSException e) {
        LOG.error("Unable to derive context from {}", oscoreParameters, e);
        return null;
    }
}

But then I'm not able to connect with OSCORE anymore, because InMemoryOscoreContextDB.getContext(byte[] rid, byte[] IDContext) raises: java.lang.IllegalArgumentException: Internal Leshan operations should always use a null ID Context.
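To make the failure mode concrete, here is a hypothetical reduction of such a lookup, reconstructed only from the exception message and the method signature above. The class name, the map and the String placeholder for the context are assumptions, not Leshan's actual code. Since Appendix B.2 works by sending a non-null ID Context, any lookup made during the procedure would hit a guard like this one.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

class NullIdContextStoreSketch {

    // Contexts indexed by Recipient ID only; String stands in for a real context object.
    private final Map<ByteBuffer, String> contextsByRid = new HashMap<>();

    void addContext(byte[] rid, String context) {
        contextsByRid.put(ByteBuffer.wrap(rid.clone()), context);
    }

    String getContext(byte[] rid, byte[] idContext) {
        // Guard equivalent to the one reported in the exception above.
        if (idContext != null) {
            throw new IllegalArgumentException(
                    "Internal Leshan operations should always use a null ID Context");
        }
        return contextsByRid.get(ByteBuffer.wrap(rid));
    }

    public static void main(String[] args) {
        NullIdContextStoreSketch db = new NullIdContextStoreSketch();
        byte[] rid = {0x01};
        db.addContext(rid, "context-for-rid-01");

        System.out.println(db.getContext(rid, null));
        try {
            db.getContext(rid, new byte[] {0x0a, 0x0b}); // what a B.2 request would trigger
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```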

So I tried to remove the check just for testing but this failed with :

org.eclipse.californium.scandium.dtls.cipher.InvalidMacException: MAC validation failed
    at org.eclipse.californium.scandium.dtls.cipher.CCMBlockCipher.decrypt(CCMBlockCipher.java:406)
    at org.eclipse.californium.scandium.dtls.cipher.CCMBlockCipher.decrypt(CCMBlockCipher.java:335)
    at org.eclipse.californium.cose.EncryptCommon.AES_CCM_Decrypt(EncryptCommon.java:134)
    at org.eclipse.californium.cose.EncryptCommon.decryptWithKey(EncryptCommon.java:74)
    at org.eclipse.californium.cose.Encrypt0Message.decrypt(Encrypt0Message.java:138)
    at org.eclipse.californium.oscore.Decryptor.decryptAndDecode(Decryptor.java:140)
    at org.eclipse.californium.oscore.RequestDecryptor.decrypt(RequestDecryptor.java:105)
    at org.eclipse.californium.oscore.ObjectSecurityLayer.prepareReceive(ObjectSecurityLayer.java:104)
    at org.eclipse.californium.oscore.ObjectSecurityLayer.receiveRequest(ObjectSecurityLayer.java:312)

In the end, I came to the conclusion that I can't really do this without someone who understands the OSCORE RFC and the Californium OSCORE code well, so for now I'm giving up on trying to do this alone.

@rikard-sics, I'm waiting for your help :pray: !

You're my only hope

rikard-sics commented 2 years ago

> @rikard-sics, I'm waiting for your help :pray: !

Hmm, it is hard to say what is going wrong from just a look at the output. I will try to take some time to test and see if I can figure out what is happening. Today is a bit tricky but I will try to do it as soon as I can.

JaroslawLegierski commented 1 year ago

@rikard-sics I reproduced the bug described in this issue and I have a first question: the second response from the server is a plain CoAP message, not an OSCORE message (image: packets). In the client log I see the message "Incoming response is NOT OSCORE protected but is expected to be!":

2023-07-28 16:59:15,874 UdpMatcher           [TRACE] received response ACK-4.01   MID=36245, Token=1C0DEB1355706ECA, OptionSet={"Content-Format":"text/plain", "Max-Age":0}, "Replay detected" from UDP(127.0.0.1:5683)  
2023-07-28 16:59:15,876 EndpointContextUtil  [TRACE] udp context receiving, PLAIN: "" == ""  
2023-07-28 16:59:15,876 ObjectSecurityLayer  [INFO] Incoming response is NOT OSCORE protected but is expected to be!  
2023-07-28 16:59:15,876 BlockwiseLayer       [DEBUG] [LWM2M Client-coap://0.0.0.0:0]  received error ACK-4.01   MID=36245, Token=1C0DEB1355706ECA, 

The question is: should the 4.01 response be a plain CoAP message or an OSCORE-protected message?
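For reference, a protected message can be recognised by the presence of the OSCORE option (option number 9 in RFC 8613). The sketch below shows the kind of check that would produce the log line above. It is an illustration with made-up names, not the actual ObjectSecurityLayer logic.

```java
import java.util.Set;

class OscoreResponseCheckSketch {

    static final int OSCORE_OPTION_NUMBER = 9; // OSCORE option, RFC 8613

    /**
     * A response to a protected request is only acceptable if it is protected too;
     * responses to unprotected requests are not checked in this sketch.
     */
    static boolean isAcceptable(boolean requestWasProtected, Set<Integer> responseOptionNumbers) {
        boolean responseIsProtected = responseOptionNumbers.contains(OSCORE_OPTION_NUMBER);
        return !requestWasProtected || responseIsProtected;
    }

    public static void main(String[] args) {
        // The 4.01 in the trace carries only Content-Format (12) and Max-Age (14).
        System.out.println(isAcceptable(true, Set.of(12, 14)));
        // A response carrying the OSCORE option would pass this check.
        System.out.println(isAcceptable(true, Set.of(9)));
    }
}
```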

JaroslawLegierski commented 1 year ago

Based on information from this and this example, I concluded that we may need two different implementations of the deriveContext method in the InMemoryOscoreContextDB class:

for server:

    private static OSCoreCtx deriveContext(OscoreParameters oscoreParameters) {
        try {
            OSCoreCtx osCoreCtx = new OSCoreCtx(oscoreParameters.getMasterSecret(), false,
                    oscoreParameters.getAeadAlgorithm(), oscoreParameters.getSenderId(),
                    oscoreParameters.getRecipientId(), oscoreParameters.getHmacAlgorithm(), 32,
                    oscoreParameters.getMasterSalt(), null, 4096);
            osCoreCtx.setContextRederivationEnabled(true);
            return osCoreCtx;
        } catch (OSException e) {
            LOG.error("Unable to derive context from {}", oscoreParameters, e);
            return null;
        }
    }

and for client:

    private static OSCoreCtx deriveContext(OscoreParameters oscoreParameters) {
        try {
            OSCoreCtx osCoreCtx = new OSCoreCtx(oscoreParameters.getMasterSecret(), true,
                    oscoreParameters.getAeadAlgorithm(), oscoreParameters.getSenderId(),
                    oscoreParameters.getRecipientId(), oscoreParameters.getHmacAlgorithm(), 32,
                    oscoreParameters.getMasterSalt(), null, 4096);
            osCoreCtx.setContextRederivationEnabled(true);
            osCoreCtx.setContextRederivationPhase(ContextRederivation.PHASE.CLIENT_INITIATE);
            return osCoreCtx;
        } catch (OSException e) {
            LOG.error("Unable to derive context from {}", oscoreParameters, e);
            return null;
        }
    }

Another problem is related to the DEREGISTRATION/REGISTRATION events in EventServlet on the server side. As a result of the context re-derivation procedure, the client is deregistered and registered again, and therefore the removeContext method in OscoreContextCleaner removes the context. To prevent this, I made the following modification to the removeContext method:

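    private void removeContext(byte[] rid) {
        OSCoreCtx context = oscoreCtxDB.getContext(rid);
        // JL test: keep contexts that have re-derivation enabled, so that the
        // deregistration/registration caused by the procedure does not wipe them.
        if (context != null && !context.getContextRederivationEnabled()) {
            oscoreCtxDB.removeContext(context);
        }
    }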

I don't know if checking the contextRederivationEnabled flag is the best solution, but currently I don't have a better idea.

The next problem concerns the use of IDContext. Leshan operations can internally use a null ID Context, also during the context re-derivation procedure. I set IDContext=null on both the client and the server side, and the procedure described in OSCORE Appendix B.2 finishes successfully.

This PoC is available here

I found in appendix-B.2 that IDContext is used:

Does it mean that if we assume IDContext=null internally in Leshan, we could lose some OSCORE functionality? :worried:

JaroslawLegierski commented 1 year ago

@sbernard31 Do you have any LwM2M device with OSCORE that you could recommend for testing? I'm looking for a device with a properly/fully implemented OSCORE that I could use for testing my changes in the Leshan code.

sbernard31 commented 1 year ago

No idea, I haven't played much with OSCORE. At first sight:

JaroslawLegierski commented 1 year ago

I prepared another PoC focused on implementing Appendix B.2. In the deriveContext method of the InMemoryOscoreContextDB class I'm using IDContext!=null.

Please find the full implementation here.

Unfortunately, for this implementation to work properly, the following modification is mandatory in the Californium OSCORE part (class ContextRederivation, line 340):

            if (contextID == null || (Arrays.equals(contextID, ctx.getIdContext()))
                    && !ctx.getContextRederivationEnabled()) { // <- my modification
                return ctx;
            }

@rikard-sics What do you think about adding such a modification to the OSCORE library in Cf?

JaroslawLegierski commented 1 year ago

I prepared another PoC implementing some "automation" of fallback detection in leshan-client-demo.

This PoC has been developed in 2 versions:

In this PoC leshan-client-demo is starting OSCORE Appendix B.2 procedure based on 4.01 (Unauthorized) response from the server.
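A possible shape for such a trigger is sketched below. Everything here is hypothetical (the class, the two-value Phase enum and the retry limit); it is not the PoC code. Because an unprotected 4.01 could in principle be forged, the sketch bounds the number of re-derivation attempts.

```java
class RederivationTriggerSketch {

    /** Hypothetical phases, loosely mirroring the CLIENT_INITIATE name used earlier. */
    enum Phase { INACTIVE, CLIENT_INITIATE }

    static final int UNAUTHORIZED = 401; // stands for CoAP 4.01 (Unauthorized)

    private final int maxAttempts;
    private int attempts = 0;
    private Phase phase = Phase.INACTIVE;

    RederivationTriggerSketch(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    /** Returns true if the caller should start re-derivation and resend the request. */
    boolean onResponse(int code, boolean requestWasOscore) {
        if (code == UNAUTHORIZED && requestWasOscore && attempts < maxAttempts) {
            attempts++;
            phase = Phase.CLIENT_INITIATE;
            return true;
        }
        return false;
    }

    /** Called once a protected exchange succeeds with the new context. */
    void onSuccess() {
        attempts = 0;
        phase = Phase.INACTIVE;
    }

    Phase phase() {
        return phase;
    }

    public static void main(String[] args) {
        RederivationTriggerSketch trigger = new RederivationTriggerSketch(1);
        System.out.println(trigger.onResponse(UNAUTHORIZED, true)); // first 4.01: start B.2
        System.out.println(trigger.onResponse(UNAUTHORIZED, true)); // limit reached: give up
    }
}
```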

b2_with_401

@rikard-sics What is your opinion about this way of initiating the context re-derivation?

rikard-sics commented 1 year ago

> @rikard-sics What is your opinion about this way of initiating the context re-derivation?

Let me take a look during the beginning of next week, and I can follow up in this thread.

JaroslawLegierski commented 1 year ago

Reading OSCORE Appendix B.2 carefully, I found this: "The server sends a 4.01 (Unauthorized) response protected with the second security context, containing R2 wrapped in a CBOR bstr as 'kid context', and caches R2."
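To place that sentence in the overall exchange, here is a small sketch of how the ID Context evolves across the three security contexts, as I read RFC 8613 Appendix B.2 (ID1, then R2 || ID1, then R2 || R3). The class name and the 8-byte length of the random values are choices made for this example only.

```java
import java.security.SecureRandom;
import java.util.Arrays;

class AppendixB2IdContextSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    static byte[] randomBytes(int length) {
        byte[] bytes = new byte[length];
        RANDOM.nextBytes(bytes);
        return bytes;
    }

    static byte[] concat(byte[] a, byte[] b) {
        byte[] result = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, result, a.length, b.length);
        return result;
    }

    /** First context: the client protects request #1 with ID Context = ID1. */
    static byte[] firstIdContext(byte[] id1) {
        return id1.clone();
    }

    /** Second context: the server protects the 4.01 response with ID Context = R2 || ID1. */
    static byte[] secondIdContext(byte[] r2, byte[] id1) {
        return concat(r2, id1);
    }

    /** Third context: the client protects request #2 with ID Context = R2 || R3. */
    static byte[] thirdIdContext(byte[] r2, byte[] r3) {
        return concat(r2, r3);
    }

    public static void main(String[] args) {
        byte[] id1 = randomBytes(8); // generated by the client
        byte[] r2 = randomBytes(8);  // generated by the server, sent as 'kid context' in the 4.01
        byte[] r3 = randomBytes(8);  // generated by the client

        System.out.println("request #1 ID Context length: " + firstIdContext(id1).length);
        System.out.println("4.01 ID Context length:       " + secondIdContext(r2, id1).length);
        System.out.println("request #2 ID Context length: " + thirdIdContext(r2, r3).length);
    }
}
```

Each of these ID Contexts feeds the key derivation, so the 4.01 response in the middle of the exchange is protected with keys the client can derive as soon as it reads R2 from the 'kid context'.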

> In this PoC leshan-client-demo is starting OSCORE Appendix B.2 procedure based on 4.01 (Unauthorized) response from the server.
>
> b2_with_401
>
> @rikard-sics What is your opinion about this way of initiating the context re-derivation?

@rikard-sics Does this mean that the 4.01 (Unauthorized) message mentioned above should also be OSCORE protected?