Closed mrodgers-witekio closed 3 months ago
Thanks for the detailed report. If you're able to work on this, it'd be great, otherwise just let me know.
- Use the MSG_TRUNC flag when receiving from the socket, so that we know the true length of the packet and whether it was truncated. Use this info when deciding if a block transfer is needed, not just the Block2 option in the response packet.
Detecting that the payload is incomplete would definitely be useful, the application should not receive incomplete payload w/o knowing. Although, perhaps we should combine this with the proposal in point 3, use early negotiation for GET requests, and if the reply doesn't fit the buffer, report an error. If the server ignored the suggestion, there's really not much that can be done with statically allocated buffers.
- Ensure that the block offset is only ever incremented by a multiple of the block size
Definitely.
- (Not necessary but might be helpful): always send a Block2 option in the first request packet, to ensure that the server doesn't respond with more data than the client can handle - ie. Figure 3: Block-Wise GET with Early Negotiation from https://datatracker.ietf.org/doc/html/rfc7959#section-3.1
See 1, IMO it's a good idea to use this mechanism given the buffer size on our side is limited. I hope this is well supported in server implementations.
@SeppoTakalo @pdgendt FYI in case you have an opinion on this.
OK, I should have some time this week to put something together
I think the CoAP client should be fixed so that it rejects any incoming data that would be truncated.
On most cases, the server do not add any Etag option for the data, so we don't exactly know that the new request with modified Block2 options would receive the same payload. So safest option is that client do not silently start a new Block-Wise request out of normal GET request.
But instead, we should properly document how to start the GET with Block2 option already in the first request. I think that should be the application developers responsibility to add that option in the query. However, I'm unsure if this is even working correctly now, I just checked that the struct coap_client_request
has option fields.
Another fix should be that we inform the callback the we received too much data and rejected the request. Maybe we should give the "4.13 Request Entity Too Large" code to callback, even though we don't send anything to server (or can we respond Ack with RESET?).
On a quick look, coap_client.c seem to add all extra options to subsequent block requests. This means that if application starts a Block-Wise GET, on a following requests there will be duplicate Block2 options.
So there is a bug that prevents application requesting Block-Wise.
But instead, we should properly document how to start the GET with Block2 option already in the first request. I think that should be the application developers responsibility to add that option in the query. However, I'm unsure if this is even working correctly now, I just checked that the struct coap_client_request has option fields.
I actually tried this at the application level and it works for the first request. However any subsequent requests then have both:
coap_client
, with the correct value of NUM and potentially a different SZX to the first Block2 option (e.g. in case the server responded with a smaller block size than the application requested).To me it seems cleaner to add the Block2 option to the first request inside the coap_client
instead of parse the application-added options for an existing Block2 option and then modify it. But I'd be open to doing it either way.
To me it seems cleaner to add the Block2 option to the first request inside the
coap_client
instead of parse the application-added options for an existing Block2 option and then modify it. But I'd be open to doing it either way.
I was thinking that only the application developer has the knowledge that this exact resource would lead to bigger responses. Therefore I would push the responsibility to application. If coap_client would add it, then it would be an extra option on all of the requests. That might cause issues.
But the coap_client should be fixed so that if user submitted options have same code, than what the coap_client already used, then it should be dropped. So that Block2 would only be added in the first request by the user, but subsequent requests would use the one calculated from server provided values.
If the server ignored the suggestion,
Then it violates a MUST RFC 7959, 2.4. Using the Block2 Option
To influence the block size used in a response, the requester MAY
also use the Block2 Option on the initial request, giving the desired
size, a block number of zero and an M bit of zero. A server MUST use
the block size indicated or a smaller size.
The good point on having not that many implementations is, that it's also possible to check them ;-).
AFAIK:
Not sure about C# and Rust.
Anyway, I'm not sure, how many applications, which are using two buffers
uint8_t send_buf[MAX_COAP_MSG_LEN];
uint8_t recv_buf[MAX_COAP_MSG_LEN];
will be really that constraint, the MAX_COAP_MSG_LEN is smaller than 1280.
I don't use the coap_client, I use the coap message functions only and handle the sending and receiving on my app. There it's not too hard to use one, but large, receive buffer.
Describe the bug When making requests using
coap_client_req
, responses longer thanCONFIG_COAP_CLIENT_MESSAGE_SIZE
can be truncated, or result in incorrect handling of the subsequent block transfer (depending on the total length of the resource).This issue can manifest itself in three different ways (that I can see), but all are related:
1) Response is longer than
CONFIG_COAP_CLIENT_MESSAGE_SIZE
but shorter thanMAX_COAP_MSG_LEN
(because the CoAP header is shorter than the maximum value).In this case no truncation happens, the whole payload is delivered to the application callback. This is OK although perhaps slightly counter-intuitive: the application developer might have assumed that the callback can't deliver a message payload longer than
CONFIG_COAP_CLIENT_MESSAGE_SIZE
and could have sized downstream buffers etc. accordingly.For example with
CONFIG_COAP_CLIENT_MESSAGE_SIZE=256
and a resource length of 270 bytes:2) Response is longer than
CONFIG_COAP_CLIENT_MESSAGE_SIZE
, longer thanMAX_COAP_MSG_LEN
, but from the server's point of view still fits in a single block (because the server defaults to a larger block size than the client).In this case the application receives a single callback with a truncated response, and no indication that the response is truncated. The CoAP client doesn't send any further block requests to retrieve the rest of the data.
For example with
CONFIG_COAP_CLIENT_MESSAGE_SIZE=256
and a resource length of 512 bytes:But in wireshark, we can see that the server sent a payload of 512 bytes:
3) Response is longer than
CONFIG_COAP_CLIENT_MESSAGE_SIZE
, longer thanMAX_COAP_MSG_LEN
, and longer than the server's default block size.In this case, the CoAP client receives the first response, parses the Block2 option sent by the server, and sends requests for the remaining blocks that it is missing. But because the first packet was longer than
CONFIG_COAP_CLIENT_MESSAGE_SIZE
, the offsets passed to the application callback for all subsequent blocks are wrong - they are not a multiple of the block size. So it appears to the application as though more data has been received than the actual size of the resource (since some data has been received twice).For example with
CONFIG_COAP_CLIENT_MESSAGE_SIZE=256
and a resource length of 2048 bytes:and in Wireshark:
To Reproduce Check out the PR: https://github.com/zephyrproject-rtos/zephyr/pull/76069
Build and run using the instructions in the
README.rst
, usingnative_sim
is easiest.Change the size of the resource sent by the server to see each of the different issues observed above.
Expected behavior Any data that can't fit in the client's receive buffer should be retrieved by subsequent block transfer requests. The offset of each block passed to the application should be correct (ie. a multiple of the block size).
Impact The application can't reliably receive CoAP data of any arbitrary length, unless the server's default block size happens to be less than or equal to the client's max message size.
Environment (please complete the following information):
main
at the time of writing (b8ae3a416b9
).Additional context
I'd be happy to help with a fix for this if that's useful, I can see a few changes that might be needed but could do with some guidance on the best way to go:
1) Use the
MSG_TRUNC
flag when receiving from the socket, so that we know the true length of the packet and whether it was truncated. Use this info when deciding if a block transfer is needed, not just the Block2 option in the response packet. 2) Ensure that the block offset is only ever incremented by a multiple of the block size 3) (Not necessary but might be helpful): always send a Block2 option in the first request packet, to ensure that the server doesn't respond with more data than the client can handle - ie.Figure 3: Block-Wise GET with Early Negotiation
from https://datatracker.ietf.org/doc/html/rfc7959#section-3.1