zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.66k stars 6.53k forks source link

BT: Host: Notification dropped instead of truncated if bigger than ATT_MTU-3 #23134

Closed aescolar closed 4 years ago

aescolar commented 4 years ago

From slack:

François Hi all, I noticed that gatt_notify can fail with -ENOMEM if the size of the notification is greater than ATT_MTU-3. The core spec seems to indicate that we should still send the first ATT_MTU-3 bytes of the notification, and that the master should then issue a read to get the rest. Is that something that we should fix?
Core spec 5.1: Vol 3 part F section 3.4.7.1 states for ATT notifications: "If the attribute value is longer than (ATT_MTU-3) octets, then only the first (ATT_MTU-3) octets of this attributes value can be sent in a notification. Note: For a client to get a long attribute, it would have to use the Read Blob Request following the receipt of this notification."
However in gatt.c:gatt_notify we just fail with "No buffer available to send notification".
Joakim Andersson François Sounds like you are right.
15:07
François Are we still in time to fix this in the release if I send a PR tomorrow?
15:08
Johan Hedberg @François yes. I’m targeting an rc3 later this week (assuming all know qualification issues are sorted out) and the final release some time next week
15:08
François Okay, I will send a PR tomorrow, it should be a very small fix. Thanks!!
15:09
aescolar commented 4 years ago

CC @jhedberg @joerchan (better a lousy bug report than none :) )

jhedberg commented 4 years ago

Despite what the spec says, silent truncation of data (which in most cases will result in corrupt data) doesn't sound like a smart idea. We could make this "non-silent" on our side by changing the bt_gatt_notify() return value from int to ssize_t and return the number of bytes sent was the return value, instead of 0. That's however an API break, i.e. not something we can do this close to release. In some ways even this change from error return to success return is an API break, although it's unlikely there's code relying on an error return to e.g. trigger MTU exchange.

My current feeling would be to give this some time so we can discuss the options and look for some solution for 2.3. To my understanding this isn't breaking any qualification test case so I wouldn't consider it that critical.

jhedberg commented 4 years ago

Based on some further discussion on Slack, I can understand the opposing argument as well, i.e. keep the GATT server simple and leave it to the client to check MTU and use another procedure to get more data. So please don't stop the creation of a PR just because of my earlier comment :)

Vudentz commented 4 years ago

Id leave the implementation as it is, Id never seem any profile spec that does require reading after notifying since that is racy, in fact that could be a bug in the upper layer that don't check the MTU or haven't triggered a MTU exchange.

jhedberg commented 4 years ago

Since there is some contention on how to proceed with this, I don't think it's appropriate to consider it a medium priority bug that we should try to "fix" for 2.2. I'd rather us spend some more time thinking how to best fix this, if a fix is needed to begin with. From an API perspective, even if we choose to do truncation, it should be possible for the application to notice from the bt_gatt_notify() call that truncation happened, i.e. this should be indicated by the return value.

jhedberg commented 4 years ago

Changed to low priority because:

fnde-ot commented 4 years ago

Note that this issue applies to both notifications and indications.

@Vudentz I think Zephyr should handle this to support the use-case of bt_gatt_notify with conn=NULL, which sends the notification to all connected devices that are subscribed, as each of those devices could have a different MTU.

Given the above use-case, instead of changing the return value of bt_gatt_notify, we should probably indicate the truncation or number of bytes sent in one or more of the callbacks (cfg_match and/or gatt_complete_func) so the application is aware of it. The cfg_match callback would be a nice one for this as it would allow the application to reject the truncated notification and try again after negotiating MTU.

In any case, I agree that this needs further thoughts before a PR, and such an API change can not be done at this late stage.

joerchan commented 4 years ago

Given the bt_gatt_notify_multiple PR by @Vudentz I don't think that having it as a return value is a solution.

Maybe we should just have a kconfig option to enable the behavior?

fnde-ot commented 4 years ago

I would agree with a Kconfig to enable truncation. Should we also make an API change so that the number of bytes sent is sent to the cfg_match callback?

joerchan commented 4 years ago

I would agree with a Kconfig to enable truncation. Should we also make an API change so that the number of bytes sent is sent to the cfg_match callback?

I don't think we should do that. The number of bytes sent could be inferred by using bt_gatt_get_mtu in the cfg_match callback.