RIOT-OS / RIOT

RIOT - The friendly OS for IoT
https://riot-os.org
GNU Lesser General Public License v2.1
4.89k stars 1.98k forks source link

Missing drop implementations in netdev_driver_t::recv #10410

Open maribu opened 5 years ago

maribu commented 5 years ago

Description

Some implementations of netdev_driver_t::recv(netdev_t *dev, void *buf, size_t len, void *info) do not implement the drop case ((buf == NULL) && (len != 0)). This issue should track the effort to provide the missing implementations.

In case the upper layer used for the driver never uses the drop feature of recv, an implementation of the drop feature is not required, if instead corresponding assert()s are added and this is documented properly.

Tracking

Driver State Pull Request
ata8520e Not affected (*) -
at86rf2xx Fixed, backport in progress https://github.com/RIOT-OS/RIOT/pull/10285
cc110x Rewrite contains fix https://github.com/RIOT-OS/RIOT/pull/10340
cc2420 Fix merged https://github.com/RIOT-OS/RIOT/pull/10416
cc2538_rf Not affected
enc28j60 Fix merged https://github.com/RIOT-OS/RIOT/pull/9806
encx24j600 Fix merged https://github.com/RIOT-OS/RIOT/pull/10415
esp32/esp-eth Not affected -
esp32/esp-now Not affected -
esp32/esp-wifi Not affected -
esp8266 Not affected -
ethos Not affected -
kw2xrf Fixed 4ebcd7c055960a12f96f23f9e444684f955a6ca8
kw41zrf (#7107) Not affected -
mrf24j40 Fixed, PR merged https://github.com/RIOT-OS/RIOT/pull/9562
netdev_tun Not affected
nrfble asserts present, doc requried
nrfmin Not affected -
rn2xx3 Not affected (*) -
slipdev Not affected
sx127x AFFECTED
socket_zep Fixed fa2d9bde56e253ec8167d3a4c8d7ddc7c30772b5
w5100 Fix merged https://github.com/RIOT-OS/RIOT/pull/10412
xbee Not affected -

(*) Driver does not use the netdev_driver_t API, so it cannot be affected.


Update 1: Fixed typo Update 2: Updated state of mrf24j40 Update 3: Added kw41zrf Update 4: Cleaned up legend Update 5: Added link to PR for w5100 fix Update 6: Added link to PR for encx24j600, updated state of encx24j600 & w5100 Update 7: Added link to PR for cc2420 and updated its state Update 8: Added esp32 netdev_driver_t implementations Update 9: Added esp8266, confirmed that ata8520e and rn2xx3 are not using the netdev_driver_t API (yet) and are, thus, unaffected Update 10: Corrected state for at86rf2xx Update 11: enc28j60 and cc2420 are now fixed Update 12: w5100 is now fixed Update 13: encx24j600 fix merged Update 14: Added cc2538_rf and netdev_tun Update 15: After looking more closely into cc2538_rf, the error handling in the drop case with pkt_len > CC2538_RF_MAX_DATA_LEN is just fine. So cc2538_rf is not affected

maribu commented 5 years ago

@aabadie: Can you confirm that the ata8520e is not affected? @bergzand : Can you confirm that the mrf24j40 is not affected? @aabadie: Can you confirm that the rn2xx3 is not affected?

Thanks :-)

bergzand commented 5 years ago

@maribu Thanks for picking this up.

@bergzand : Can you confirm that the mrf24j40 is not affected?

Should have been fixed with #9562

maribu commented 5 years ago

Wow, that was a fast response time :-) Thanks, I updated the table accordingly

jnohlgard commented 5 years ago

@maribu thanks for taking this on. I checked the proposed kw41zrf driver in #7107 and it does implement drop and size correctly, dropping the frame when buf == NULL, len != 0

A related API question, should the buffered frame be dropped when recv is called with buf != NULL and len != 0, but len < frame_len? The alternative to dropping would be to keep the buffered frame until the caller provides a big enough buffer, or explicitly drops it using buf=NULL.

bergzand commented 5 years ago

A related API question, should the buffered frame be dropped when recv is called with buf != NULL and len != 0, but len < frame_len? The alternative to dropping would be to keep the buffered frame until the caller provides a big enough buffer, or explicitly drops it using buf=NULL.

Good question. I have a few ideas that could benefit from being able to first read part of the frame before reading the full frame. For example a bit of netdev that first reads the link layer addresses to check whether the device wants the frame or that it should be dropped. No need to spend time pulling in the full frame from the radio then.

bergzand commented 5 years ago

@maribu There is also #10285 which claims to address this issue for the at86rf2xx

maribu commented 5 years ago

The alternative to dropping would be to keep the buffered frame until the caller provides a big enough buffer, or explicitly drops it using buf=NULL.

This would allow the most flexibility, but upper layer code must be prepared to explicitly drop if it does not intend to retry with a bigger buffer. But as drivers can often share upper layer code, I would argue to let the upper layer handle this. Moving complexity to more common used code seems to be a better choice regarding maintainability.

Also the documentation on this could be improved quite a bit:

maribu commented 5 years ago

@bergzand

There is also #10285 which claims to address this issue for the at86rf2xx

I think this is a similar issue, but the drop feature seems to be not missing in the current master:

https://github.com/RIOT-OS/RIOT/blob/836fe3dbbaba8b65bf23110c8b56cbaf1558c022/drivers/at86rf2xx/at86rf2xx_netdev.c#L152-L157

Update: This is a backport of the fix of this issue. So in current master it is fixed. I will add the PR to the tracking

maribu commented 5 years ago

@gebart: Thanks for raising the question on how recv() should behave when the provided buffer is too small. E.g. w5100 and cc2420 handle this differently than the API documentation states: The silently truncate the incoming frame and do not return any error to the upper layer, but the API states that -ENOBUFS has to be return. I think silently truncating is not a good idea, as the upper layers will have no chance to detect this issue. Let me open another issue for that to first discuss how that case should be handled correctly and then to track the conversion.

maribu commented 5 years ago

@gebart: I created https://github.com/RIOT-OS/RIOT/issues/10413 for that

maribu commented 5 years ago

@PeterKietzmann, @jfischer-phytec-iot, @gebart: Regarding the kw2xrf: After taking a closer look, I think it is not affected. The driver does not seem to manually drop the incoming frame in any case (even when retrieving the frame successfully), so I guess just not retrieving the frame implicitly drops it. Is this correct?

jnohlgard commented 5 years ago

If the kw2xdrf radio works like the kw41zrf radio, then the buffer may be overwritten whenever the transceiver is switched back into R sequence, which could happen after reading the frame length depending on the driver implementation. So there may be a race condition bug waiting to happen where reading the size removes the frame buffer write protection, and a new frame is received before the actual read out of the frame buffer has been done in the second call to recv. kw41zrf solves this by switching to standby after a completed RX until the frame is read out or dropped, but there is also a PB_PROTECT bit which can be used to write protect the RX buffer which I did not know about until recently. I have not explored this protection bit in #7107 however. Edit: Fixed some factual mistakes in the description of the receive procedure

bergzand commented 5 years ago

@maribu There are some ESP{32,8266} module that also implement netdev (@gschorcht ping :) ), not sure what the status there is though.

maribu commented 5 years ago

@aabadie: You seem to have worked a lot on the sx127x driver. To me it looks like it is not affected by the bug, as the buffer is implicitly dropped. Is this correct?

maribu commented 5 years ago

@maribu There are some ESP{32,8266} module that also implement netdev (@gschorcht ping :) ), not sure what the status there is though.

The ESP32 is not affected (checked for esp-eth, esp-now and esp-wifi). I didn't checked on the ESP8266 yet.

gschorcht commented 5 years ago

@bergzand ESP8266 and ESP32 netdev drivers drop packets according to interface defitinition. When I started to implement them, I read the API documentation for netdev_driver::recv very carefully :wink:

bergzand commented 5 years ago

@bergzand ESP8266 and ESP32 netdev drivers drop packets according to interface defitinition. When I started to implement them, I read the API documentation for netdev_driver::recv very carefully wink

Awesome, good to hear that. :)

aabadie commented 5 years ago

Can you confirm that the ata8520e is not affected? Can you confirm that the rn2xx3 is not affected?

Since there is no netdev adaption (yet) for these drivers, I don't thinks they are affected.

You seem to have worked a lot on the sx127x driver. To me it looks like it is not affected by the bug, as the buffer is implicitly dropped. Is this correct?

I'm not sure. It might be possible to drop a packet when buf == NULL and len > 0.

PeterKietzmann commented 5 years ago

As discussed offline I assigned @jia200x here.

jia200x commented 5 years ago

I guess this should stay open? @miri64

miri64 commented 5 years ago

Yepp. This was an automatic close due to #10412 being merged, which contained https://github.com/RIOT-OS/RIOT/commit/ea3edef5c7f6704dcf1ca270aca87ccd503024f6 with the magic word ;-).

maribu commented 5 years ago

@PeterKietzmann, @jfischer-phytec-iot, @gebart: Regarding the kw2xrf: After taking a closer look, I think it is not affected. The driver does not seem to manually drop the incoming frame in any case (even when retrieving the frame successfully), so I guess just not retrieving the frame implicitly drops it. Is this correct?

@PeterKietzmann, @jfischer-phytec-iot, @gebart: Could one of you confirm it is unaffected?

miri64 commented 5 years ago

@maribu can this be left close now?

kaspar030 commented 5 years ago

kaspar030 closed this in [727b4ca](/RIOT-OS/RIOT/commit/727b4cac1c850ccd06d825e3c50dfa18a81c33c7) 4 minutes ago

Ups, this was a too-eager autoclose. I'll just update the list.

miri64 commented 5 years ago

@maribu, while I was working on the tracking list for #11163: I think cc2538_rf and netdev_tap are missing in your list. Is there a reason for that

maribu commented 5 years ago

No. I just got confused with some network drivers being in drivers/ and others in cpu/ :-(

maribu commented 5 years ago

Both are not affected

miri64 commented 5 years ago

No. I just got confused with some network drivers being in drivers/ and others in cpu/ :-(

The distinction is simple: in drivers are devices that can be on a board, the one in the one in cpu are specific to that MCU.

maribu commented 5 years ago

Oh, I totally forgot about this issue :-( Did someone by chance fix the remaining drivers?

maribu commented 4 years ago

The kw2xrf driver seems not effect: To me it seems that no command to flash the FIFO is ever triggered during RX - so apparently the device will just starting to write to the head of the FIFO again when receiving the next frame and no flushing is needed. Thus, the driver is not effected.

However, the driver goes back to RX the very moment the frame is completely received, regardless on whether the netdev_driver_t::recv() is ever called. I guess it would be possible that when netdev_driver_t::recv() is not fetching the frame soon enough, part of the buffer could already be overwritten with the next frame. That is certainly not ideal either. Maybe @smlng can confirm / correct my observations from a brief look at the code?

miri64 commented 4 years ago

Maybe @smlng can confirm / correct my observations from a brief look at the code?

@PeterKietzmann @MrKevinWeiss or @leandrolanzieri can you do that, please. Also, what the state of the other drivers marked as AFFECTED in OP?