zephyriot / zephyr-issues

0 stars 0 forks source link

IPv6 over BLE no longer works after commit 2e9fd88 #1511

Closed nashif closed 7 years ago

nashif commented 7 years ago

Reported by James Prestwood:

Running using Zephyr's IPv6 over BLE seems to have been broken recently. I have tested with iotivity-constrained and do see some data but things are not working correctly. I did a git bisect in Zephyr and was led to this commit: {code} commit 2e9fd888bf49e535a5442c1475868f2bf05c5e59 Author: Luiz Augusto von Dentz luiz.von.dentz@intel.com Date: Wed Jan 25 14:36:49 2017 +0200

net: bt: Fix not checking for valid ll addresses

ll addresses need to be set properly before sending as the stack is not
checking if they are NULL.

Change-Id: Ia4e96240f18b53b0e32e21649a8b571c94260731
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>

{code}

Everything before this commit works as expected. After looking at the commit it seems to touch IPv6/BLE code that does some checks on network interfaces. If this is something that changed the way IPv6 over BLE needs to be initialized (like choosing a correct iface) let me know and this can be disregarded, though I did not see any changes in the samples.

(Imported from Jira ZEP-1656)

nashif commented 7 years ago

by Luiz Augusto von Dentz:

Well I can add the patches that have been posted for the Linux kernel fixing the address length for Bluetooth, but Im afraid we wont return to send empty ll addresses as before because that just worked by chance.

Btw, before anyone jumps here saying that Linux should be used as a reference implementation in this case it is not, 6LoWPAN over Bluetooth only works with debugfs for a reason and that is because it is unstable. That said we are working to fix these problem but that might involve bumping the reference Linux kernel when that is complete.

nashif commented 7 years ago

by Sakari Poussa:

Are you saying, that wo/ this patch the src and dst ll addresses are empty in IP over BLE case. And just by good luck the Linux peer accepted them and data flows? And Linux kernel has/will have proper checks in place sometime in the future? And this patch is the preparation for that future scenario?

If this is the case, this patch should be but behind some #ifdef so developers can select which Linux kernel they run Zephyr against. It will take months/years before the Linux kernel patch lands on all boards that Zephyr is trying to inter-op with using BLE/IP/6LoWPAN.

nashif commented 7 years ago

by Luiz Augusto von Dentz:

dst ll address will be empty, there is actually code that reset it to NULL to force nbr cache lookup, and yes there could be possible changes affecting how the routing is done, for example the Linux code is using IFF_POINTOPOINT which may explain why it works with no ll address.

While there is no stable solution, one that doesn't use debugfs, we can pretty much consider this experimental which means things can change, actually they might specially in terms of userspace. With this is mind I don't think putting this behind #ifdef is a good idea since we may end up with several versions before getting anything stable. Also the RFC 7668 (https://tools.ietf.org/html/rfc7668) is actually quite recent and there is basically only Zephyr and Linux implementions so far, in fact it is currently possible to crash the Linux kernel using flooding ping.

What we should definitely do is track what patches are required until it becomes stable and perhaps add a Kconfig to disable nbr cache or the entire ndisc, but probably only works with pointopoint links.

nashif commented 7 years ago

by Sakari Poussa:

What I still don't understand is, that why we are putting this patch in now? What is the problem or bug it is solving? Instead, it is breaking useful feature of running IPv6 over BLE against boards like Joule, MinnowBoard and may others. It is quite hard or time consuming to patch those boards kernels.

What is the proposal to get back to a working and interoperable state between Zephyr and Linux (4.x) kernel? I don't think it is a good practice to put in (unconditional) code which breaks interoperability but complies to the latest (and very recent) spec which nobody (or very few) implementations have picked up.

nashif commented 7 years ago

by Luiz Augusto von Dentz:

There was a crash reported due to lack of ll dst address, which was fixed by introducing net_ipv6_prepare_for_send, and the code as actually changed since then:

https://gerrit.zephyrproject.org/r/#/c/10679/

As I mentioned we may introduce a Kconfig option to disable ndisc or parts of it, like we used to have with contiki, in fact contiki stack had the exact same problem with Linux and unfortunately we did not have time to fix Linux at that time so I hope we don't end up working around it again.

Btw, ndisc is quite a fundamental part of ipv6, it is like arp to ipv4, that is probably why we don't have a Kconfig for disabling in the IP stack anymore so any claims of iop in this regard are probably wrong, so the Kconfig will probably need to be for internal use of Bluetooth.

nashif commented 7 years ago

by Luiz Augusto von Dentz:

Btw, using multi-cast address would completely bypass the problem.

nashif commented 7 years ago

by Patrik Flykt:

Neither Zephyr nor Linux properly implements RFC 7668 right now and both need to be fixed. Starting with setting the length and the contents of the MAC address correct and ending with properly handling Neighbor discovery messages and 6lowpan message compression.

nashif commented 7 years ago

by Luiz Augusto von Dentz:

Patches have been merged.

nashif commented 7 years ago

by Paul Sokolovsky:

Btw, before anyone jumps here saying that Linux should be used as a reference implementation

It seems that nobody took the offer, so I do: https://github.com/zephyrproject-rtos/zephyr/pull/1151 . It doesn't really make much sense to say "there is basically only Zephyr and Linux implementions so far", and purposedly make Zephyr implementation incompatible with Linux, there's only 2 of them, if they don't talk to each other, nothing works at all.