RIOT-OS / RIOT

RIOT - The friendly OS for IoT
https://riot-os.org
GNU Lesser General Public License v2.1

ethos: fails to respond to first message. #11988

Open fjmolinas opened 5 years ago

fjmolinas commented 5 years ago

Description

While testing #11818 I noticed that, after a reset, the first reply that ethos is supposed to send back to the sender is lost. This doesn't happen when using a wireless interface, so it is unrelated to gnrc.

The easiest way to see this is with a simple ping, but any message from the host requiring an ACK will fail.

As far as I could see, the reply was being passed all the way down to ethos and _send was being called, but no response was received by the host.

Steps to reproduce the issue

sudo ./dist/tools/ethos/setup_network.sh riot0 2001:db8::/64

patch

```diff
diff --git a/examples/gnrc_networking/Makefile b/examples/gnrc_networking/Makefile
index 25c2dc858..0ac1b1a08 100644
--- a/examples/gnrc_networking/Makefile
+++ b/examples/gnrc_networking/Makefile
@@ -62,6 +62,29 @@ DEVELHELP ?= 1
 # Change this to 0 show compiler invocation lines by default:
 QUIET ?= 1
 
+ifeq (1,$(USE_ETHOS))
+  GNRC_NETIF_NUMOF := 2
+  USEMODULE += stdio_ethos
+  USEMODULE += gnrc_uhcpc
+
+  # ethos baudrate can be configured from make command
+  ETHOS_BAUDRATE ?= 115200
+  CFLAGS += -DETHOS_BAUDRATE=$(ETHOS_BAUDRATE)
+
+  # make sure ethos and uhcpd are built
+  TERMDEPS += host-tools
+
+  # For local testing, run
+  #
+  #     $ cd dist/tools/ethos; sudo ./setup_network.sh riot0 2001:db8::0/64
+  #
+  #... in another shell and keep it running.
+  export TAP ?= riot0
+  TERMPROG = $(RIOTTOOLS)/ethos/ethos
+  TERMFLAGS = $(TAP) $(PORT)
+endif
+
+
 include $(RIOTBASE)/Makefile.include
 
 # Set a custom channel if needed
@@ -77,3 +100,8 @@ else
     CFLAGS += -DIEEE802154_DEFAULT_CHANNEL=$(DEFAULT_CHANNEL)
   endif
 endif
+
+.PHONY: host-tools
+
+host-tools:
+	$(Q)env -u CC -u CFLAGS make -C $(RIOTTOOLS)
```

USE_ETHOS=1 make -C examples/gnrc_networking BOARD=samr21-xpro flash -j3 term

ping6 fe80::2%riot0 -c 1 -W 10
PING fe80::2%riot0(fe80::2%riot0) 56 data bytes

--- fe80::2%riot0 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

This is not a matter of timeouts, since I used a 10 s timeout. A second ping then succeeds:

PING fe80::2%riot0(fe80::2%riot0) 56 data bytes
64 bytes from fe80::2%riot0: icmp_seq=1 ttl=64 time=21.9 ms

--- fe80::2%riot0 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 21.994/21.994/21.994/0.000 ms
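
The first-ping/second-ping comparison above can be scripted. The following is only a minimal sketch: the helper names `received_count` and `ping_once` and the `TARGET` and `RUN_PINGS` variables are made up for illustration, and it assumes the `ping6` summary line has the same format as in the output shown in this report.

```shell
#!/bin/sh
# Hypothetical helper: read ping output on stdin and print the number of
# replies from the "N packets transmitted, M received" summary line.
received_count() {
    sed -n 's/.* transmitted, \([0-9][0-9]*\) received.*/\1/p'
}

# Hypothetical helper: send one echo request with the same options as in
# this report (-c 1 -W 10) and print how many replies came back.
ping_once() {
    ping6 "${TARGET:-fe80::2%riot0}" -c 1 -W 10 | received_count
}

# Only talk to the node when explicitly asked, so the helpers can be
# sourced on their own.
if [ "${RUN_PINGS:-0}" = 1 ]; then
    echo "first ping:  $(ping_once) received"
    echo "second ping: $(ping_once) received"
fi
```

Running it with `RUN_PINGS=1` right after resetting the board should print `0 received` for the first ping and `1 received` for the second if the problem is present.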

Expected results

The node should answer the first ping.

Actual results

The node fails to answer the first ping.

kaspar030 commented 5 years ago

I can reproduce this.

miri64 commented 5 years ago

Could #12264 be related?

fjmolinas commented 5 years ago

> Could #12264 be related?

I don't think so. The packet is correctly received, but the node seems to think the destination is unreachable when it tries to send out the first reply:

ping6 fe80::2%riot0 -c 1 -W 3
PING fe80::2%riot0(fe80::2%riot0) 56 data bytes

--- fe80::2%riot0 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

Debug activated in gnrc_icmpv6_debug and gnrc_icmpv6_error

> icmpv6_echo: Building echo message with type=huid=129, seq=7726, payload:
00000000  ED  53  8B  5D  00  00  00  00  84  E8  08  00  00  00  00  00
00000010    11  12  13  14  15  16  17  18  19  1A  1B  1C  1D    1F
00000020  20  21  22  23  24  25  26  27  28  29  2A  2B  2C  2D  2E  2F
00000030  30  31  32  33  34  35  36  37

gnrc_icmpv6_error: copying whole packet
gnrc_icmpv6_error: trying to send destination unreachable error

miri64 commented 5 years ago

Well... there are also the more standards-compliant https://github.com/RIOT-OS/RIOT/pull/10477 and https://github.com/RIOT-OS/RIOT/pull/10480 :stuck_out_tongue_winking_eye:

miri64 commented 3 years ago

This seems to be related to what I am fixing in #16947. Could you try setting CONFIG_GNRC_IPV6_NIB_QUEUE_PKT=1? As the patch you describe in the OP technically does not create a proper border router, this would need to be set manually (similar to how SLAAC needs to be activated in some instances) and is not fixed by #16947. But I believe the problem case described in #16947 is the same.
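
For the reproduction in the OP, one way to set that option might look like the sketch below. It is untested here and assumes that the RIOT build system appends to a `CFLAGS` value passed in from the environment, so that the define reaches the compiler.

```shell
# Assumption: CFLAGS from the environment is kept and extended by the RIOT
# Makefiles, so the define below is seen when the NIB is compiled.
CFLAGS=-DCONFIG_GNRC_IPV6_NIB_QUEUE_PKT=1 USE_ETHOS=1 \
    make -C examples/gnrc_networking BOARD=samr21-xpro flash -j3 term
```

Adding an equivalent `CFLAGS += -DCONFIG_GNRC_IPV6_NIB_QUEUE_PKT=1` line inside the `ifeq (1,$(USE_ETHOS))` block of the patch should have the same effect.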