RIOT-OS / Release-Specs

Specification for RIOT releases and corresponding test configurations

Release 2018.04 - RC2 #62

Closed kaspar030 closed 6 years ago

kaspar030 commented 6 years ago

This issue lists the status of all tests for the Release Candidate 2 of the 2018.04 release.

Specs tested:

kaspar030 commented 6 years ago

FYI, nightly build results are here.

The changes to -RC1 are quite minimal (arduino and gcoap related only). What do you think, can we re-use the tests that were done (mostly by @aabadie) in #61 for -RC1? How did we handle that for the last releases?

cladmi commented 6 years ago

I will try to run the simple tests on IoT-LAB. Both M3 and wsn430. I will see how it goes.

aabadie commented 6 years ago

@kaspar030, HIL reports are still pointing to a non existing page, see here for example.

cladmi commented 6 years ago

I am running all tests that have TEST_ON_CI_WHITELIST set.

The most important problem for the moment is that unittests fail for iotlab-m3 and iotlab-a8-m3: in tests-qDSA I get a hardfault in the keypair function.

wsn430 has errors too: some tests to fix, and also other broken things. I will do a more detailed report.

kaspar030 commented 6 years ago

unittests fail for iotlab-m3 and iotlab-a8-m3: in tests-qDSA I get a hardfault in the keypair function.

Can you try removing the LARGE_STACK_TESTS avr condition (so tests-qDSA is always "large stack")?

cladmi commented 6 years ago

I was testing them with all tests included, so tests-qDSA should already have been in LARGE_STACK_TESTS. It is still broken with the following change:

-ifneq (, $(filter $(AVR_BOARDS), $(BOARD)))
-  LARGE_STACK_TESTS += tests-qDSA
-endif
+LARGE_STACK_TESTS += tests-qDSA

I will do the test report, a few fix PRs and I can start investigating.

cladmi commented 6 years ago

I ran the tests in the tests/ directory that are supported, have enough memory, and have the 'TEST_ON_CI_WHITELIST' tag. I ran them on wsn430-v1_4, iotlab-m3, and iotlab-a8-m3. I was not yet using @kYc0o's PR that adds TEST_ON_CI_WHITELIST to other tests. Testing more cases could be a next step.

Tests fixed on wsn430:

Tests broken on iotlab-m3 and iotlab-a8-m3

Tests I have trouble running because the interactive test starts before the board is ready

TODO: make the start condition more robust

Tests not working on board

I think I should simply blacklist the wsn430 boards here

Board gets stuck during test

kaspar030 commented 6 years ago

wsn430

That doesn't look very good. I guess msp430 needs quite some work overall...

cladmi commented 6 years ago

I put the PRs for wsn430 in the release milestone https://github.com/RIOT-OS/RIOT/milestone/22

kaspar030 commented 6 years ago

I'm starting with the 05- tests on native now.

kYc0o commented 6 years ago

I'm on 6.

kYc0o commented 6 years ago

I'm on 7. For now the last three tasks (2, 3 and 4)

kaspar030 commented 6 years ago

05- on native had some problems, see #63 and https://github.com/RIOT-OS/RIOT/issues/9061.

kYc0o commented 6 years ago

I'm struggling a lot with the RPL tests. I don't have a real clue what might be wrong, but then, should we stop the release until the problem is found?

kYc0o commented 6 years ago

For info, it seems broken since the last release. I'm having good results with 2017.10 with the same set of nodes and the same tests. The RPL graph also seems to be the same, but I'm not 100% sure.

kaspar030 commented 6 years ago

I'm struggling a lot with the RPL tests.

Can you elaborate what does not work? Maybe you're hitting a similar problem as in 05- on native. Try adding CFLAGS += -DGNRC_NETIF_IPV6_GROUPS_NUMOF=8 to the Makefile.

kYc0o commented 6 years ago

Can you elaborate what does not work?

Single-hop routes work sometimes; with more than one hop it definitely doesn't work.

Try adding CFLAGS += -DGNRC_NETIF_IPV6_GROUPS_NUMOF=8 to the Makefile.

I'm almost sure that's not the problem but I'll try and report back.

kYc0o commented 6 years ago

Try adding CFLAGS += -DGNRC_NETIF_IPV6_GROUPS_NUMOF=8 to the Makefile.

I'm almost sure that's not the problem but I'll try and report back.

Just tested, the (unsuccessful) result is the same.

cladmi commented 6 years ago

The tests that failed only on iotlab-a8-m3 were caused by an issue in my test runner (https://github.com/RIOT-OS/RIOT/pull/9011#issuecomment-386370556). They now fail only on the same tests as the m3 nodes.

miri64 commented 6 years ago

@kYc0o can you test with https://github.com/RIOT-OS/RIOT/pull/9073? Also make sure both the NIB off-link store (GNRC_IPV6_NIB_OFFL_NUMOF, for route destinations) and the NIB on-link store (GNRC_IPV6_NIB_NUMOF, for route next-hops) are sufficiently large. For RPL to work properly, GNRC_IPV6_NIB_OFFL_NUMOF should be at least #nodes+1.
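As a rough illustration of the sizing rule above, the following Python sketch derives the Makefile line for a given number of nodes. The helper name `nib_offl_cflags` is hypothetical and not part of RIOT; only the macro name and the "#nodes+1" rule come from this thread. No formula is given here for GNRC_IPV6_NIB_NUMOF, so the sketch does not try to compute it.

```python
def nib_offl_cflags(num_nodes: int) -> str:
    """Return a Makefile line sizing the NIB off-link store for RPL.

    Hypothetical helper: it only applies the "at least #nodes+1" rule
    quoted in this thread and is not part of RIOT itself.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return f"CFLAGS += -DGNRC_IPV6_NIB_OFFL_NUMOF={num_nodes + 1}"


if __name__ == "__main__":
    # Example: an experiment with 10 nodes.
    print(nib_offl_cflags(10))
```

The printed line can be pasted into the application Makefile, in the same way as the CFLAGS suggestion earlier in this thread.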

aabadie commented 6 years ago

Tasks 4.7 and 4.8 are marked failed but I re-ran them on iotlab and got:

So they pass. Who tested them? I'm using the following command lines on the Arduino Zero node:

bergzand commented 6 years ago

@miri64 I've tested a bit with 3 native instances (gnrc_networking) in a chain-like topology. Traffic between tap0 and tap2 is dropped to simulate a topology where node0 and node2 can communicate with node1, but node0 and node2 can't communicate.

Pinging between node0 and node2 doesn't work with either RPL or manual routing. What I observe is that when pinging from node0 to node2, node0 starts doing neighbour solicitations for the address of node2.

As soon as I remove the prefix from all nodes with nib prefix del 6 2001:db8::/64, ping6 works again with multiple hops.

miri64 commented 6 years ago

@bergzand @cgundogan I analyzed this with a debugger. The problem is that routers in an Ethernet configuration advertise the prefixes they configured as on-link, using the L-flag in the PIO (this happens here). IMHO that is correct, because we usually don't have a weird "mesh wants to share a prefix multihop" situation there. It causes the node to look up the address in the neighbor cache (as described in RFC 4861) instead of the forwarding table. So I would rather say that either the scenario you describe is invalid, or we should have some extra state for RPL(-like) network interfaces so that mesh-wide prefixing is allowed. But that is more a feature than a bug fix.
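The behaviour described above can be sketched in a few lines of Python. This is a simplified model of next-hop determination in the spirit of RFC 4861, not RIOT's actual NIB code. The function name `next_hop_action`, the node address 2001:db8::2 and the next hop fe80::1 are made up for illustration; only the prefix 2001:db8::/64 appears in this thread.

```python
import ipaddress


def next_hop_action(dst, on_link_prefixes, routes):
    """Decide how a node would send to dst (simplified model).

    on_link_prefixes: prefixes learned from PIOs with the L-flag set.
    routes: mapping of prefix -> next-hop address (forwarding table).
    """
    addr = ipaddress.ip_address(dst)
    # An on-link prefix wins: the destination itself is resolved via
    # neighbor solicitation and the forwarding table is not consulted.
    for prefix in on_link_prefixes:
        if addr in ipaddress.ip_network(prefix):
            return ("resolve-neighbor", dst)
    # Otherwise pick the longest matching route, if there is one.
    matches = [(ipaddress.ip_network(p), nh) for p, nh in routes.items()
               if addr in ipaddress.ip_network(p)]
    if matches:
        _, next_hop = max(matches, key=lambda m: m[0].prefixlen)
        return ("forward", next_hop)
    return ("unreachable", None)


if __name__ == "__main__":
    routes = {"2001:db8::2/128": "fe80::1"}  # hypothetical route via node1
    # With the on-link prefix configured, the route is ignored.
    print(next_hop_action("2001:db8::2", ["2001:db8::/64"], routes))
    # After deleting the prefix, the route is used.
    print(next_hop_action("2001:db8::2", [], routes))
```

In this model, removing the on-link prefix is what lets the multi-hop route take effect, which matches the observation with nib prefix del.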

miri64 commented 6 years ago

So won't fix, when it comes to that ;-).

cgundogan commented 6 years ago

So won't fix, when it comes to that ;-).

Then we should say that we do not support multi-hop in native anymore, be it with RPL or not? It at least worked in previous RIOT versions so we are dropping that feature (for now)?

bergzand commented 6 years ago

Thanks for clarifying! Indeed a wontfix then. I can replace the info on the wiki to build these topologies with something zep_mesh based as soon as I have a real keyboard again.

miri64 commented 6 years ago

Then we should say that we do not support multi-hop in native anymore, be it with RPL or not? It at least worked in previous RIOT versions so we are dropping that feature (for now)?

We support multihop in native, just not multihop prefix delegation. And afaik that was the case before as well; if not, it was most definitely acting erroneously (so it was a bug, not a feature :P).

kaspar030 commented 6 years ago

I've lost track. Can someone sum up what's the state with RPL and multihop regarding the release?

kYc0o commented 6 years ago

The two RPL tests passed successfully. It was a matter of configuration, since the maximum number of NIB entries was too small for my experiment. Moreover, since the topology was a bit dense and maybe noisy, some of the DAOs were lost during initialisation, so I decreased the DAO send interval. Martine's #9073 also helped to keep a coherent NIB for that small interval (10 secs).

kYc0o commented 6 years ago

Tasks 4.7 and 4.8 are marked failed but I re-ran them on iotlab and got:

For some strange reason, on the tag 2018.04-RC2 I have the following:

2018-05-07 12:26:16,104 - INFO #  main(): This is RIOT! (Version: 2018.04-snake.local-HEAD)
2018-05-07 12:26:16,108 - INFO # RIOT network stack example application
2018-05-07 12:26:16,110 - INFO # All up, running the shell now
ifconfig
2018-05-07 12:26:19,982 - INFO #  ifconfig
2018-05-07 12:26:20,031 - INFO # Iface  7  HWaddr: f8:54  Channel: 26  NID: 0x23
2018-05-07 12:26:20,033 - INFO #           Long HWaddr: 00:13:a2:00:40:a9:f8:d4 
2018-05-07 12:26:20,034 - INFO #           MTU:100  HL:64  RTR  
2018-05-07 12:26:20,036 - INFO #           Source address length: 8
2018-05-07 12:26:20,038 - INFO #           Link type: wireless
2018-05-07 12:26:20,043 - INFO #           inet6 addr: fe80::  scope: local  VAL
2018-05-07 12:26:20,046 - INFO #           inet6 group: ff02::2
2018-05-07 12:26:20,048 - INFO #           inet6 group: ff02::1
2018-05-07 12:26:20,052 - INFO #           inet6 group: ff02::301:ff00:0
2018-05-07 12:26:20,055 - INFO #           inet6 group: ff02::1a
2018-05-07 12:26:20,056 - INFO #           
2018-05-07 12:26:20,061 - INFO #            Protocol or device doesn't provide statistics.
2018-05-07 12:26:20,063 - INFO #           Statistics for IPv6
2018-05-07 12:26:20,071 - INFO #             RX packets 0  bytes 0
2018-05-07 12:26:20,076 - INFO #             TX packets 3 (Multicast: 3)  bytes 162
2018-05-07 12:26:20,081 - INFO #             TX succeeded 3 errors 0
2018-05-07 12:26:20,082 - INFO # 

And on branch 2018.04-devel

2018-05-07 12:24:20,237 - INFO #  main(): This is RIOT! (Version: 2018.07-devel-1000-g7ea2b-snake.local-HEAD)
2018-05-07 12:24:20,240 - INFO # RIOT network stack example application
2018-05-07 12:24:20,243 - INFO # All up, running the shell now
> ifconfig
2018-05-07 12:24:25,500 - INFO #  ifconfig
2018-05-07 12:24:25,541 - INFO # Iface  7  HWaddr: f8:54  Channel: 26  NID: 0x23
2018-05-07 12:24:25,544 - INFO #           Long HWaddr: 00:13:a2:00:40:a9:f8:d4 
2018-05-07 12:24:25,548 - INFO #           MTU:1280  HL:64  RTR  
2018-05-07 12:24:25,549 - INFO #           IPHC  
2018-05-07 12:24:25,553 - INFO #           Source address length: 8
2018-05-07 12:24:25,555 - INFO #           Link type: wireless
2018-05-07 12:24:25,564 - INFO #           inet6 addr: fe80::213:a200:40a9:f8d4  scope: local  VAL
2018-05-07 12:24:25,565 - INFO #           inet6 group: ff02::2
2018-05-07 12:24:25,567 - INFO #           inet6 group: ff02::1
2018-05-07 12:24:25,570 - INFO #           inet6 group: ff02::301:ffa9:f8d4
2018-05-07 12:24:25,574 - INFO #           inet6 group: ff02::1a
2018-05-07 12:24:25,577 - INFO #           
2018-05-07 12:24:25,589 - INFO #            Protocol or device doesn't provide statistics.
2018-05-07 12:24:25,590 - INFO #           Statistics for IPv6
2018-05-07 12:24:25,591 - INFO #             RX packets 0  bytes 0
2018-05-07 12:24:25,594 - INFO #             TX packets 3 (Multicast: 3)  bytes 178
2018-05-07 12:24:25,595 - INFO #             TX succeeded 3 errors 0
2018-05-07 12:24:25,595 - INFO # 

That's why my tests failed.
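The difference between the two outputs above (MTU 100 vs. 1280, and a bare fe80:: link-local address) can be spotted automatically. Below is a minimal Python sketch, assuming the captured log text has the same "MTU:" and "inet6 addr:" fields as shown above. The function name `summarize_ifconfig` is hypothetical.

```python
import ipaddress
import re

MTU_RE = re.compile(r"MTU:\s*(\d+)")
ADDR_RE = re.compile(r"inet6 addr:\s*(\S+)")
IID_MASK = (1 << 64) - 1  # lower 64 bits: the interface identifier


def summarize_ifconfig(log_text):
    """Extract the MTU and inet6 addresses from captured ifconfig output."""
    mtu = MTU_RE.search(log_text)
    addrs = [ipaddress.IPv6Address(a) for a in ADDR_RE.findall(log_text)]
    return {
        "mtu": int(mtu.group(1)) if mtu else None,
        "addrs": addrs,
        # A link-local address with an all-zero interface identifier
        # (plain fe80::) hints that the IID was never filled in.
        "empty_iid": any(a.is_link_local and (int(a) & IID_MASK) == 0
                         for a in addrs),
    }


if __name__ == "__main__":
    rc2 = ("INFO #           MTU:100  HL:64  RTR\n"
           "INFO #           inet6 addr: fe80::  scope: local  VAL\n")
    print(summarize_ifconfig(rc2))
```

A check like this could run before the actual test so that a broken interface configuration is reported directly instead of showing up as failed pings.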

kaspar030 commented 6 years ago

So RPL basically works, but with the default settings, only with up to 8 nodes?

kYc0o commented 6 years ago

So RPL basically works, but with the default settings, only with up to 8 nodes?

Exactly, the maximum number of off-link (route) entries is defined as:

sys/include/net/gnrc/ipv6/nib/conf.h:#define GNRC_IPV6_NIB_OFFL_NUMOF (8)

kaspar030 commented 6 years ago

Ok, apart from that, we're missing:

gogogo! ;)

kaspar030 commented 6 years ago

@kaspar030, HIL reports are still pointing to a non existing page, see here for example.

That's fixed now, see master or 2018.04-branch.

cladmi commented 6 years ago

For task 2, I added https://github.com/RIOT-OS/RIOT/pull/9007 to enable more TEST_ON_CI_WHITELIST.

I re-ran all tests in the tests/ directory that are supported, have enough memory, and have the 'TEST_ON_CI_WHITELIST' tag (even 'native'). I ran them on wsn430-v1_4, iotlab-m3, iotlab-a8-m3, samr21-xpro, and arduino-zero using IoT-LAB.

For wsn430-v1_4 I have the same errors as before plus posix_semaphore which requires a different configuration for the CPU and should be fixed after https://github.com/RIOT-OS/RIOT/pull/9081.

For others, I have the following errors:

kaspar030 commented 6 years ago

For wsn430-v1_4 I have the same errors as before

Didn't we fix some of the tests?

cladmi commented 6 years ago

Yeah, I was not precise enough in my report; I wanted to give more details on the other ones, my bad. Indeed, the ones noted before as "Tests fixed on wsn430" and pkg_tiny-asn1 were fixed and backported. My report should rather have been: micro-ecc is not merged in master, and none of the "Board gets stuck during test" cases have been fixed.

cladmi commented 6 years ago

Task 5.2

The test description says "default ipv6 stack". I do not know if that meant using examples/default, but I could not use it as it does not have the nib shell commands, so I ran it with gnrc_networking.

I had 100% ping delivery, so for me it's a success.

This is what I understood I needed to run:

Node Receiver

> ifconfig
   # get long inet6 addr: fe80::1711:6b10:65f6:bb1a  scope: local  VAL
> ifconfig 7 add unicast beef::1/64
ifconfig 7 add unicast beef::1/64
success: added beef::1/64 to interface 7
> nib route add 7 :: fe80::1711:6b10:65fc:b406
nib route add 7 :: fe80::1711:6b10:65fc:b406
> nib route
nib route
beef::/64 dev #7
default* via fe80::1711:6b10:65fc:b406 dev #7

Node Transmitter

> ifconfig
   # get long inet6 addr: fe80::1711:6b10:65fc:b406  scope: local  VAL
> ifconfig 7 add unicast affe::1/120
ifconfig 7 add unicast affe::1/120
success: added affe::1/120 to interface 7
> nib route add 7 :: fe80::1711:6b10:65f6:bb1a
nib route add 7 :: fe80::1711:6b10:65f6:bb1a
> nib route
nib route
affe::/120 dev #7
default* via fe80::1711:6b10:65f6:bb1a dev #7
> ping6 100 beef::1 1024 10
...
--- beef::1 ping statistics ---
100 packets transmitted, 100 received, 0% packet loss, time 15.06791147 s
rtt min/avg/max = 131.097/145.808/160.756 ms

Is this what was expected, and should I add this output example to the release spec?
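If output like the above ends up in the release spec, the pass/fail decision could be made from the statistics line. A minimal Python sketch, assuming the summary format shown above; the function name `ping_succeeded` and the loss threshold parameter are hypothetical and not part of any existing script.

```python
import re

STATS_RE = re.compile(
    r"(\d+) packets transmitted, (\d+) received, (\d+)% packet loss")


def ping_succeeded(output, max_loss_percent=0):
    """Return True if the ping6 summary reports an acceptable loss rate."""
    match = STATS_RE.search(output)
    if match is None:
        # No summary line found: treat the run as failed.
        return False
    sent, received, loss = (int(g) for g in match.groups())
    return sent > 0 and received > 0 and loss <= max_loss_percent


if __name__ == "__main__":
    summary = ("--- beef::1 ping statistics ---\n"
               "100 packets transmitted, 100 received, 0% packet loss, "
               "time 15.06791147 s\n")
    print(ping_succeeded(summary))
```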

cladmi commented 6 years ago

The automated scripts for task 7 do not work out of the box; I am trying to find out what happens. It is something with an expect line.

cladmi commented 6 years ago

The problem is that when running ifconfig, addresses are now displayed without the /prefix_len (I hacked the script to not expect it).

Also, the test is using fibroute in setFibRoutesInARow, which fails with "shell: command not found: fibroute".

I will see to fixing it.
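One way to make such a script tolerant of both output styles is to treat the prefix length as optional in the pattern. The following Python sketch is not the actual IOTLABHelper.py code. The function name `parse_inet6_addrs` and the old-style sample line with "/64" are assumptions for illustration.

```python
import re

# Matches "inet6 addr: <address>" with an optional "/<prefix_len>" suffix.
INET6_RE = re.compile(r"inet6 addr:\s*([0-9a-fA-F:]+)(?:/(\d+))?")


def parse_inet6_addrs(ifconfig_output):
    """Return (address, prefix_len or None) for each inet6 addr line."""
    return [(addr, int(plen) if plen else None)
            for addr, plen in INET6_RE.findall(ifconfig_output)]


if __name__ == "__main__":
    # Assumed old-style line, with a prefix length.
    print(parse_inet6_addrs("inet6 addr: fe80::1711:6b10:65f6:bb1a/64  scope: local"))
    # New-style line, as shown in this thread.
    print(parse_inet6_addrs("inet6 addr: fe80::1711:6b10:65f6:bb1a  scope: local  VAL"))
```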

cladmi commented 6 years ago

Task 7.01

I adapted the IOTLABHelper.py script to the new commands and output to make the test work.

Successfully pinged with 3 hops
SUCCESS

cladmi commented 6 years ago

Task 7.02

Failed with first selected node but worked on the second with the same IOTLABHelper.py changes.

Sent successfully with packet loss of 0.0%
Successfully communicated with UDP over 3 hops
SUCCESS

kYc0o commented 6 years ago

I'm on 8.2

kYc0o commented 6 years ago

8.9 done successfully.

cladmi commented 6 years ago

Thanks all for testing. Please remind me next week if I forget to upstream the changes for task 07.

miri64 commented 6 years ago

(closing this one, since we are currently testing 2018.07 ;-))