Closed zsrkmyn closed 6 years ago
Thanks for reporting. This is a challenging simulation environment: large and very sparse (at most 4 neighbors per node), yet with a significant interference range (about 20 neighbors).
The problem seems to be with CSMA in Cooja Motes. In Cooja motes, transmission timing is coarse-grained, with only 1ms precision. The current CSMA timings result in re-transmissions with very low backoff exponent. Combine the two above and you get cascaded collisions of nodes colliding and re-trying every 1ms.
With some work on CSMA I got your simulation file to work reliably. What I did:
- Set CSMA_MIN_BE to 3 and CSMA_MAX_BE to 5. This is what the IEEE standard stipulates anyway. I believe the reason we've been using different values was mostly ContikiMAC related, and I think we should adopt the standard default instead.
- backoff_period: increased by 20 (or more). This is more specific to Cooja motes. It helps spread out re-transmissions and in turn decreases contention.

BTW, note that RPL_NS_CONF_LINK_NUM is now obsolete (will update the doc); please use NETSTACK_MAX_ROUTE_ENTRIES instead (but the default on Cooja motes is 300, so this was not the issue).
New defaults proposed in https://github.com/contiki-ng/contiki-ng/pull/315
The PR improves the performance, but the network is still not as good as the 3.x version.
We set LOG_CONF_WITH_ANNOTATE to 1 in project-conf.h, changed uip-ds6-route.c as follows, and enabled Mote relations in Cooja to show the default routes of the nodes.
diff --git a/os/net/ipv6/uip-ds6-route.c b/os/net/ipv6/uip-ds6-route.c
index 9321b632c..953b677cd 100644
--- a/os/net/ipv6/uip-ds6-route.c
+++ b/os/net/ipv6/uip-ds6-route.c
@@ -637,7 +637,7 @@ uip_ds6_defrt_add(uip_ipaddr_t *ipaddr, unsigned long interval)
d->isinfinite = 1;
}
- LOG_ANNOTATE("#L %u 1\n", ipaddr->u8[sizeof(uip_ipaddr_t) - 1]);
+ LOG_ANNOTATE("#L %u 1;red\n", ipaddr->u8[sizeof(uip_ipaddr_t) - 1]);
#if UIP_DS6_NOTIFICATIONS
call_route_callback(UIP_DS6_NOTIFICATION_DEFRT_ADD, ipaddr, ipaddr);
We tested rpl-lite on both Cooja and Sky motes (testing on Sky motes can be very slow :-( ). It seems the default routes change frequently and wrongly.
In Contiki 3.x, we set RPL_CONF_MOP to RPL_MOP_NON_STORING to enable the non-storing mode of RPL, and the default routes are more stable and reasonable.
I'd like to record some videos to elaborate the issue if I am free this evening. XD
cc @dongdongbh
Ok, it's most likely that the defaults in RPL lite don't do great in your network. The first thing that comes to my mind is to try disabling the ETX squaring (link-stats module). It helps build reliable routes, but it also makes the DAG more jittery. Might be sub-optimal in certain scenarios.
BTW have you tried rpl-classic (in non-storing mode), for the sake of finding out if the issue came with rpl-lite or contiki-ng?
I have tested three cases on my network layout, as follows:
rpl-classic and non-storing mode
The performance is 1 > 2 > 3. In the first case, the network sets up quickly and is more reliable than in the other two settings; all nodes in the network can reach the root node. In the second case, only some of the nodes can reach the root, and the routing decisions are also poor for communication. In the last case it works worst: almost no nodes can reach the root node.
According to my tests, it seems to be a contiki-ng issue; also, rpl-lite works worse than rpl-classic in our network.
Here are the screenshots of the results for the three cases.
cc @zsrkmyn
You must be doing something wrong. On a clean repo I get a fully connected network after 2m30s. There are very few, if any, parent switches after that. I see no traffic loss at all: all application traffic is received successfully, up, and then back down (I get one hundred "Received response 1", one hundred "Received response 2" etc.).
I thought you were reporting on sub-optimal performance, not something as broken as in the screenshots above...
I used the branch with the CSMA fix, with added annotations but no other modifications. Latest Cooja and MSPSim from the Contiki-NG repo. All running in Docker.
Oh, I'm sorry. We first worked on commit 9777ac4 but found the network was not as stable as an older commit, so we checked out to an older commit and continued working on it, and we all forgot that our working directory was not updated with upstream.
After switching to the latest commit and applying the csma-defaults patch, all problems disappear.
I really appreciate your help! :-)
Ah that is very good to hear :) Thanks again for reporting, got the CSMA issue noticed!
We use examples/rpl-udp/udp-{server,client}.c to test RPL on Cooja motes.
In simulation, we add 100 clients and 1 server. The csc file can be found at https://pastebin.com/93WcJT2u
We use rpl-lite as the routing protocol, and set RPL_NS_CONF_LINK_NUM to 120 in project-conf.h.
When setting RPL_CONF_PROBING_SEND_FUNC to rpl_icmp6_dio_output, after 1 hour of simulation time only a few nodes around the root can join the RPL instance. Setting RPL_CONF_PROBING_SEND_FUNC to rpl_icmp6_dis_output makes it a little better.
However, when we run examples/simple-udp-rpl in Contiki 3.x with the same distribution of motes, it takes only 4 minutes of simulation time for the root to receive the first message from the farthest nodes.
We also dug through the logs of the RPL modules, but there are so many unreasonable packet transmissions and parent switches that it is hard to describe them here. The log file is larger than 100MiB, so we'll upload it if you need it.
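For reference, the settings described in this report would look roughly like the fragment below. It is reconstructed from the description only and is not the reporters' actual file; the include guard name is an assumption.

```c
/* project-conf.h (sketch reconstructed from the report, not the original file) */
#ifndef PROJECT_CONF_H_
#define PROJECT_CONF_H_

/* Value used in the report. */
#define RPL_NS_CONF_LINK_NUM 120

/* First variant tried: probe with DIOs. */
#define RPL_CONF_PROBING_SEND_FUNC rpl_icmp6_dio_output

/* Second variant tried, reported as slightly better: probe with DIS.
 * Enable this line instead of the one above to reproduce it. */
/* #define RPL_CONF_PROBING_SEND_FUNC rpl_icmp6_dis_output */

#endif /* PROJECT_CONF_H_ */
```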
Feel free to ask us for more details! XD