cetic / 6lbr

A deployment-ready 6LoWPAN Border Router solution based on Contiki
github.com/cetic/6lbr/wiki
Other
338 stars 196 forks source link

6lbr crashing during Cooja simulation #310

Open maximcherny opened 7 years ago

maximcherny commented 7 years ago

Hi,

I have a Cooja simulation with the Z1-based slip-radio connected to a 6lbr border router (from text example) and x5 Wismote-based motes (I can't use Z1 as I need more resources for the firmware) sending CoAP messages to an external host using the bridge interface.

While the simulation is running, sometimes I observe the following error messages in the Cooja console log:

[java] Watchdog: illegal write to WDTCTL (0) from $3fb4 - reset!!!!
[java] Stack Trace: number of calls: 8 PC: $03fb4
[java]   watchdog_reboot (uip6.c) called from PC: $03a72 (elapsed: 5)
[java]   slip_radio_cmd_handler (local in slip-radio.c) called from PC: $03cd8 (elapsed: 39)
[java]   cmd_input (uip6.c) called from PC: $0399e (elapsed: 75)
[java]   slip_input_callback (local in slip-radio.c) called from PC: $04f7a (elapsed: 85)
[java]   process_thread_slip_process (local in slip.c) called from PC: $05860 (elapsed: 235)
[java]   call_process (local in process.c) called from PC: $0592e (elapsed: 270)
[java]   do_poll (local in process.c) called from PC: $05970 (elapsed: 309)
[java]   process_run (uip6.c) called from PC: $03394 (elapsed: 328)
[java]  WARN [Thread-3] (MspMote.java:229) - 1: # Watchdog[EXECUTION]: illegal write to WDTCTL (0) from $3fb4 - reset!!!!
[java] Watchdog: illegal write to WDTCTL (0) from $3fb4 - reset!!!!
[java] Stack Trace: number of calls: 8 PC: $03fb4
[java]   watchdog_reboot (uip6.c) called from PC: $03a72 (elapsed: 5)
[java]   slip_radio_cmd_handler (local in slip-radio.c) called from PC: $03cd8 (elapsed: 39)
[java]   cmd_input (uip6.c) called from PC: $0399e (elapsed: 75)
[java]   slip_input_callback (local in slip-radio.c) called from PC: $04f7a (elapsed: 85)
[java]   process_thread_slip_process (local in slip.c) called from PC: $05860 (elapsed: 235)
[java]   call_process (local in process.c) called from PC: $0592e (elapsed: 270)
[java]   do_poll (local in process.c) called from PC: $05970 (elapsed: 309)
[java]   process_run (uip6.c) called from PC: $03394 (elapsed: 328)
[java]  WARN [Thread-3] (MspMote.java:229) - 1: # Watchdog[EXECUTION]: illegal write to WDTCTL (0) from $3fb4 - reset!!!!

I can see that these errors are generally associates with the ACK timeout on the slip radio side:

...
BR-RDC: br-rdc: send failed, slip ack timeout (1)
...

Are there any configuration or compilation options on the 6lbr side that can assist with handling the timeout issues more gracefully? Or increasing the timeout? And more importantly, what is usually indicated by this error?

Thanks.

darkopetrovic commented 7 years ago

I had a lot of trouble doing simulations like this. I don't know if this can work for you, but try to run the slip-radio on a wismote too.

maximcherny commented 7 years ago

I did some further investigations and it seems that if I disable the UDPCLIENT app that comes as part of the 6lbr-demo example, the issue goes away. I can also see that without it, the 6lbr router is no longer able to track the PRR and topology.

laurentderu commented 7 years ago

An ack timeout indicates that the BR has send a packet to the slip-radio, but the slip-radio never send it or an ack packet was never received. If you see a log message like "ack received for unknown packet" that means the timeout is too short. The root cause can be simply a packet collision and the packet was lost, or the input buffer of the slip-radio was full and the packet was dropped or the packet was corrupted and the slip-radio rejected it.

You can control the length of the timeout and the number of packet retransmission of the packet sent to the slip-radio using the configuration parameters of the new configuration file "nvm.conf" See https://github.com/cetic/6lbr/wiki/Native-Configuration-file

But I don't see immediately the link between the UDP client and the stability of the slip-radio.

maximcherny commented 7 years ago

Thanks, the slip timeout settings in the NVM config does help to eliminate the ack errors.

I am now trying to homogenise the setup and also use a Wismote mote for the slip-radio.

However, it seems that 6lbr slip-radio is not supported on Wismote, because it just keeps printing "Fetching MAC address" even after the simulation has been started:

********************************************************************************
Wed Jun 21 05:25:26 UTC 2017 : Starting 6LBR
********************************************************************************
Wed Jun 21 05:25:26 UTC 2017 : Starting 6LBR
../bin/cetic_6lbr_router -c conf-templates/test.dat -o ../package/etc/6lbr/nvm.conf -a localhost -R -t tap0  -U ../package/usr/lib/6lbr/6lbr-ifup -D ../package/usr/lib/6lbr/6lbr-ifdown -w ../package/usr/lib/6lbr/www   -W /var/log/6lbr.timestamp -P 60 -C /var/log/6lbr.ip -n conf-templates/node-config.conf -m ../package/usr/lib/6lbr/plugins   
Contiki-contiki-base-develop-20170120-2305-g98a3eac started with IPV6, RPL
Rime started with address 1.2.3.4.5.6.7.8
MAC mac_wrapper RDC br-rdc SEC llsec-wrapper NETWORK sicslowpan
2017-06-21 5:25:26.638840: NOTICE: 6LBR: Starting 6LBR version 1.5.x (Contiki-contiki-base-develop-20170120-2305-g98a3eac)
2017-06-21 5:25:26.639153: INFO: ETH: 6LBR watchdog started (interval: 60)
2017-06-21 5:25:26.640915: INFO: NVM: NVM Magic : 2009
2017-06-21 5:25:26.640933: INFO: NVM: NVM Version : 5
2017-06-21 5:25:26.640935: INFO: CONFIG: Loading configuration : ../package/etc/6lbr/nvm.conf
2017-06-21 5:25:26.641461: NOTICE: 6LBR: Log level: 30 (services: ffffffff)
2017-06-21 5:25:26.641479: INFO: MAC: Using 'csma' mac driver
2017-06-21 5:25:26.641830: INFO: SLIP: SLIP opened to localhost:60001
2017-06-21 5:25:26.641851: INFO: SCMD: Started br-cmd process
2017-06-21 5:25:26.641854: INFO: BR-RDC: Reset SLIP Radio
2017-06-21 5:25:26.641856: INFO: BR-RDC: Fetching MAC address
2017-06-21 5:25:27.641377: INFO: BR-RDC: Fetching MAC address
2017-06-21 5:25:28.648708: INFO: BR-RDC: Fetching MAC address
2017-06-21 5:25:29.652806: INFO: BR-RDC: Fetching MAC address
2017-06-21 5:25:30.653166: INFO: BR-RDC: Fetching MAC address
2017-06-21 5:25:30.831852: INFO: SLIP: Rime started with address 0.0.0.0.0.0.0.1
2017-06-21 5:25:30.849078: INFO: SLIP: MAC 00:00:00:00:00:00:00:01 Contiki-contiki-base-develop-20170120-2305-g98a3eac started. Node id is set to 1.
2017-06-21 5:25:30.854781: INFO: SLIP: nullmac nullrdc, channel check rate 128 Hz, radio channel 26
2017-06-21 5:25:30.871457: INFO: SLIP: Tentative link-local IPv6 address fe80:0000:0000:0000:0200:0000:0000:0001
2017-06-21 5:25:30.890495: INFO: SLIP: Tentative global IPv6 address fd00:0000:0000:0000:0200:0000:0000:0001
2017-06-21 5:25:30.890518: INFO: SLIP: Starting 'Slip radio process'
2017-06-21 5:25:30.892054: INFO: SLIP: Slip Radio started...
2017-06-21 5:25:31.653454: INFO: BR-RDC: Fetching MAC address
2017-06-21 5:25:32.653528: INFO: BR-RDC: Fetching MAC address
2017-06-21 5:25:33.654394: INFO: BR-RDC: Fetching MAC address
2017-06-21 5:25:34.659149: INFO: BR-RDC: Fetching MAC address
2017-06-21 5:25:35.662022: INFO: BR-RDC: Fetching MAC address

Why would it not be getting a MAC address on Wismote when the same setup works on a Z1?