Open andrewbrannan opened 8 years ago
Thanks for the detailed report.
If `on()` gets called at line 967, then the value of `we_are_receiving_burst` should be non-zero.
An rtimer interrupt will execute `powercycle`, which yields at various points. If it yields at L433, then the next time it gets called `we_are_receiving_burst` will be non-zero, so it will yield again. I think this is not where `on()` gets called for the 2nd time. `powercycle` also yields at L483 and L514, so perhaps something there.
To switch the HF clock source to the HF XOSC, first we place a request for the XOSC to start up (`oscillators_request_hf_xosc()`). While it's starting up we can do other things and, when it's ready, we perform the actual (blocking) switch (`oscillators_switch_to_hf_xosc()`). Multiple requests to switch are OK as long as the XOSC is ready. So in theory the 2nd `on()` (the one called from the interrupt) should work.
However, if within the interrupt context the code selects the RC as the HF clock source, the XOSC will be allowed to power down. This can happen if e.g. `off()` also gets called within the same interrupt. If this happens, then the clock source switch in the 1st `on()` will never complete (because the XOSC is left powered down). My gut feeling is that within the interrupt context `off()` also gets called. Can you double check please?
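The hazard described above can be sketched with a tiny state model. This is only an illustration under stated assumptions: the `hf_clock_model` struct and every `model_*` function are hypothetical names, not Contiki code, and the model returns `false` at the point where the real blocking switch would wait forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the HF clock state; not taken from Contiki. */
enum hf_source { HF_SRC_RCOSC, HF_SRC_XOSC };

struct hf_clock_model {
  bool xosc_powered;      /* XOSC has been requested and is kept powered */
  enum hf_source source;  /* currently selected HF clock source */
};

/* Models the request step: ask the XOSC to start up (non-blocking). */
static void model_request_xosc(struct hf_clock_model *m)
{
  m->xosc_powered = true;
}

/* Models the switch step. Returns false where the real, blocking
 * switch would never complete because the XOSC is powered down. */
static bool model_switch_to_xosc(struct hf_clock_model *m)
{
  if(!m->xosc_powered) {
    return false;
  }
  m->source = HF_SRC_XOSC;
  return true;
}

/* Models selecting the RC oscillator, which lets the XOSC power down. */
static void model_switch_to_rcosc(struct hf_clock_model *m)
{
  m->source = HF_SRC_RCOSC;
  m->xosc_powered = false;
}

/* Interrupt that only runs a nested on(): request, then switch. */
static void model_isr_on_only(struct hf_clock_model *m)
{
  model_request_xosc(m);
  (void)model_switch_to_xosc(m);
}

/* Interrupt that runs a nested on() followed by off(). */
static void model_isr_on_then_off(struct hf_clock_model *m)
{
  model_isr_on_only(m);
  model_switch_to_rcosc(m);
}

/* The first on(): it is interrupted between its request and its switch.
 * Returns whether its own switch can still complete afterwards. */
static bool model_interrupted_on(void (*isr)(struct hf_clock_model *))
{
  struct hf_clock_model m = { false, HF_SRC_RCOSC };

  assert(isr != NULL);
  model_request_xosc(&m);
  isr(&m);                        /* preemption happens here */
  return model_switch_to_xosc(&m);
}
```

In this model a nested `on()` alone is harmless, while a nested `on()` followed by `off()` leaves the outer switch unable to complete, which matches the reasoning in the comment above.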
Having said all that, ContikiMAC tries to avoid calling `NETSTACK_RADIO.on();` twice in a nested fashion. Look at ContikiMAC's `on()` function. A second call to it will return early, because `radio_is_on` will be 1. This makes me think that perhaps in the interrupt context ContikiMAC attempts to call `NETSTACK_RADIO.channel_clear()` (which will in turn call `on()` within the radio driver).
Edit: I suspect the `channel_clear()` call at L455.
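A rough sketch of why the MAC-layer guard would not help in that case. Only `radio_is_on` is a name from the thread; the `model_*` functions are hypothetical stand-ins, and the sketch assumes, as suggested above, that the driver's `channel_clear()` turns the radio on by itself without consulting the MAC-layer flag.

```c
#include <assert.h>
#include <stdbool.h>

static bool radio_is_on;       /* MAC-layer flag, named as in the thread */
static int driver_on_entries;  /* times the modelled driver on() was entered */

/* Stand-in for the radio driver's on(). */
static void model_driver_on(void)
{
  driver_on_entries++;
}

/* Stand-in for the driver's channel_clear(): assumed to call the
 * driver's on() internally, bypassing the MAC-layer flag. */
static int model_driver_channel_clear(void)
{
  model_driver_on();
  return 1; /* "channel clear"; the value is irrelevant to the sketch */
}

/* Stand-in for the MAC-layer on(): a second call returns early. */
static void model_mac_on(void)
{
  if(radio_is_on) {
    return;
  }
  radio_is_on = true;
  model_driver_on();
}

/* Runs one MAC-layer on(), then a second request that arrives either
 * through the MAC-layer on() or through channel_clear(). Returns how
 * many times the driver's on() was entered in total. */
static int model_count_driver_on(bool nested_via_channel_clear)
{
  radio_is_on = false;
  driver_on_entries = 0;

  model_mac_on();
  assert(radio_is_on);  /* the guard is armed after the first call */

  if(nested_via_channel_clear) {
    (void)model_driver_channel_clear();
  } else {
    model_mac_on();
  }
  return driver_on_entries;
}
```

Under these assumptions the guard stops a second MAC-layer `on()`, but a `channel_clear()` issued from the interrupt still reaches the driver's `on()` a second time.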
Thanks for uploading such a detailed analysis!
I'm facing a similar reboot situation with ContikiMAC and PropMode, but on the CC1350. Can anyone confirm whether the above hack fixed the issue, or whether there is another solution?
Thanks.
@mdmobashir I have also tried the hack, but it does not seem to fix the issue here. Do you have any updates? I have a similar problem with the CC1350.
I've observed an issue with the CC1310 while using ContikiMAC that causes a device to randomly reboot under heavy traffic. The issue only exists when there is more than 1 node on a network, and occurrences increase with the number of nodes on the network. I'm able to reproduce it as follows:

1. Use the `broadcast-example` program under `examples/ipv6/simple-udp-rpl`. Use contikimac, csma and all other default settings.
2. Set `SEND_INTERVAL` and `SEND_TIME` to `0.25*CLOCK_SECOND`.

The problem seems to be ContikiMAC specific. I've tracked it down to the `on()` function of `prop-mode.c` being called by ContikiMAC while it's rxing a burst. After placing some strategic debug printouts, I've figured out the sequence of events that causes the reboots.

1. `prop-mode.c`'s `on()` is called from line 967 (`if(we_are_receiving_burst) { on();.....`).
2. `on()` gets to somewhere between line 877 (`oscillators_request_hf_xosc();`) and line 912 (`oscillators_switch_to_hf_xosc();`).
3. `prop-mode.c`'s `on()` is called again from somewhere else. (!!!) It must be being called by an interrupt or rtimer somewhere if it's able to preempt execution, but I don't know enough about the ContikiMAC protocol to tell exactly where the second call is coming from.
4. The second `on()` executes all the way through successfully.
5. The first `on()` resumes from wherever it was interrupted.
6. The first `on()` makes it to line 912 (`oscillators_switch_to_hf_xosc();`), which has already been called a moment before by the second call to `on()`.

I'm able to prevent the issue entirely by placing a `ti_lib_int_master_disable();` in `on()` before requesting the HF oscillators and then the corresponding `ti_lib_int_master_enable();` right before the switch. This feels like a hack though.

It'd be great if somebody with a bit more familiarity with ContikiMAC could have a look at this. It seems like it's just a case of an interrupt or rtimer not being suppressed when the radio is already being turned on.
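For anyone wanting to try the workaround, its shape is roughly as follows. This is a standalone sketch, not a patch against `prop-mode.c`: the `stub_*` functions are hypothetical stand-ins for `ti_lib_int_master_disable()`, `ti_lib_int_master_enable()`, `oscillators_request_hf_xosc()` and `oscillators_switch_to_hf_xosc()`, and `stub_possible_preemption_point()` merely marks the code between lines 877 and 912 where the second `on()` was observed to sneak in.

```c
#include <assert.h>
#include <stdbool.h>

static bool irq_enabled = true;  /* modelled global interrupt state */
static int preemptions;          /* times the window could be preempted */

/* Stand-ins for the interrupt master disable/enable calls. */
static void stub_int_master_disable(void) { irq_enabled = false; }
static void stub_int_master_enable(void)  { irq_enabled = true; }

/* Stand-ins for the oscillator request and the blocking switch. */
static void stub_request_hf_xosc(void)   { }
static void stub_switch_to_hf_xosc(void) { }

/* Marks the work between request and switch. A nested on() can only
 * run here if interrupts are enabled. */
static void stub_possible_preemption_point(void)
{
  if(irq_enabled) {
    preemptions++;
  }
}

/* Sketch of on() with and without the workaround. Returns how many
 * times the request/switch window was open to preemption. */
static int sketch_on(bool with_workaround)
{
  preemptions = 0;

  if(with_workaround) {
    stub_int_master_disable();   /* before requesting the HF XOSC */
  }
  stub_request_hf_xosc();
  stub_possible_preemption_point();
  if(with_workaround) {
    stub_int_master_enable();    /* right before the switch */
  }
  stub_switch_to_hf_xosc();

  assert(irq_enabled);           /* interrupts are enabled again on exit */
  return preemptions;
}
```

The cost of this approach is that every interrupt is held off for the whole span between the request and the switch, which is one reason it feels more like a stopgap than a fix.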
Wonder if this could also be related to #1878