espruino / Espruino

The Espruino JavaScript interpreter - Official Repo
http://www.espruino.com/
Other
2.77k stars 745 forks source link

[bug] Changing the BLE uart state after ~5mins leads to a reboot. #1933

Closed enaon closed 4 years ago

enaon commented 4 years ago

Hello again, I am not sure again if this is a bug really, but it has been troubling me for some time and I finally pinpointed how to reproduce it. Tested in sdk11 .129 firmware and sdk12 .129 firmware, but it was there before. Those are builds for the dsd6 smartband, but @fanoush said that if the bug is there on the sdk12 build, then most propably it affects all espruinos, so I thought I report it.

How to reproduce:

1.Reboot the espruino and connect through serial, change the console to serial allways (that part is not needed if you have another way to disable/enable BLE uart), and disable/enable the BLE uart like so.

NRF.setServices(undefined,{uart:false}); NRF.setServices(undefined,{uart:true}); (or with {} instead of undefined)

2.Wait at least 4 minutes, 6 for 100% success :) (strange I know, but it needs ~5 minutes from last change) 3.Disable/enable the uart again, and in a few secs(not instantly) the board will reboot, giving no errors in console.

ps. I have a watchdog running too, I am not sure if it makes a difference, I only report that because the reboot seems not instant, like the watchdog called it some seconds after the last uart change, but the board is responsive between the last uart change and the reboot.

gfwilliams commented 4 years ago

Thanks - so to reproduce this you call both setServices in one line? Or is just one of them enough to do it?

enaon commented 4 years ago

no one line is not needed, I do it line by line. disable it/enable it/wait 5-6 minutes/disable again, that should be enouph.

gfwilliams commented 4 years ago

Ok, and when does it crash?

NRF.setServices(undefined,{uart:false}); 
NRF.setServices(undefined,{uart:true});
// wait 5-6 mins
NRF.setServices(undefined,{uart:false}); // here?
NRF.setServices(undefined,{uart:true}); // here?
enaon commented 4 years ago

disable/enable/wait/disable=reboot after a few secs.

enaon commented 4 years ago

No connection to BT durring that time, on the p8 I do it using the gui, on the dsd6 using serial1.

fanoush commented 4 years ago

I tried it today morning with DS-D6, SDK12 build 2.06.129 but cannot replicate it. tried like 3 times. Over serial with no bluetooth connection run that line with both uart false/true, had longer intervals like 10,15 minutes. Also I had other watch code there. Also had no watchdog enabled, do you have watchdog in manual mode, with your own setInterval kicking it? What watchdog interval and setInterval values you used?

If console still works all the time until reboot and you use manual watchdog, maybe the setInterval watchdog kicking stops firing or its timing gets wrong so it fires too late.

enaon commented 4 years ago

NIce, I will try without a watchdog. I checked the timer, I have some time bebore it reboots, the watchdog's timer was running, but did not check the time.

thos are the values for the watchdog:

function KickWd(){
  if(!BTN1.read())E.kickWatchdog();
}
var wdint=setInterval(KickWd,1000);
E.enableWatchdog(6,false)
fanoush commented 4 years ago

will try with that btw, adding some print to the kicking interval with current time could help to see when/if it stops firing before it hangs

enaon commented 4 years ago

ok. you were correct. Without a watchdog all is ok,

With the watchdog, the timer stops, and the watchdog forces a reboot when I change the uart after a few minutes. The timer itself ir running though. I can see the id change in global/timers.


Date: Thu Jan 1 1970 00:00:38 GMT+0000
>NRF.setServices(undefined,{uart:false});
=undefined
Date: Thu Jan 1 1970 00:00:39 GMT+0000
...
>NRF.setServices(undefined,{uart:true});
=undefined
Date: Thu Jan 1 1970 00:00:49 GMT+0000
...
Date: Thu Jan 1 1970 00:08:07 GMT+0000
Date: Thu Jan 1 1970 00:08:08 GMT+0000
Date: Thu Jan 1 1970 00:08:09 GMT+0000
Date: Thu Jan 1 1970 00:08:10 GMT+0000
>NRF.setServices(undefined,{uart:false});
=undefined
>
 ____                 _
|  __|___ ___ ___ _ _|_|___ ___
|  __|_ -| . |  _| | | |   | . |
|____|___|  _|_| |___|_|_|_|___|
         |_| espruino.com
 2v06.129 (c) 2019 G.Williams
gfwilliams commented 4 years ago

Ok, that's interesting.

So with no extra code, and no watchdog, if you just do a setInterval(...print...) then the interval stops being called after the second setServices - but only if around 5 mins have passed first?

enaon commented 4 years ago

Yes, empty code, With no watchdog, I see no problems, maybe I should look closer.

With the watchdog, a reboot after a change, only after around 5 mins have passed yes. Indeed the time do not show, so the timer stoped it seems, but I am allmost sure I checked and the id was changing, I will recheck.

Can you please try to reproduse it, with the watchdog on?

enaon commented 4 years ago

yes, the time goes back as you said, that is why the timer is still alive but is not firing.

Date: Thu Jan 1 1970 00:05:14 GMT+0000
Date: Thu Jan 1 1970 00:05:15 GMT+0000
Date: Thu Jan 1 1970 00:05:16 GMT+0000
Date: Thu Jan 1 1970 00:05:17 GMT+0000
Date: Thu Jan 1 1970 00:05:18 GMT+0000
Date: Thu Jan 1 1970 00:05:19 GMT+0000
Date: Thu Jan 1 1970 00:05:20 GMT+0000
Date: Thu Jan 1 1970 00:05:21 GMT+0000
>NRF.setServices(undefined,{uart:false});print(Date());
Date: Wed Dec 31 1969 23:56:49 GMT+0000
=undefined
>
 ____                 _
|  __|___ ___ ___ _ _|_|___ ___
|  __|_ -| . |  _| | | |   | . |
|____|___|  _|_| |___|_|_|_|___|
         |_| espruino.com
 2v06.129 (c) 2019 G.Williams
enaon commented 4 years ago

ok so the title needs changing, if this indeed happens to all espruinos.

the uart change causes time drift, which causes timers to not work.

This is on a clean system after E.reboot(), no watchdog,

Espruino is Open Source. Our work is supported
only by sales of official boards and donations:
http://espruino.com/Donate
>
>NRF.setServices(undefined,{uart:false});print(Date());
Date: Thu Jan 1 1970 00:00:11 GMT+0000
=undefined
>NRF.setServices(undefined,{uart:true});print(Date());
Date: Thu Jan 1 1970 00:00:20 GMT+0000
=undefined
>print(Date());
Date: Thu Jan 1 1970 00:03:43 GMT+0000
=undefined
>print(Date());
Date: Thu Jan 1 1970 00:06:40 GMT+0000
=undefined
>NRF.setServices(undefined,{uart:false});print(Date());
Date: Wed Dec 31 1969 23:58:14 GMT+0000
=undefined
enaon commented 4 years ago

ok, I found a way to walk around it, but it is strange.

If during the first 5 minutes I call NRF.setServices(undefined,{uart:true}); every minute or so, then the problem dissapears, meaning that if I start waiting at the 5th minute, when I call setServices again at the 15th minute, the problem is not there anymore.

gfwilliams commented 4 years ago

Ok, thanks. I think I've got a good idea what the issue is - it depends on whether the top bit of the 24 bit RTC is set when the softdevice restarts :)

0x800000 / (32768*60) = 4 minutes 15 seconds or more

enaon commented 4 years ago

nice, thanks :)

gfwilliams commented 4 years ago

Thanks - thinking about it I reckon this could fix some instability issues on Bangle.js as well since we have the watchdog on that.

enaon commented 4 years ago

it was driving me nuts, It was messing up the gui, now I know it is because of timers gone wild, and was rebooting the p8, but it looked so random, and also it could be avoided if one played with the on/off for some time to test, that I am really happy you found it. :)

Now that I think of it, I have time drift often when I am using it as a EUC client, this is why I wanted the connection to the phone to get the time and looked at gadgetbridge. If this is a fix for that too, it is excellent. :)

enaon commented 4 years ago

I am sure you know, but as a confirmation, I just tested and the bug is gone, thanks again :)