Closed robert-hh closed 3 years ago
some context:
https://forum.pycom.io/topic/6734/external-flash-lose-files/87?_=1615194014360
b) During that wear levelling test, I had core dumps about every 50_000 writes. They were related to the heartbeat flash. I made a PR which changed the heartbeat timing a little bit. After that, the crashes disappeared, at least in that test. I am not 100% confident that the change cured the initial problem, which could also be a Core-0/Core-1 collision. But the change is not intrusive and also saves a few clock cycles. PR here: https://github.com/pycom/pycom-micropython-sigfox/pull/525
Hello! Could you please detail why this fix is needed? Which issue does it fix? Thanks! Is it related to https://github.com/pycom/pycom-micropython-sigfox/issues/518 ?
Yes. Analyzing the backtraces of these failures, it looked like a race condition involving the code doing the heartbeat and the code closing a file. The previous code recorded a new time before the actual color change was done. The change moves recording the new time to after the color has changed. Before the change, the code crashed every few hours on average. After the change, the code ran fine for a week, after which it was stopped manually. This was the wear leveling test. The device running the changed code completed 4 million cycles; another one completed 10 million file create/write/close operations. And, as a slight performance improvement, it performs the assignment and subtraction only when needed. I am not at all sure about the mechanism, so the change may only hide the real cause.
P.S.: I had another crash case caused by the heartbeat LED, which disappeared after I disabled the heartbeat. But I have not looked into it yet.
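To make the reordering easier to follow, here is a minimal Python simulation of the two orderings. It is only a sketch under assumptions: the names `Heartbeat`, `signal_old`, `signal_new`, `run` and the durations `ON_MS` and `OFF_MS` are hypothetical and are not taken from the firmware, and the real code is C running on the device. It only illustrates storing the timestamp before versus after the LED switches, and skipping the store when no transition happens; it says nothing about the cause of the crashes.

```python
from dataclasses import dataclass, field

ON_MS = 100   # hypothetical heartbeat 'on' duration, not the firmware value
OFF_MS = 400  # hypothetical heartbeat 'off' duration, not the firmware value


@dataclass
class Heartbeat:
    beating: bool = False
    on_time: int = 0
    off_time: int = 0
    trace: list = field(default_factory=list)  # order of side effects

    def set_led(self, on):
        # Stand-in for the RGB LED driver call.
        self.trace.append("led_on" if on else "led_off")

    def store(self, name, value):
        # Record every write to a timestamp field.
        setattr(self, name, value)
        self.trace.append("store_" + name)


def signal_old(hb, now):
    """Old ordering: timestamp stored on every call, before the LED switches."""
    if not hb.beating:
        hb.store("on_time", now)
        if hb.on_time - hb.off_time > OFF_MS:
            hb.set_led(True)
            hb.beating = True
    else:
        hb.store("off_time", now)
        if hb.off_time - hb.on_time > ON_MS:
            hb.set_led(False)
            hb.beating = False


def signal_new(hb, now):
    """New ordering: timestamp stored only on a transition, after the LED switches."""
    if not hb.beating:
        if now - hb.off_time > OFF_MS:
            hb.set_led(True)
            hb.beating = True
            hb.store("on_time", now)
    else:
        if now - hb.on_time > ON_MS:
            hb.set_led(False)
            hb.beating = False
            hb.store("off_time", now)


def run(signal, ticks):
    """Call the heartbeat handler once per tick and return the side-effect trace."""
    hb = Heartbeat()
    for now in ticks:
        signal(hb, now)
    return hb.trace
```

Calling `run(signal_old, range(0, 1200, 50))` and `run(signal_new, range(0, 1200, 50))` gives the same sequence of LED events in this idealized, load-free setting, while the new variant writes a timestamp only once per transition. The sketch uses plain integers for time and ignores tick-counter wraparound, which real firmware would have to handle.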
Thanks for the contribution, this change will be part of an upcoming release.
The point at which the time for the next transition is recorded moves from the start of the respective block to its end, after the RGB LED has switched. The effect on heartbeat timing is minor. Without load, the heartbeat 'on' and 'off' times are identical for the old and new version.
Timing with load, new version. The first pulse at -200 ms is the 'on' command for the RGB LED; the scattered second pulses are the 'off' commands accumulated over 12 hours.
Timing with load, old version.