bsiever closed 4 years ago
Implemented and testing. Example data point so far: one "failure" caused over 7000 retries in a loop. I was expecting 1 retry waiting for the "carry", so this problem is a little deeper than anticipated (the difference could be due to unit conversions and/or multiple levels of counters rolling over).
I may add a fiber_sleep(1); in the loop rather than busy-polling, but it isn't clear this is worthwhile. If it's always about 7k iterations or under, that's not much time.
Update: 2 errors in ~50 minutes. Both were around 7000 loops before it was resolved.
Both happened on the device that was using the forever-loop to continuously poll for time (another device that only reads the time at the 2s updates hasn't had any problems in 50 min).
So far:
More data:
With the polling loop it occurred 5 times in 2:05 (125 minutes, or about once every 25 minutes). The "count" (number of calls in the loop) ranged from 7349 to 7357, a very tight range. So it seems like the "rollover" MAY have a bounded time.
Polling with a fiber_sleep(1) between checks seems to still "fail" about every 90 minutes and does 6-7 retries (i.e., maybe the overhead of fiber_sleep() in this specific code might be 1000x just a loop of polling the time???).
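As a quick sanity check on the "1000x" guess, the loop counts reported above can simply be divided. This is only arithmetic on the observed numbers, and it assumes the rollover window lasts about the same time in both setups; the constant and function names are made up for illustration.

```cpp
// Loop counts reported in this thread (observed data, not a model).
constexpr int kPollMin = 7349;   // busy polling, fewest iterations seen
constexpr int kPollMax = 7357;   // busy polling, most iterations seen
constexpr int kSleepMin = 6;     // with fiber_sleep(1), fewest retries seen
constexpr int kSleepMax = 7;     // with fiber_sleep(1), most retries seen

// Smallest and largest ratio of busy-poll iterations to sleeping retries.
// Assumes the rollover window has the same duration in both cases.
constexpr double ratioLow()  { return static_cast<double>(kPollMin) / kSleepMax; }
constexpr double ratioHigh() { return static_cast<double>(kPollMax) / kSleepMin; }
```

That gives a ratio somewhere between roughly 1050 and 1226, so one pass through the fiber_sleep(1) loop costing on the order of 1000 busy-poll iterations is at least consistent with the data.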
I've tested, reviewed, reworked... I think this is resolved, or at worst has a minor and very rare impact.
Update the cpu time function to return the correct time via a loop (rather than using the old time when there's a failure)
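A minimal sketch of the retry-in-a-loop idea, not the actual change: `RawRead`, `readTimeWithRetry`, and the two callbacks are hypothetical names, and how a "failed" read is detected is left to the caller. It retries until the raw read is consistent rather than returning the old time, and counts the retries the way the data points above were gathered.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical result of one raw clock read. `valid` is false while the
// underlying counters are mid-rollover and the value cannot be trusted.
struct RawRead {
    bool valid;
    uint64_t micros;
};

// Keep reading until the raw read is consistent, instead of falling back
// to the previously returned time. `pause` runs between attempts: pass a
// no-op for busy polling, or a callback that calls fiber_sleep(1) on device.
// If `retriesOut` is non-null it receives the number of failed attempts.
// Note: the loop is unbounded; it relies on the rollover clearing eventually.
uint64_t readTimeWithRetry(const std::function<RawRead()>& readRaw,
                           const std::function<void()>& pause,
                           uint32_t* retriesOut = nullptr) {
    uint32_t retries = 0;
    RawRead r = readRaw();
    while (!r.valid) {
        ++retries;
        pause();
        r = readRaw();
    }
    if (retriesOut != nullptr) {
        *retriesOut = retries;
    }
    return r.micros;
}
```

Passing the pause as a callback keeps the busy-poll and fiber_sleep(1) variants on the same code path, so their retry counts can be compared directly.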