candle-usb / candleLight_fw

gs_usb compatible firmware for candleLight, cantact and canable
Other
659 stars 291 forks source link

when 32 bit ms counter folded, led may be ON by mistake? how to avoid that? #77

Closed elanwu closed 2 years ago

elanwu commented 2 years ago

https://github.com/candle-usb/candleLight_fw/blob/e6b972441b767b8952d9cd8c78cd74a43be091a9/src/led.c#L94-L98

49.7 days, after that, folding will happens on an uint32_t counter, ticking every 1ms. 2 ** 32 / 1000 / 60 / 60 / 24 = 49.7 Days

Image this 2 cases:

assert( now == 0x0000'FFF3 );
off_until = now + 0x80;
assert( off_until == 0x0001'0073 );
assert( (off_until  < now) == false);

assert( now == 0xFFFF'FFF3 );
off_until = now + 0x80;  // folding happens
assert( off_until == 0x0000'007 );
assert( (off_until  < now) == true);

led may be ON by mistake when foding happed in case 2.

May this worry realy happens in chance? Any solution, pattern to solve this folding problem, other than extend 32 bit counter to an endless 64 bit one?

How about:

bool is_former_nearly_beyound_by_later(uint32_t former, uint32_t later);
led_set(led, is_former_near_beyound_by_later(led->off_until, now));   

"nearly" means diff is not more than half of UINT32_MAX, namely 24.85 days.

bool is_former_near_beyound_by_later(uint32_t former, uint32_t later)
{
    uint32_t diff = later - former;
    return (diff & 0x8000'0000) == 0 ? true : false;
}

We have cons now: 1) half duration range; 2) slower beyounding detection; however archived "zero mistake chance".

Some experimens shows that folding will not case error any more, after some folding resist way applyed: https://onlinegdb.com/D2ZBIRjp3

Cmake scripts, clear code in main loop, led ... candleLight shows an good study example, I learned a lot from this amazing project. Thanks!

fenugrec commented 2 years ago

Thanks. You are right, there is a bug there. I think a better way would be to cast the subtraction result to a signed type, if ((int32_t) (now - led->off_until) >= 0)

a bit similar to what you did but IMO a bit cleaner. but also need to guarantee that off_until is never set to more than INT32_MAX in the future (25 days), which should be straightforward to enforce.

daniel-santos commented 2 years ago

Yes, wrap around happens. This is a common occurrence in the Linux kernel and even must be considered for transient time values spanning years -- a Linux machine might be up for that long! In such cases, it's better to just think of the counter/timer value as meaningless until compared (via subtraction) to a target time/count and, as @fenugrec illustrated, treat them as signed values -- only worry about the difference, and you'll stay out of trouble.

Also, signed math is far fewer instructions. You just have to change the way you are thinking about your timers and time values.

fenugrec commented 2 years ago

yes. Will fix when I get a chance; leaving this open as a reminder

fenugrec commented 2 years ago

I cobbled up a fix (no time to test on hardware yet but compiles) in PR #78 , if anyone would like to review / test

fenugrec commented 2 years ago

Updated the fix in #78, and tested on hardware. Looks fine, but feedback would be appreciated.