joan2937 / lg

Linux C libraries and Python modules for manipulating GPIO
The Unlicense
57 stars 20 forks source link

Debouncing not working - comparing CLOCK_MONOTONIC with CLOCK_REALTIME #19

Open matthijskooijman opened 1 year ago

matthijskooijman commented 1 year ago

I previously submitted this at https://bugs.launchpad.net/ubuntu/+source/lg-gpio/+bug/1972985 because somehow I had forgotten this git repo exists back then (and http://abyz.me.uk/lg/index.html does not point to here at all). Since this is really an upstream bug, I'm going to resubmit it verbatim here.

The problem is that when I set a debounce time on an input pin (with a button attached), that debounce time is not actually applied. If I set a 0.5s timeout, input events are registered immediately (and a lot of them in succession if I press the button a lot of times), instead of only reporting changes when the pin has been stable for 0.5s.

To reproduce, I have used the attached program (monitor.c), which is exactly the "monitor GPIO" example from http://abyz.me.uk/lg/examples.html#C%20lgpio with the following line added:

        lgGpioSetDebounce(h, g, 500000);

This line sets a bounce time of 500,000 μs (or 0.5 second).

When I compile and run it, and press a button attached to gpio 7 a few times quickly, I get:

$ gcc -Wall -o monitor monitor.c -llgpio
$ ./monitor 7
chip=0 gpio=7
chip=0 gpio=7 level=1 time=116.935984428
chip=0 gpio=7 level=0 time=122.184483056
chip=0 gpio=7 level=1 time=122.311010973
chip=0 gpio=7 level=0 time=122.385752598
chip=0 gpio=7 level=1 time=122.495776931
chip=0 gpio=7 level=0 time=122.536279723
chip=0 gpio=7 level=1 time=122.634267723
^C

Note that the timestamps are a lot less than 500ms apart, which should not be possible if the debouncing was working.

I investigated more closely, using the most recent version (downloaded from http://abyz.me.uk/lg/lg.zip a few days ago, I think it's 0.2.0.0) with some debug output added. I found that:

I'm not sure if changing lguTimestamp() is the right fix here, since it might be used elsewhere that does need CLOCK_REALTIME (I haven't checked closely).

The gpio docs (https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/gpio.h#n273) say:

 * By default the @timestamp_ns is read from %CLOCK_MONOTONIC and is
 * intended to allow the accurate measurement of the time between events.
 * It does not provide the wall-clock time.
 *
 * If the %GPIO_V2_LINE_FLAG_EVENT_CLOCK_REALTIME flag is set then the
 * @timestamp_ns is read from %CLOCK_REALTIME.

So maybe another fix is to always set the GPIO_V2_LINE_FLAG_EVENT_CLOCK_REALTIME. It seems there is some commented out to set this flag (based on, I think, user-supplied flags) already: https://salsa.debian.org/python-team/packages/lg-gpio/-/blob/upstream/0.2.0.0/lgGpio.c#L244-245

Assuming that debouncing has been tested to work when this code was originally written, I wonder if the default gpio clock was changed at some point, the clock used by lgTimestamp was changed, or maybe this was originally tested on hardware where both clocks are identical? I would dig in a bit deeper, but lacking time, I'm going to just leave this here.

I've been testing this on an Orange Pi PC board, running a Ubuntu Jammy-based Armbian image.

joan2937 commented 1 year ago

Thanks for looking into this. I know I assumed I was using the same clock as the kernel. My assumption appears to be wrong.

joan2937 commented 1 year ago

This looks like a straightforward error which I should have noticed.

I have added an internal function in lgPthAlerts as follows.

uint64_t xMonotonicTimestamp(void)
{
   struct timespec xts;

   clock_gettime(CLOCK_MONOTONIC, &xts); // get current time

   return ((uint64_t)1E9 * xts.tv_sec) + xts.tv_nsec;
}

I will use this for xNowTimestamp to make the times consistent.

lguTimestamp will be unchanged. I remember now the assumption I made is that lguTimestamp would return a result with the same base as date +%s

joan2937 commented 1 year ago

This appears to be a change in the kernel gpiolib.c driver. Back in Feb 2020 the timestamp was changed from realtime to monotonic.

https://github.com/torvalds/linux/commit/f8850206e160bfe35de9ca2e726ab6d6b8cb77dd

I have no idea when that change was implemented in the Raspberry Pi default kernel which is my baseline.

Given that my documentation and examples all show 1970 epoch timestamps that was what the software was written for and tested against.

I am beginning to really dislike the gpiochip developers. Yes, monotonic makes more sense. The change should have been loudly publicised.

I am going to have to think some more about this. lg daemon users may be expecting a consistent timestamp from multiple machines.

matthijskooijman commented 8 months ago

I am beginning to really dislike the gpiochip developers. Yes, monotonic makes more sense. The change should have been loudly publicised.

Yeah, I guess they assumed that users would only be comparing timestamps between different GPIO events, not realizing that one might also want to compare with an external clock...

I had a quick look at your improved implementation, and IIUC you now always use timestamps in the gpiod clock (whatever it is), and then to get the current time in the same clock, you use the monotonic clock and adjust the last gpiod event timestamp with the time passed (on the monotonic clock) since that last gpiod event. This adds a bit of inaccuracy (since the event will take some time to reach lgpio), but the timestamp of such a debounced event is already delayed anyway, so that should be fine.

It does seem that the debouncing code becomes even more complex from this change, which is a pity, but well...

Looking at the commit tagging, it seems this change is available in version 0.2.2. I do not have an easy way to actually test this right now, unfortunately (Ubuntu still only has 0.2.0).

DanielDecker commented 3 months ago

I'm not sure if this issue is only c related. But for the python version 0.2.2 (lgpio) I can confirm that debounce works as described in the documentation. Tested on Raspberry Pi 3 running Debian 12 Bookworm (Raspberry Pi OS) kernel 6.6

warthog618 commented 2 months ago

I am beginning to really dislike the gpiochip developers. Yes, monotonic makes more sense. The change should have been loudly publicised.

And you don't even talk to us, much less know us. Perhaps reserve judgement until you do?

The change was discussed on the GPIO mailing list, which the commit you reference itself references. That is where the kernel GPIO changes are publicised. You, being a major Linux GPIO user, should keep an eye on that and pipe up if you have objections or suggestions. You didn't, and that is on you. OTOH, kernel policy is to never break ABI, but that did, so I agree that change should've been made opt-in.

You've switched over to using uAPI v2 now, right?

The clock returned in edge events was made configurable in uAPI v2 to support various clock sources, as there are some users that still want realtime. So take your pick.

Oh, and uAPI v2 supports debounce, either in hardware where supported or via software emulation in the kernel. You might want to switch to using that rather than debouncing in userspace. That may also solve #18 which may be caused by the kernel event buffer overflowing - if the kernel does the debouncing then only the debounced events hit the buffer. You can confirm if overflow has occurred by examining the event sequence numbers. When an overflow occurs the oldest events are discarded, so generally it wouldn't alter the outcome of a debounce in userspace, but if the button is being mashed and userspace is hung up on something you might lose a key event...