[Feature] Mitigate GPIO false triggering by adding trigger duration or double-tap options

arrgj commented 2 months ago

PROBLEM DESCRIPTION

A clear and concise description of what the problem is. Several users have reported issues with false GPIO triggering. For me, any large inductive loads (motors, fans, etc) generate a false trigger.

Failed Hardware workarounds: -Shortening wires -Adding a resistor -Trying a different model ESP32 -Powering off a UPS / PC USB / separate circuit in the building. -Disabling Wifi and using Ethernet only -Ferrite bead on the power supply

Successful hardware workarounds: -Powering off of a battery, or a laptop USB on battery

I can't use a battery long term, so adding a capacitor will be the last test -- although, online research found that didn't work for people in similar situations. Others couldn't find the hardware problem and speculated having bluetooth enabled may be the cause.

To avoid headaches for everyone, would it be possible to add a software fix? Either: A trigger duration option (e.g one second minimum), or a trigger on double-click option? Or maybe even enabling the ESP32 glitch filter ? Sorry if that's an big ask, I'm still learning about micro-controllers

That could fix the sensitivity issue for everyone. The magic pixies that live in my cables seem to just send out one pulse, and it doesn't last longer than a fraction of a second.

REQUESTED INFORMATION

Make sure your have performed every step and checked the applicable boxes before submitting your issue. Thank you!

[ ] Read the Contributing Guide and Policy and the Code of Conduct
[ ] Searched the problem in issues
[ ] Searched the problem in discussions
[ ] Searched the problem in the docs
[ ] Searched the problem in the chat
[ ] Device used (e.g., ESP32-S3): _____
[ ] Nuki Hub binary firmware version number used: _____
- [ ] Pre-compiled
- [ ] Self-compiled
[ ] Provide the output of http://nukihubIP/info:



### TO REPRODUCE
_Steps to reproduce the behavior:_

### EXPECTED BEHAVIOUR
_A clear and concise description of what you expected to happen._

### SCREENSHOTS
_If applicable, add screenshots to help explain your problem._

### ADDITIONAL CONTEXT
_Add any other context about the problem here._

**(Please, remember to close the issue when the problem has been addressed)**

technyon commented 1 month ago

Hi. Sorry for the late answer. What do you mean by double-click option?

arrgj commented 1 month ago

Rather than trigger on input, trigger on two consecutive inputs within 1s of eachother. Say if you put a button on that GPIO as the trigger, double click that button to unlatch for example. If it's a hall sensor on the GPIO, bring a magnet close twice in quick succession. Etc. False trigger glitches seem to manifest as a single blip lasting milliseconds. If a double-tap trigger isn't feasible, "unlatch (or whatever) on trigger duration of minimum 1 second" would be a great workaround too

labmaster commented 1 month ago

Sorry but for me as hardware guy this seems to be nothing that should be fixed by software anyway. If there are glitches effecting the GPIOs of a Circuit, than the input circuit before the GPIO has a problem itself.

Maybe you better should show us how you tried to connect things to the GPIO. For general Smarthome I/O purpose i would anyway highly sugest to use optocouplers on the input, so that over voltages and other ground systems could be easily hooked up.

Anyway if there is anything implemented in software it should be only an option and needs to fully removable for all that need a clean and not filtered input interface.

technyon commented 1 month ago

@labmaster I'm looking into it, but I agree. I can add some workarounds in software, but that's more like putting a band aid rather than fixing the problem itself.

arrgj commented 1 month ago

@labmaster Not a hardware expert, but the setup is dead simple.

A WT32-ETH01
An Adafruit 375 Magnetic contact switch (door sensor/reed switch)

Wiring: One end to Ground, the other to any standard GPIO, avoiding special pins. As short as possible. That's it.

Also tried: -External pull up resistors (1k-10k). No change. -Replacing the WT32-ETH01 with an m5stack Atom or ESP32 WROOM 32D). The Ethernet cable isn't the problem. -Powering off of a 220v-110v step up/step down transformer -Plugging into a surge-protected port on a UPS

Will try optos next, thanks. Any particular model to request from my local chip-monger?

Found another workaround: using the same GPIO as General Input with the internal Pull Up resistor -- plus Home Assistant triggering on a toggle of minimum of 1s duration. Ideally Home Assistant shouldn't need to be involved.

And sure, it's a software fix for a hardware issue. But it could save the next guy a silly amount of time trying to find it. I wasn't the first, and won't be the last.

labmaster commented 1 month ago

Just to make thing clearer.

You use a WT32-ETH01 as ESP32 Board for installing the "Nuki Hub" ? At the WT32-ETH01 you want to hook up a reed contact like the "Adafruit 375 contact" ?

Reed contacts are a different kind of beast. 1) as long as you run short wires between contact and Input pin only a PullupResistor is needed to get a valid electrical signal (at 3.3V i would go for about 4,7KOhm) 2) a electrical signal is not the full solution, as mechanical contact (as reed are), has some king of bouncing when they change state. Bouncing can be eleminated be external circuits e.g. by are RC combination of components (Resistor/Capacitor) and maybe followed by a high impendance driver (to keep the R value high and the C value small). But bounce filtering could even be made in Software and is (it it is done right) much better and easier than in software. The trick in Software it, not to user IRQ's who could give a very precise time measuremnt but may fire very often during a contact bounce and so wast time and resources. For such cases i would highly sugest to implemt the input sampling by cyclic pulling and making the sampling rate adustable (as some kind of filtering)

So in this case you a right with better implementing such things in software. (saves hardware and is much easier adjustable)

If you plan to have longer cable runs to your reed contac then i would additional go for optocoupling, this would make it possible to better adjust the voltage (the contact is working with) and the current is usualy higher (LED current of the optocoupler) As sideeffect the electronic are better protected from high voltage glitches that may be induced from external sources (parallel AC cable runs, Neon Light, ...)

technyon commented 1 month ago

Hi,

There's already a very simple check for deboucing: After triggering an action is triggered via GPIO, further input is ignored for I think a second or so. For Nuki Hub this is no problem, as the actions take longer than a second to execute anyway.

I think though the problem is a different one, it's spurious triggering of actions without the circuit being closed. Having a minimum time that the circuit is closed could mitigate this to some degree. I'm afraid though this could lead to a situation where it works most of the time, but in rare case it's triggered unintentionally anyway. Not a good thing to have, especially when it's about safety.

The ESP is indeed configured to use the internal pull-up resistor.

labmaster commented 1 month ago

"...it's spurious triggering of actions without the circuit being closed..."

That would mean the needed current is way to small and that would lead to some kind of addition input circuits in the usual easiest way using a defined pullup resistor. Internal Pullup resistors of MCUs are usually very high (>50k) and so only could draw a couple of microamps, this is ok for typical onboard signaling (OC style) but often much to high resistance for external signal to be safe against unwanted influences. In the typical Use-cases where i would see the Nuki Hub to be implemented (usually controlled be Access systems, Door Contacts, Buttons... ) i always would go the road with an optocoupler.

And yes, if there should be some signal conditioning in software, a single (one time) inputsample should not be enough to change input state.

technyon commented 1 month ago

@labmaster I'm more a software guy, but I've worked with quite a few hardware developers and know you approach things like this a bit differently. Do you have a recommendation how to implement this? My code is interrupt driven, and can be configured for rising flank, falling flank, or both. I'd really like to avoid polling, since NUKI Hubs main functionality is driving BLE, Networking and the logic in-between, and polling wastes CPU time dedicated to this.

labmaster commented 1 month ago

Its not about the polling itself but more about how often a ISR gets fired without being necessary. So if you use ISR and really deactivate this specific ISR for a defined time (lets say this second that you mentioned) then this would work similar as polling every second. What i want to say, it matters if you only ignore the results of the ISR for one second or if you DEACTIVATE the ISR for that time. A undefined mechanical contact bounce could easily fire hundreds of ISR within a ms.

Then for debounce with ISR one way to go would to check the state of the former ISR fired input after 1 second (so before you activate ISR again) to have still the same state (by polling once) and then to update a internal representation of the pin state. But his way you would have to mix ISR and polling anyway.

This is why i find it easier to check everything by polling. Lets say, you create a timer (could be ISR driven) that fires every 30ms (a good value for debouncing) and every 30ms you poll the input, then you could easily check for 2 (or more if configurable) consecutive states to be the same and only then update a internal representation of the pin state. This would be non blocking an sampling a couple of inputs and check/set some states should be quite quick.

If you really fixed to fully ISR sampling only, you would have to implement filters with getting the time distance between consecutive ISR into account. Thats possible but would cost much more time overhead, especially as a lot of ISR gets fired for nothing. (as i explained above)

With my limited English i hope you understand what i try to explain ?

Sorry to bother you with one of-topic thing but it seems you have contributed quite a part to it. Its about: https://github.com/I-Connect/NukiBleEsp32/issues/76#issuecomment-2380026507

Do you use the actual main branch of NukiBleEsp32 in your "Nuki Hub" ? If so i am completely lost and do not understand why this would work. Maybe you could bring some light on this problem ? And yes, i could ignore it now as i found a work around (see my last post there), but it would be nice to know then why it would work in your case and not in mine. Thank you very much.

technyon commented 1 month ago

Ok, I got your point, I'll think about it. One thing though, an ISR driven implementation would add overhead only if the state changes, which doesn't happen very often, it's not like lock actions are triggered every second or even minute. A polling driven solution constantly adds overhead.

What do you think about something simple: Trigger ISR on change and check for a minimum hold time from rising to falling flank?

labmaster commented 1 month ago

"Trigger ISR on change and check for a minimum hold time from rising to falling flank?"

This would work if ISR events are somehow predicable but on such external events you have to take care then about other things. So you can not disable ISR for a certain time to overcome the bounce time. Like what happens if you get ISR faster (bouncing) than what could be handled by the ISR hardware and even software system. You have to be save than, that the system works even when you loose ISR events. I not already forced it on a ESP32 but have seen hardware ISR lockups (race conditions on ISR controllers at other systems) when trying to handle very fast external events.

On the other side, even if cyclic polling adds a little bit of overhead, the overhead is always predicable at every time and so very save. As realtime reaction ist not needed, you could easily expand the polling cycle interval as much as possible, lets say to 300ms, that would mean a poll run 3 times second.

technyon commented 1 month ago

OK, thanks a lot for your input. I'll see to rewrite the code to a polling based solution.

technyon commented 4 weeks ago

@arrgj Please give this build a try:

https://github.com/technyon/nuki_hub/actions/runs/11313314314

As suggested, it's now changed to polling. The polling frequency is 100 ms, and only after a low state has been detected three times, the action will be executed, so 300 ms in total.

arrgj commented 4 weeks ago

@arrgj Please give this build a try:

'tis a thing of beauty! No false triggers!!!

technyon commented 3 weeks ago

I've merged this back to master. I'd close this for now, if this needs further fixes like a configurable hold-time we can reopen the issue.

technyon / nuki_hub

[Feature] Mitigate GPIO false triggering by adding trigger duration or double-tap options #471

PROBLEM DESCRIPTION

REQUESTED INFORMATION