Status not read on state change

StickyDigit commented 3 years ago

Sorry to do this to you so soon after closing one. Hitting either lock or unlock in my ssh config as previously used, I see the "Unknown status" image. The other end is replying with the correct strings, in fact nothing has changed there since it was behaving right. Hitting the same action again gets the correct status shown, as does hitting the background to force a refresh, or closing and reopening, whereupon 'status' is invoked. Hitting lock from an unlocked state gets me 'unknown', hitting unlock from there also gets me 'unknown'. Seems to be ignoring responses on 'do stuff' commands, but only the first time each way.

mwarning commented 3 years ago

I suspect that the event is triggered even before the network connection is set up correctly. That is why I added a delay here, but it is probably not enough. Maybe change the value there to 3000 (ms) to be able to verifiy that it is indeed the problem.

StickyDigit commented 3 years ago

Will do when I get near my computer. Probably an hour or so.

Should it not be waiting for remote to close the connection (or fail/timeout), in which case the reply string (if any) should be caught?

StickyDigit commented 3 years ago

Right.. I'm about to try incrementing that delay, though I ought to say that as far as I can tell this has nothing to do with network state changes. It happens when changing lock state even with network up and stable.

StickyDigit commented 3 years ago

As expected. That delay is only affecting the time before it checks status after WiFi reconnects.

In case I wasn't quite clear, here's what's happening.

1: WiFi is connected. At start, status is more often than not discovered correctly, (on rare occasions it'll be 'unknown' after startup) 2: State at start is "UNLOCKED". The correct image is shown. 3: Pressing "lock", "Unkown" image is shown. At the server, all is well, and it has sent back "LOCKED". 4: Pressing "lock" again, or pressing the background a couple of times sorts it out, as does leaving and restarting. 5: Pressing unlock does the same. Shows "Unknown" image. unless further action is taken.

It appears that the app is not noticing the reply from the server when changing lock state as it used to.

I expect the same would happen on the 'ring', and if you like I'll modify the server script to accommodate that for tests.

StickyDigit commented 3 years ago

Watching both ends more closely, it seems that Trigger makes up it's mind what the status is long before it is returned. In my script, the lock and unlock are actually stopping and starting 'motion' using systemctl, which on a pi zero can take a couple of seconds. Getting status is also done via systemctl status motion, and has a delay too, but that is rarely if ever tripping up trigger.

I have set up a test version of the scripts that simply writes state to a file, avoiding the dealys introduced by actually doing anything, and it all works as expected, suggesting Trigger is timing out waiting for a reply rather than waiting for the other end to close.

Inserting a sleep 2 at the top of my script causes the issue to manifest, and also causes the 'status' call to show 'unknown', so the occasional 'unknown' from status up to now was likely timing sensitive too.

StickyDigit commented 3 years ago

Here's the hacked-up test script. I've commented out the 'sleep 2' at the top, and in this state it should play nice, Uncomment the sleep and you should see the issue. Sorry it's not pretty.. it's hacked up from the working script.

#!/bin/bash

#sleep 2

arg="$1"
[ -z "$arg" ] && arg="$SSH_ORIGINAL_COMMAND"

debug="1"

[ ! -z "$debug" ] && {
    now=$(date +%Y%m%d_%H%M%S.%N)
    logfile=/tmp/motionconlog
    echo "$now : invoked with '$arg'" >> "$logfile"
}

emit(){
    [ ! -z "$debug" ] && echo "$now : $*" >> "$logfile"
    echo "$*"
}

case $arg in
 start).
    echo "LOCKED" > /tmp/motioncon-state && e="LOCKED" || e="LOCKFAIL"
 ;;
 stop)
    echo "UNLOCKED" > /tmp/motioncon-state && e="UNLOCKED" || e="UNLOCKFAIL"
 ;;
 status)
    x="$(cat /tmp/motioncon-state | sed 's/[^A-Z]//g')" && {
      case $x in
        LOCKED|UNLOCKED) e="$x" ;;
        *) e="ODDNESS1" ;;
      esac
    } || e="ODDNESS2"
 ;;
 *) e="Invalid arg" ;;
esac
emit $e

mwarning commented 3 years ago

As expected. That delay is only affecting the time before it checks status after WiFi reconnects.

That is the case. ~~You could try to set both times here to 0.~~

~~If that does not help, then I have to check for more options.~~

The cause is that the read timeout is only 500ms. Please try to set it to maybe 5000ms here.

mwarning commented 3 years ago

I have changed it to 5 seconds already.

StickyDigit commented 3 years ago

Making them ten times longer didn't get it. Setting both to zero also fails when the action at the server takes too long. I found that setting the sleep in the script to more than 0.2 caused it to get skittish. Anything more than 0.3 would be a consistent fail.

StickyDigit commented 3 years ago

Ah.. right. Will try what you just did while the cables are warm :)

StickyDigit commented 3 years ago

Yeah.. that gets it for a delay of 4 seconds. Fails at 5 of course. Can I suggest that if this is the fix then a config option to set timeout per lock could be added to the wish-list with a default of 5.

StickyDigit commented 3 years ago

('cos I for one use ssh tunnels over slow links sometimes)

mwarning commented 3 years ago

Yeah, I will make it an option.

mwarning commented 3 years ago

It is implemented. Let me know if it works for you, then I can create a release.

StickyDigit commented 3 years ago

Yes. That's golden!

Thank you. Closing.

mwarning / trigger

Status not read on state change #57