nickandrew / LS30

Control software for LS-30 alarm system

Protocol errors #2

Closed nickandrew closed 8 years ago

nickandrew commented 14 years ago

The LS-30 protocol does not ensure that a response matches up with a request. I tried a query-all across a link with long (and varying) latency and I found that the client can give up on a response for an earlier query (A), and send later queries (B,C,D). When it receives the first response (for A) it may be waiting for a response for B, and so get out of step.

Part of the problem is on the client end - the client treats the request/response pair as a unit of work and it has to have some kind of timeout on reading the response. I could make the timeout bigger - but, how big? 60 seconds? 120 seconds?

I could also make the processing of responses asynchronous with the sending of requests. In other words, the client could send all of (A,B,C,D) without waiting for any response, and later, when the responses come in, match them to the requests in the order the requests were sent. This also handles a missed response: if we send (A,B,C,D) and receive responses for only (A,C,D), then the response for C is not matched to request B. Instead, B is marked as "no response" and the response is correctly matched to C.
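As a rough illustration of the matching idea above, here is a minimal Python sketch (not the project's code). It assumes responses arrive in the same order as their requests, and that there is some way to tell whether a response belongs to a request. The `matches` predicate and the function name `match_responses` are hypothetical.

```python
from collections import deque

def match_responses(requests, responses, matches):
    """Pair each response with the earliest outstanding request it matches.

    `matches(request, response)` is a hypothetical predicate; the real
    protocol would need some way to recognise which request a response
    answers. Requests that get skipped over are recorded as None
    ("no response").
    """
    pending = deque(requests)
    results = {}
    for resp in responses:
        skipped = []
        # Skip requests at the head of the queue that this response cannot answer.
        while pending and not matches(pending[0], resp):
            skipped.append(pending.popleft())
        if pending:
            results[pending.popleft()] = resp
            for req in skipped:
                results[req] = None  # overtaken by a later response
        else:
            # The response matched nothing: put the skipped requests back.
            pending.extendleft(reversed(skipped))
    for req in pending:
        results[req] = None  # never answered
    return results

# Example from the comment: send (A,B,C,D), receive responses for (A,C,D).
same_letter = lambda req, resp: req.lower() == resp
print(match_responses(["A", "B", "C", "D"], ["a", "c", "d"], same_letter))
```

With the example input, B ends up marked as having no response while A, C and D are paired with their own responses.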

nickandrew commented 9 years ago

I'm working on this. Now, LS30::Commander queues commands and serialises them. It won't let command B be sent until it has received a response for command A.
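The serialising behaviour described here could look roughly like the following Python sketch. This is an illustration only, not the LS30::Commander implementation; the class name `SerialisingQueue`, its methods, and the `send` callback are all assumptions.

```python
from collections import deque

class SerialisingQueue:
    """Queue commands and keep at most one in flight at a time."""

    def __init__(self, send):
        self._send = send        # hypothetical callable that writes one command
        self._waiting = deque()  # commands not yet sent
        self.in_flight = None    # command currently awaiting a response

    def submit(self, command):
        self._waiting.append(command)
        self._pump()

    def on_response(self, response):
        """Pair the response with the in-flight command, then send the next."""
        finished = self.in_flight
        self.in_flight = None
        self._pump()
        return finished, response

    def on_timeout(self):
        """Give up on the in-flight command and move on to the next one."""
        abandoned = self.in_flight
        self.in_flight = None
        self._pump()
        return abandoned

    def _pump(self):
        # Send the next command only when nothing is outstanding.
        if self.in_flight is None and self._waiting:
            self.in_flight = self._waiting.popleft()
            self._send(self.in_flight)

sent = []
queue = SerialisingQueue(sent.append)
queue.submit("A")
queue.submit("B")          # B is held back until A is answered or times out
queue.on_response("resp-A")
```

Note that a response arriving after `on_timeout` has already released the next command would still be attributed to the wrong command in this sketch, so the timeout needs to be generous enough for the link.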

The other necessary change (I'll add another issue for it) is to ensure that the multiplexing daemon sends responses only to the client that sent the request.
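One possible way for a multiplexing daemon to do this, sketched in Python under the assumption that the device answers requests one at a time and in order. The class `ResponseRouter` and its method names are hypothetical and not taken from the project.

```python
from collections import deque

class ResponseRouter:
    """Remember which client sent each forwarded request, in order."""

    def __init__(self):
        self._owners = deque()

    def request_forwarded(self, client_id):
        # Call this when the daemon passes a client's request to the device.
        self._owners.append(client_id)

    def owner_of_next_response(self):
        # Call this when a response arrives from the device. Returns the
        # client that should receive it, or None if no request is outstanding.
        if not self._owners:
            return None
        return self._owners.popleft()

router = ResponseRouter()
router.request_forwarded("client-1")
router.request_forwarded("client-2")
first = router.owner_of_next_response()   # "client-1"
```

What to do when `owner_of_next_response` returns None (for example, a message the device sends without being asked) is left open here.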

Most commands should have a 5-second response timeout (5 seconds primarily for network latency; it could be as little as 1 second otherwise). But the learn commands I think need a much longer timeout, like 60 seconds. This should be automatically chosen for those commands. Also, I'm unsure if any commands exist which don't elicit a response.
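Choosing the timeout automatically could be as simple as a lookup, as in this Python sketch. The 5 and 60 second values come from the comment above; the command names in `LEARN_COMMANDS` are made up for illustration and are not the real LS-30 command names.

```python
DEFAULT_TIMEOUT = 5.0   # seconds; mostly an allowance for network latency
LEARN_TIMEOUT = 60.0    # seconds; learn commands need much longer

# Hypothetical command names, for illustration only.
LEARN_COMMANDS = frozenset({"learn_sensor", "learn_remote"})

def response_timeout(command):
    """Return the response timeout, in seconds, for the given command name."""
    return LEARN_TIMEOUT if command in LEARN_COMMANDS else DEFAULT_TIMEOUT

print(response_timeout("learn_sensor"))  # 60.0
print(response_timeout("query_date"))    # 5.0
```

If some commands turn out not to elicit a response at all, they would need a separate case so that nothing waits on them.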

nickandrew commented 8 years ago

Should be done.