mwasilak / txThings

CoAP library for Twisted
94 stars 43 forks source link

Handling of non-piggy-back replies #18

Closed sjlongland closed 6 years ago

sjlongland commented 6 years ago

Hi,

I'm trying to integrate txThings into a Tornado application that I'm working on in order to communicate with some devices running OpenThread. For some requests, the devices need to asynchronously communicate with a Modbus energy meter, and so they send an empty ACK to say, "Got that, please wait"… go communicate with the meter, then send the final ACK when they get the reply.

This works fine with Copper on Firefox, but I note I'm not seeing the data in txThings. Instead it seems to complain about a duplicate ACK and drop the message:

Sep  8 16:44:32 wsg-74fe481fe117 docker_wideskyhub_drv_1/16faae79efb5[574]: 2017-09-08 06:44:32,541 vrt_mmda_wideskyhub.device.twisted[_legacy.py: 154] INFO Sending message to fd7e:7b8c:6f0b:b77e::1000:1:5683
Sep  8 16:44:32 wsg-74fe481fe117 docker_wideskyhub_drv_1/16faae79efb5[574]: 2017-09-08 06:44:32,543 vrt_mmda_wideskyhub.device.twisted[_legacy.py: 154] INFO Exchange added, Message ID: 28007.
Sep  8 16:44:32 wsg-74fe481fe117 docker_wideskyhub_drv_1/16faae79efb5[574]: 2017-09-08 06:44:32,544 vrt_mmda_wideskyhub.device.twisted[_legacy.py: 154] INFO Message 'D\x02mg\x00\x00(l\xb0\x06modbus\xffp2,100,550,2' sent successfully
Sep  8 16:44:32 wsg-74fe481fe117 docker_wideskyhub_drv_1/16faae79efb5[574]: 2017-09-08 06:44:32,547 vrt_mmda_wideskyhub.device.twisted[_legacy.py: 154] INFO Sending request - Token: 0000286c, Host: fd7e:7b8c:6f0b:b77e::1000:1, Port: 5683
Sep  8 16:44:32 wsg-74fe481fe117 docker_wideskyhub_drv_1/16faae79efb5[574]: 2017-09-08 06:44:32,592 vrt_mmda_wideskyhub.device.twisted[_legacy.py: 154] INFO Received 'd\x00mg\x00\x00(l' from fd7e:7b8c:6f0b:b77e::1000:1:5683
Sep  8 16:44:32 wsg-74fe481fe117 docker_wideskyhub_drv_1/16faae79efb5[574]: 2017-09-08 06:44:32,595 vrt_mmda_wideskyhub.device.twisted[_legacy.py: 154] INFO Incoming Message ID: 28007
Sep  8 16:44:32 wsg-74fe481fe117 docker_wideskyhub_drv_1/16faae79efb5[574]: 2017-09-08 06:44:32,597 vrt_mmda_wideskyhub.device.twisted[_legacy.py: 154] INFO New unique ACK or RST message received
Sep  8 16:44:32 wsg-74fe481fe117 docker_wideskyhub_drv_1/16faae79efb5[574]: 2017-09-08 06:44:32,599 vrt_mmda_wideskyhub.device.twisted[_legacy.py: 154] INFO Exchange removed, Message ID: 28007.
Sep  8 16:44:34 wsg-74fe481fe117 docker_wideskyhub_drv_1/16faae79efb5[574]: 2017-09-08 06:44:34,650 vrt_mmda_wideskyhub.device.twisted[_legacy.py: 154] INFO Received 'd\xa4mg\x00\x00(l\xc0\xfftimeout' from fd7e:7b8c:6f0b:b77e::1000:1:5683
Sep  8 16:44:34 wsg-74fe481fe117 docker_wideskyhub_drv_1/16faae79efb5[574]: 2017-09-08 06:44:34,652 vrt_mmda_wideskyhub.device.twisted[_legacy.py: 154] INFO Incoming Message ID: 28007
Sep  8 16:44:34 wsg-74fe481fe117 docker_wideskyhub_drv_1/16faae79efb5[574]: 2017-09-08 06:44:34,653 vrt_mmda_wideskyhub.device.twisted[_legacy.py: 154] INFO Duplicate ACK or RST received

The upshot of this is I'm unable to see what the final response was. My understanding is (and this could be wrong) that when responding to a message, you don't change the token or message ID… and so the devices round-trip this unchanged.

Could you clarify the behaviour that's expected of txThings in relation to non-piggy-backed replies?

Also somewhat related: how do I specify a timeout? I've been looking for API documentation but haven't stumbled on any, nor have I found any way to set a time out on my CoAP requests.

sjlongland commented 6 years ago

So it looks like it's a firmware bug:

4.4.  Message Correlation

   An Acknowledgement or Reset message is related to a Confirmable
   message or Non-confirmable message by means of a Message ID along
   with additional address information of the corresponding endpoint.
   The Message ID is a 16-bit unsigned integer that is generated by the
   sender of a Confirmable or Non-confirmable message and included in
   the CoAP header.  The Message ID MUST be echoed in the
   Acknowledgement or Reset message by the recipient.

   The same Message ID MUST NOT be reused (in communicating with the
   same endpoint) within the EXCHANGE_LIFETIME (Section 4.8.2).

   Implementation Note:  Several implementation strategies can be
      employed for generating Message IDs.  In the simplest case, a CoAP
      endpoint generates Message IDs by keeping a single Message ID
      variable, which is changed each time a new Confirmable or Non-
      confirmable message is sent, regardless of the destination address
      or port.  Endpoints dealing with large numbers of transactions
      could keep multiple Message ID variables, for example, per prefix
      or destination address.  (Note that some receiving endpoints may
      not be able to distinguish unicast and multicast packets addressed
      to it, so endpoints generating Message IDs need to make sure these
      do not overlap.)  It is strongly recommended that the initial
      value of the variable (e.g., on startup) be randomized, in order
      to make successful off-path attacks on the protocol less likely.

   For an Acknowledgement or Reset message to match a Confirmable or
   Non-confirmable message, the Message ID and source endpoint of the
   Acknowledgement or Reset message MUST match the Message ID and
   destination endpoint of the Confirmable or Non-confirmable message.

That's the issue I'm running afoul of. My query regarding timeouts on requests still stands though.

mwasilak commented 6 years ago

Hello, I think the behavior of txThings on using different Message ID is correct - if I understand properly, the problem is caused by the fact that separate response is sent using ACK. This doesn't seem correct - ACK should only be used after receiving incoming CON. This is not the case with separate response.

Regarding timeout - it is not built into the library, but you can add it in your application. When you make a request you get a deferred d object. Using Twisted you can make a call to reactor.callLater(delay, handler, d). Inside the handler you can either cancel the request d.cancel() or fail it d.errback() (depending on your need). I'm not sure how to use it with Tornado though.

sjlongland commented 6 years ago

On 13/09/17 06:17, Maciej Wasilak wrote:

I think the behavior of txThings on using different Message ID is correct - if I understand properly, the problem is caused by the fact that separate response is sent using ACK. This doesn't seem correct - ACK should only be used for response.

Yeah, I later realised that and have fixed it to send the reply as a CON message. Technically I should re-send this until I get an ACK, but right now I haven't done this.

Regarding timeout - it is not built into the library, but you can add it in your application. When you make a request you get a deferred |d| object. Using Twisted you can make a call to |reactor.callLater(delay, handler, d)|. Inside the |handler| you can either cancel the request |d.cancel()| or fail it |d.errback()| (depending on your need). I'm not sure how to use it with Tornado though.

Ahh okay, reactor.callLater would work here since Tornado implements the reactor's interface. The Tornado way of doing that would be ioloop.addTimeout(ioloop.time() + delay, d.cancel).

I'll give that a try when I get to the office. -- Stuart Longland (aka Redhatter, VK4MSL)

I haven't lost my mind... ...it's backed up on a tape somewhere.

sjlongland commented 6 years ago

Okay, so the ioloop.addTimeout trick works… and as an advantage I can work to a deadline rather than waiting a timeout, which is actually how my code functions (an operation has a deadline by which it must be completed).

There is one caveat on that… if the deadline is in the past, addTimeout pretty much fires immediately, and txthings seems to continue retransmitting the request message indefinitely until such time as it thinks it should give up.

The following is working code… basically I need to check the deadline first and abort before I try passing it to txthings:

    @coroutine
    def _submit_rq(self, request, deadline=None):
        try:
            yield self._rq_sem.acquire()
            if (deadline is not None) and (deadline < self._io_loop.time()):
                raise IOError('Deadline expired for request %s' % request)

            self._LOG.debug('Submitting request: %s', request)
            deferred = self._protocol.request(request)

            if deadline is not None:
                timeout = self._io_loop.add_timeout(deadline, deferred.cancel)

            # Apparently this should work:
            # http://www.tornadoweb.org/en/stable/releases/v4.1.0.html#tornado-platform-twisted
            res = yield deferred
            if deadline is not None:
                self._io_loop.remove_timeout(timeout)

            raise Return(res)
        finally:
            self._rq_sem.release()
            self._LOG.debug('Finished with request: %s', request)