Azure / azure-iot-protocol-gateway

Azure IoT protocol gateway enables protocol translation for Azure IoT Hub
Other
225 stars 151 forks source link

Cloud-to-device Subscription Match CreationTime Check #70

Open robichaud opened 8 years ago

robichaud commented 8 years ago

@nayato @mtuchkov

The current implementation for checking if a subscription exists before publishing a C2D message is too restrictive:

bool TryMatchSubscription(string topicName, DateTime messageTime, out QualityOfService qos)
        {
            bool found = false;
            qos = QualityOfService.AtMostOnce;
            IReadOnlyList<ISubscription> subscriptions = this.sessionState.Subscriptions;
            for (int i = 0; i < subscriptions.Count; i++)
            {
                ISubscription subscription = subscriptions[i];
                if ((!found || subscription.QualityOfService > qos)
                    && subscription.CreationTime < messageTime
                    && Util.CheckTopicFilterMatch(topicName, subscription.TopicFilter))
                {
                    found = true;
                    qos = subscription.QualityOfService;
                    if (qos >= this.maxSupportedQosToClient)
                    {
                        qos = this.maxSupportedQosToClient;
                        break;
                    }
                }
            }
            return found;
        }

Specifically, the check on whether the subscription was created before the message was:

 && subscription.CreationTime < messageTime

The problem is that the Protocol Gateway resides on a different server than the IoT Hub. Any variances in the time between the two servers could cause any initial messages, after the device connects and subscribes, to be created "before" the subscription. The subscription CreationTime is the time on the Protocol Gateway, but the messageTime is the EnqueuedTimeUtc from the Microsoft.Azure.Devices.Client.Message which has the time from the IoT Hub.

I am seeing this behavior fairly regularly on a Protocol Gateway hosted on an Azure Cloud Service and can replicate it fairly easy:

  1. Device connects to Protocol Gateway (CONNECT/CONNACK)
  2. Device subscribes to topic
  3. Device receives SUBACK
  4. Device sends message
  5. Message is consumed on the cloud
  6. Cloud sends response to device
  7. Device does not receive the first message because the PG thinks the message was created before the subscription existed.
  8. Once enough time has passed to make up for the time variance, the device can receive C2D messages.

Before I made any code changes, I wanted to start a discussion about this.

It's impractical to require the client to wait an unknown period of time, after receiving the SUBACK message, before publishing and receiving messages.

Should it be the gateway's responsibility to enforce a subscription/message time check? I understand this is a workaround for the fact that the IoT Hub enqueues messages and has a minimum default message TTL of an hour. Perhaps it should be the responsibility of the sending service to set the ExpiryTimeUtc to a much shorter period of time instead?

It doesn't seem like there is a straightforward solution to this.

mastermanu commented 8 years ago

yep, we're keenly aware of this issue and are actively trying to come up with a good solution for this. External ideas are also welcome.

robichaud commented 8 years ago

@mastermanu I think, at least for now, we should add a threshold property to help reduce this restriction on the Protocol Gateway side. I have added a pull request for this.