Open erryB opened 5 years ago
Hi @erryB,
thanks for the detailed overview of the proposed architecture. Reading through it, several questions occurred to me. I will focus on the telemetry direction for now.
WDYT?
Hi @sophokles73
Thanks for your feedback, I think it leads to a very interesting discussion. Here I try to give a quick answer to your points.
The main reason why we started with a single instance of IoT Hub was the number of tenants: if that number becomes really big, then the management could become complicated. But if this is not the case, we can go for one IoT Hub per tenant.
I think for cost efficiency, it would still be desirable to be able to use a single instance for multiple tenants, e.g. for offering free plans.
This could be also handled through another adapter, which would be the easy way in terms of implementation, but would require an additional component to be added to the architecture.
FMPOV we would handle this in the HonoClient component, which manages the connection to the AMQP Messaging Network and which is used by the adapters to forward messages downstream. I can imagine introducing a configuration property there which would allow us to select the Azure Addressing Scheme instead of the Hono Addressing Scheme for downstream messages.
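A minimal sketch of what such a switch could look like, written in Python for brevity. The `AddressingScheme` enum, the `downstream_address` function and the `:` separator in the IoT Hub device ID are assumptions made for illustration only; the Hono address follows the `telemetry/TENANT` pattern, and the Azure address follows the IoT Hub device-to-cloud path `/devices/{deviceId}/messages/events`.

```python
from enum import Enum


class AddressingScheme(Enum):
    """Hypothetical values for the proposed configuration property."""
    HONO = "hono"
    AZURE = "azure"


def downstream_address(scheme, endpoint, tenant_id, device_id):
    """Return the AMQP target address for a downstream message."""
    if scheme is AddressingScheme.HONO:
        # Hono addresses downstream messages per tenant, e.g. telemetry/TENANT.
        return f"{endpoint}/{tenant_id}"
    if scheme is AddressingScheme.AZURE:
        # IoT Hub device-to-cloud path; the "tenant:device" ID layout is an assumption.
        return f"/devices/{tenant_id}:{device_id}/messages/events"
    raise ValueError(f"unsupported addressing scheme: {scheme!r}")


print(downstream_address(AddressingScheme.HONO, "telemetry", "acme", "4711"))
print(downstream_address(AddressingScheme.AZURE, "telemetry", "acme", "4711"))
```

Keeping the choice in one function means the protocol adapters would not need to know which messaging network sits behind the client.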
If we go for one IoT Hub per Tenant, then we don’t need the Event Processor to perform the filtering action you mentioned
Obviously. However, if we do want to be able to share a single IoT Hub instance among multiple tenants, then this would be the responsibility of the Event Processor component, right?
If consumers connect directly with the custom endpoint and, of course, they cannot be modified to communicate directly with the backend, then I agree we need to create also an AMQP Endpoint component for protocol and path translation.
I am not sure that I understand which endpoint you are referring to with the custom endpoint. Is this the AMQP endpoint provided by IoT Hub?
Thanks again for your interesting input and feedback @sophokles73
We need to take into account that a single instance of IoT Hub shared for multiple tenants would be good for extensibility and horizontal scale, while having multiple IoT Hubs, one per tenant, would be good in terms of vertical scale but would also increase the complexity of the solution. We can consider it as an iterative approach, thus our proposal is to focus on the first iteration, demonstrating how to connect Eclipse Hono to a single IoT Hub supporting multi-tenancy, so that we have a simple solution available in the short term and might explore scalability issues afterwards.
In order to connect the device to IoT Hub we need to have a pluggable interface able to recognize what is the cloud service the device wants to connect to. To implement this, we should work on 2 main points: first of all, we need to create this interface and use dependency injection to get the correct information necessary to connect, and then we have to implement the concrete connection to IoT Hub. It makes sense to leverage the component you already use as interface, as you mentioned.
Regarding the Event Processor component and assuming we will implement the solution with a single instance of IoT Hub shared for multiple tenants, we need Event Processor to be able to filter the messages based on Tenant ID. Basically the Event Processor will be responsible for consuming messages from IoT Hub and pushing these messages towards the appropriate LOB Application. Since we need EP to push messages but we also want to keep compatibility with the standard behavior of LOB App, we also need to add a small component/extension between EP and LOB Application, in order to guarantee the same behavior as enMasse's AMQP Endpoint.
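The tenant-based filtering described above could be sketched roughly as follows. This is only an illustration in Python: it assumes each consumed message is a dict with a `device_id` field holding a composite ID of the form `tenant:device`, and the names `tenant_of` and `route_by_tenant` are hypothetical.

```python
from collections import defaultdict


def tenant_of(iot_hub_device_id, separator=":"):
    """Extract the tenant part of a composite 'tenant:device' ID (assumed layout)."""
    tenant_id, _, _ = iot_hub_device_id.partition(separator)
    return tenant_id


def route_by_tenant(messages):
    """Group consumed messages by tenant so each LOB application gets only its own."""
    routed = defaultdict(list)
    for message in messages:
        routed[tenant_of(message["device_id"])].append(message)
    return dict(routed)


batch = [
    {"device_id": "acme:4711", "body": b"21.5"},
    {"device_id": "globex:0001", "body": b"17.0"},
    {"device_id": "acme:4712", "body": b"22.1"},
]
print(sorted(route_by_tenant(batch)))
```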
What do you think?
We can consider it as an iterative approach, thus our proposal is to focus on the first iteration, demonstrating how to connect Eclipse Hono to a single IoT Hub supporting multi-tenancy, so that we have a simple solution available in the short term and might explore scalability issues afterwards.
Sounds good to me.
first of all, we need to create this interface and use dependency injection to get the correct information necessary to connect
We already use dependency injection for setting up the protocol adapters. We can use an additional configuration property to use the Azure specific client for using Azure IoT Hub as the AMQP 1.0 Messaging Network.
Basically the Event Processor will be responsible for consuming messages from IoT Hub and pushing these messages towards the appropriate LOB Application.
Understood.
Since we need EP to push messages but we also want to keep compatibility with the standard behavior of LOB App, we also need to add a small component/extension between EP and LOB Application, in order to guarantee the same behavior as enMasse's AMQP Endpoint.
Indeed, the LOB Applications connect to the AMQP Messaging Endpoint in order to consume messages, so either the EP itself (or the small component between EP and LOB app) will need to expose an AMQP 1.0 endpoint to which the apps can connect. However, I wonder why we would want to add an extra component there and not just let the EP expose that endpoint.
I do have a more generic question as well: is there a particular (compelling) reason why we should use Azure IoT Hub instead of the generic Event Hub? Based on the documentation I found online, it doesn't look like there is much difference when simply using them for downstream message forwarding, is there?
From a service provider perspective - offering Eclipse Hono as a managed cloud service - this proposal leads to two non-technical questions:
Eclipse Hono requires an AMQP 1.0 Messaging Network for exposing its remote service interfaces to business applications - no more and no less. Using Azure IoT Hub as such a messaging network feels somewhat oversized due to the additional features it provides. Some are going in the direction of other Eclipse IoT projects (e.g. Device Twins and Eclipse Ditto). The open question for me is whether there are any other features apart from the messaging aspect that Eclipse Hono could benefit from (given the scope this project has). If there are no other features, I would like to better understand why the existing messaging services cannot be used, so I have the same question as @sophokles73.
Second topic is pricing: Azure Event Hubs Dedicated has a fixed entry price of ~ $5,000 and seems to have no restrictions regarding the number of messages you can process (see https://azure.microsoft.com/en-us/pricing/details/event-hubs/). With Azure IoT Hub Standard Tier you get two S3 editions for the same price, limiting you to 600,000,000 messages of 4 KB (see https://azure.microsoft.com/en-us/pricing/details/iot-hub/). For more you have to pay more. So looking at the price, the first question becomes even more important, as Azure Event Hubs seems to be the cheaper choice. Hint: Pricing is complex nowadays in the cloud, so there is probably a need to look into some concrete scenarios.
I'm looking forward to your feedback regarding these questions.
@sophokles73 you are correct, the Event Processor component could directly expose the AMQP 1.0 endpoint, but in that case it would be necessary to change the endpoint for each different platform (e.g. not all platforms will have or require an Event Processor Host). Having two different components allows us to isolate the endpoint as an extension point and keep the Event Processor component simple, focused and efficient. These are the reasons behind our proposal; however, we can still decide to develop a single component tightly coupled to the Azure platform.
@sophokles73 @mhemmeter Regarding the differences between the IoT Hub and the Event Hub, I'll try to answer you both. Of course, there are many differences between the two of them, some more compelling than others in the scenario we are looking at here. The reason why we decided to leverage the IoT Hub was tied to the support for Direct Methods and Commands for the outbound connectivity. The other beneficial part is the device level authentication, but as mentioned an Event Hub could be used here, and in fact it was our first thought. Nevertheless, the command path was more appealing: without it, there is a need to introduce other persistence points such as queues (e.g. Service Bus). Introducing extra components, as you point out, means that there are considerations that have to be made, such as overall availability: the more components, the higher the risk of a single component failing and reducing your availability expectations. The other element here is that we must consider all of the other factors around service limits and quotas to understand them in their entirety.
@mhemmeter The discussion about pricing can be complicated. As you point out, there are different limits and pricing implications for each and every choice made. In the end, with this architecture either Event Hubs or IoT Hub will require an Event Processor host to consume messages. The decision to use one or the other is tied to the requirement to manage and host scalability in Hono or offload it to Azure on the command side. A platform operator, a service provider and a solution builder will all have different preferences when it comes to pricing. Part of this gets into the operational vs. development vs. support costs. I agree with you that as a service provider the dedicated Event Hub option may be more attractive, but I guess we also need to take into account smaller solutions where it’s not possible to operate at 5K cost for a single part of the architecture.
Hi @erryB, thanks for your feedback. Let me first focus on the differences between IoT Hub and Event Hubs (second paragraph above).
I understand that the main driver for the proposal of using IoT Hub is the command path (outbound connectivity) and I fully agree to your argument that we should try to limit the number of dependencies to a minimum. However my understanding is that we need in this proposal for device to cloud communication IoT Hub and on top Service Bus for processing Events ("offline fallback"). So just looking at the dependencies we are talking about "Event Hubs plus Service Bus" versus "IoT Hub plus Service Bus", don't we? If this is true we anyway have Service Bus as a dependency and we could potentially use it also for command path.
Your second point of using device level authentication provided by IoT Hub is probably an interesting benefit. However this means that we need an implementation of the Credentials API in the system that makes use of this IoT Hub feature (see https://www.eclipse.org/hono/api/credentials-api/). Also we need an implementation of the mandatory operation of the Device Registry API (https://www.eclipse.org/hono/api/device-registration-api/, "Assert Device Registration") if we make active use of the IoT Hub identity registry. As you can see having the discussion in that direction leads to other questions compared to focusing on the messaging aspect we started with.
Coming back to pricing today: The good thing with Event Hubs is that not only the dedicated offering is available but also the basic and standard tiers. So if you are running Eclipse Hono using Event Hubs as the messaging network in a cost sensitive context, I think you can do so by choosing the basic or standard tier.
Thanks @mhemmeter You are correct, we do need a Service Bus Queue for the offline fallback. However, if we decide not to leverage IoT Hub, we need to add not only another Service Bus Queue, but also a connection bounding mechanism, because EH has a maximum of 5K connections that can be established. Besides that, we need to handle Cloud to Device messages because we couldn’t use Direct Methods to do so, even though the concept aligns pretty closely with Hono requirements. Security and Device Identity are other elements we need to take into account. I also agree with you that the Protocol Adapters need some changes in order to create the IoT Hub Device ID, but some changes would also be necessary to handle the Event Hubs partition key. Regarding the price, IoT Hub also has different tiers to choose from, according to your needs; here you can take a look at the official documentation. As I mentioned though, we actually considered Event Hubs and of course it’s still possible to choose it. I'm just trying to consider all the features and not just costs; I believe we should take into account complexity and dependencies to select the best option.
Hi @erryB,
going through the documentation of the Event Hub, I wonder, if it is possible to use multiple topics with the Event Hub as you can with Apache Kafka. We could then use two topics per tenant (one for telemetry and one for events) in order to implement the downstream direction for multiple tenants using a single Event Hub instance.
WDYT?
Hi @sophokles73
The comparison of Kafka to Event Hubs features is described in this page. Basically Event Hubs provides an endpoint which can be used by Kafka applications. In the overall evaluation, though, we also need to take into account some limitations which can become relevant in multi-tenant scenarios, like 100 Event Hubs Namespaces per subscription. If we want to focus on Event Hubs, I think that, in terms of optimization, a single instance of EH per tenant with different partitions for Telemetry and Events could be a better option. However, we also need to keep in mind that with Event Hubs we need a custom implementation of the Event Processor component and a separate mechanism to implement Command & Control messages.
Ok, I see. So each Event Hub instance within a Namespace corresponds to a Topic within a Kafka cluster. My understanding is that I can create multiple (how many?) Event Hubs per namespace and, as you indicated, a limited number of Namespaces per subscription. For the moment, I just want to make sure, that I understand all options that are on the table ...
Yes that's correct. You can create up to 10 Event Hubs per EH Namespace and up to 100 EH Namespaces per subscription. Here you can find all the details and other limits.
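As a rough back-of-the-envelope check of what these limits mean for multi-tenancy, the following sketch assumes the layout suggested earlier in this thread (one Event Hub for telemetry and one for events per tenant) and takes the two limits exactly as quoted above; the actual quotas may differ by tier and change over time.

```python
# Limits as quoted in this thread (may differ by tier or change over time).
NAMESPACES_PER_SUBSCRIPTION = 100
EVENT_HUBS_PER_NAMESPACE = 10

# Assumed layout: one Event Hub for telemetry and one for events per tenant.
EVENT_HUBS_PER_TENANT = 2

total_event_hubs = NAMESPACES_PER_SUBSCRIPTION * EVENT_HUBS_PER_NAMESPACE
tenants_per_namespace = EVENT_HUBS_PER_NAMESPACE // EVENT_HUBS_PER_TENANT
max_tenants = NAMESPACES_PER_SUBSCRIPTION * tenants_per_namespace

print(f"Event Hubs per subscription: {total_event_hubs}")   # 1000
print(f"Tenants per namespace:       {tenants_per_namespace}")  # 5
print(f"Tenants per subscription:    {max_tenants}")        # 500
```

Under these assumptions a single subscription would top out at 500 tenants, which is worth keeping in mind for offerings such as free plans.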
@erryB thanks for the info, Erica. This is very helpful for considering which approach could/will work for us.
AMQP Messaging Network proposal
The initial idea mentioned in the Issue 1120 was to leverage Azure Event Hubs, but in order to handle bidirectional communication and to avoid Partition Key complexity we believe that Azure IoT Hub is probably the best option.
Following Hono approach, we split the concept in two separate scenarios: Device to Cloud - Telemetry and Events and Cloud to Device - Command and Control.
In both cases, we want to underline that the proposed diagrams do not include any diagnostics/logging element at the moment. This is because the main goal of this post is to have a conversation about the main architecture, logging will be of course taken into account later during the implementation.
Any feedback about this architecture would be much appreciated, it would be very useful for us to understand if there is any concern we missed or any other interesting ideas to move forward.
Thanks!
Erica
Device to Cloud: Telemetry and Events
Here you can see a draft of the architecture for D2C data. We highlighted in blue the components we believe should be added to have the best experience with Azure.
Data coming from the Device goes through Protocol Adapters and reaches Azure IoT Hub. Here an important point we need to take into account is related to the IoT Hub Device ID, which should be composed by an aggregation of Tenant ID and Hono Device ID, in order to be unique and to allow the messages to be addressed properly.
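A minimal sketch of how such a composite ID could be built and split again. The `:` separator and both function names are assumptions for illustration; the 128-character limit and the allowed character set reflect our reading of the IoT Hub identity registry documentation and should be double-checked before relying on them.

```python
import re

SEPARATOR = ":"   # hypothetical choice; must never appear in a tenant ID
MAX_LENGTH = 128  # assumed IoT Hub device ID length limit
# Assumed allowed characters: ASCII alphanumerics plus - . + % _ # * ? ! ( ) , : = @ $ '
ALLOWED = re.compile(r"[A-Za-z0-9\-.+%_#*?!(),:=@$']+")


def to_iot_hub_device_id(tenant_id, device_id):
    """Combine a Hono tenant ID and device ID into one IoT Hub device ID."""
    if not tenant_id or not device_id:
        raise ValueError("tenant_id and device_id must be non-empty")
    if SEPARATOR in tenant_id:
        raise ValueError("tenant_id must not contain the separator")
    composite = f"{tenant_id}{SEPARATOR}{device_id}"
    if len(composite) > MAX_LENGTH or not ALLOWED.fullmatch(composite):
        raise ValueError(f"not a valid IoT Hub device ID: {composite!r}")
    return composite


def from_iot_hub_device_id(composite):
    """Split an IoT Hub device ID back into (tenant_id, device_id)."""
    tenant_id, separator, device_id = composite.partition(SEPARATOR)
    if not separator or not tenant_id or not device_id:
        raise ValueError(f"not a composite device ID: {composite!r}")
    return tenant_id, device_id


print(to_iot_hub_device_id("acme", "4711"))
print(from_iot_hub_device_id("acme:4711"))
```

Splitting on the first separator only means the Hono device ID itself may still contain `:`, while the tenant ID may not.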
IoT Hub uses partitions behind the scenes. The number of partitions has a direct impact on logical ordering, the number of readers and, of course, overall performance. It's very important to remember that this number can be selected only at creation time, and it's not possible to change it afterwards. Using IoT Hub allows us to avoid defining a partition key, and the construction of our IoT Hub Device ID guarantees that all the messages for a specific device will be in the same partition and will be processed in logical order.
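The ordering guarantee can be illustrated with a small sketch. The hash below is only a stand-in (IoT Hub's internal partitioning function is not something we rely on here), and `partition_for` is a hypothetical name; the point is simply that a deterministic function of the device ID always yields the same partition for a fixed partition count.

```python
import hashlib


def partition_for(device_id, partition_count):
    """Map a device ID to a partition index (illustrative hash, not IoT Hub's own)."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partition_count


# Every message from the same device lands in the same partition,
# so per-device ordering is preserved.
first = partition_for("acme:4711", 4)
second = partition_for("acme:4711", 4)
print(first == second)
```

Because the result depends on the partition count, changing that count would reshuffle devices across partitions, which is consistent with it being fixed at creation time.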
The messages are then available on the Cloud in different IoT Hub Consumer Groups. In order to handle them properly and forward them to the correct LOB Application Consumer, we need to introduce a component here called Event Processor. The current idea in terms of implementation is to use Kubernetes StatefulSets and we need to guarantee that there is only 1 instance running to process a specific partition.
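One way to get the "only one instance per partition" guarantee with StatefulSets is to derive the partition assignment from the pod's stable ordinal. The sketch below assumes pod names of the form `<statefulset-name>-<ordinal>`, which is how Kubernetes names StatefulSet pods; the function names and the round-robin assignment are illustrative choices, not part of the proposal.

```python
import re


def ordinal_from_pod_name(pod_name):
    """Extract the StatefulSet ordinal from a pod name such as 'event-processor-2'."""
    match = re.search(r"-(\d+)$", pod_name)
    if match is None:
        raise ValueError(f"not a StatefulSet pod name: {pod_name!r}")
    return int(match.group(1))


def partitions_for_pod(pod_name, replicas, partition_count):
    """Return the partitions owned by this pod; each partition has exactly one owner."""
    ordinal = ordinal_from_pod_name(pod_name)
    if ordinal >= replicas:
        raise ValueError("ordinal is outside the configured replica count")
    return [p for p in range(partition_count) if p % replicas == ordinal]


# With 2 replicas and 4 partitions, pod 0 reads partitions 0 and 2, pod 1 reads 1 and 3.
print(partitions_for_pod("event-processor-0", replicas=2, partition_count=4))
print(partitions_for_pod("event-processor-1", replicas=2, partition_count=4))
```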
The main goals of this component are the following:
Cloud to Device - Command and Control
As you can see in the image below, the architecture is simpler because we do not need to provide offline support.
In this case the only component we need to leverage is Azure IoT Hub. The idea is to use Direct Methods, identifying the proper device using the IoT Hub Device ID, which is an aggregation of Tenant ID and Hono Device ID, as mentioned before. Here it's very important to define the payload of the messages to be sent to the devices, in order to understand how Direct Methods can actually be leveraged. For instance, it would be interesting to use some properties for diagnostic purposes.
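To make the payload discussion more concrete, here is a sketch of what a Direct Method request body could look like. The top-level fields `methodName`, `responseTimeoutInSeconds` and `payload` follow the IoT Hub direct method invocation format as we understand it; everything inside `payload`, including the `diagnostics` block and its `correlationId`, is a hypothetical layout and not an agreed format.

```python
import json
import uuid


def build_direct_method_request(method_name, command_payload,
                                response_timeout_s=30, correlation_id=None):
    """Build a request body for invoking a Direct Method on a device."""
    return {
        "methodName": method_name,
        "responseTimeoutInSeconds": response_timeout_s,
        "payload": {
            # Hypothetical layout: the actual Hono command goes here.
            "command": command_payload,
            # Hypothetical diagnostic properties travelling with the command.
            "diagnostics": {
                "correlationId": correlation_id or str(uuid.uuid4()),
            },
        },
    }


request = build_direct_method_request("setBrightness", {"value": 5})
print(json.dumps(request, indent=2))
```

The request would be addressed to the composite IoT Hub Device ID built from Tenant ID and Hono Device ID.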
Please let us know your thoughts and feedback