Closed BennyM closed 8 years ago
@BennyM Thanks for reporting and using our service. One reason I know this can happen is that your connection is being throttled. These errors are usually retried, i.e. the spout task will re-establish the connection if it's dropped. Did you see that happen?
The error message should certainly be improved to make that clear (if that's the case). I know that it is much clearer in the C# client for EventHubs.
http://azure.microsoft.com/en-us/pricing/details/event-hubs/ Check out the FAQ section on that page for how the throttling is enforced.
EventHubs has a concept of throughput units, which you can find under the "scale" tab of your namespace in the Azure Portal. Each unit means 1 MB/s ingress and 2 MB/s egress.
This bandwidth is shared across your entire namespace, not allotted to a single EventHub. So if you have multiple services underneath this namespace, they will share this throughput limit.
You can increase this up to 20 throughput units in the Azure Portal, which gives you 20x the throughput across your namespace. In a namespace with a single EventHub of 8 partitions, one should be able to get roughly 2.5 MB/s ingress and 5 MB/s egress per partition. A single partition cannot go beyond 5 MB/s.
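As a rough sanity check on the numbers above, here is a minimal Python sketch of the arithmetic. It assumes the per-unit figures quoted in this thread (1 MB/s ingress, 2 MB/s egress) and that load is spread evenly across partitions, which real traffic may not do. The function name is made up for illustration and is not part of any Azure SDK.

```python
# Per-unit figures as quoted in this thread (assumed, not read from Azure).
INGRESS_MB_PER_UNIT = 1.0
EGRESS_MB_PER_UNIT = 2.0


def per_partition_throughput(throughput_units, partitions):
    """Return (ingress, egress) in MB/s per partition, assuming an even spread."""
    if throughput_units <= 0 or partitions <= 0:
        raise ValueError("throughput_units and partitions must be positive")
    ingress = throughput_units * INGRESS_MB_PER_UNIT / partitions
    egress = throughput_units * EGRESS_MB_PER_UNIT / partitions
    return ingress, egress


# 20 units shared by a single EventHub with 8 partitions.
print(per_partition_throughput(20, 8))  # (2.5, 5.0)
```

If several EventHubs live in the same namespace, the units are shared between them, so the per-partition figure would be lower than this estimate.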
As the number of partitions cannot be changed once an EventHub is created, it's best to choose the partitions based on the categorization of your data. Scaling should be handled via the throughput units, increasing them as your application grows.
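To get a feel for how many throughput units a workload might need, a back-of-the-envelope sketch like the following can help. It again assumes the 1 MB/s ingress and 2 MB/s egress per unit mentioned in this thread, and that the rates you pass in are totals for the whole namespace. The function name is hypothetical.

```python
import math

# Per-unit figures as quoted in this thread (assumed, not read from Azure).
INGRESS_MB_PER_UNIT = 1.0
EGRESS_MB_PER_UNIT = 2.0


def required_throughput_units(ingress_mb_s, egress_mb_s):
    """Estimate the units needed to cover both ingress and egress for a namespace."""
    if ingress_mb_s < 0 or egress_mb_s < 0:
        raise ValueError("rates must be non-negative")
    for_ingress = math.ceil(ingress_mb_s / INGRESS_MB_PER_UNIT)
    for_egress = math.ceil(egress_mb_s / EGRESS_MB_PER_UNIT)
    # Whichever direction needs more units decides; never go below one unit.
    return max(for_ingress, for_egress, 1)


# 3.5 MB/s in and 10 MB/s out across the namespace.
print(required_throughput_units(3.5, 10))  # 5
```

This is only an estimate of steady-state needs; bursts and uneven partition load could still lead to throttling at the computed value.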
If you have higher needs than usual, you can contact Azure support to have the number of partitions and the throughput (in blocks of 20) increased to larger numbers, like 128 partitions and 100 units.
Should you be interested, EventHubs can also provide throughput up to 1 GB/s through enterprise contracts.
Hope this helps. Let me know if you have follow-up questions on this. I will also bring this to the attention of the EventHubs team and create a wiki page about it.
You should start by taking a look at your EventHubs dashboard in the Azure Portal. It should give you a statistical idea of how your EventHubs have been doing over the past hour.
@BennyM seems to have solved this problem himself after he opened the issue. Take a look at this blog post by @BennyM, which talks more about the throughput units of EventHubs: http://blog.bennymichielsen.be/2015/08/11/scaling-an-azure-event-hub-throughput-units/
I would also suggest that visitors read about EventHubs performance in this blog post by @shanyu: http://blogs.msdn.com/b/shanyu/archive/2015/05/14/performance-tuning-for-hdinsight-storm-and-microsoft-azure-eventhubs.aspx
I am leaving this issue open for now in case something can be improved in this regard. Among the several options:
If there are issues with connections, we recommend using the newer storm-eventhubs from the hdinsight repository for now. These changes will make it to Apache Storm as well.
A new client for azure-event-hubs is also available now, which should hopefully handle connectivity issues better. For now, I have opened this issue in storm-eventhubs to track that request: https://github.com/hdinsight/storm-eventhubs/issues/11
Lately I've been having a lot of issues running topologies. Two errors occur quite often, giving no indication as to what's wrong.