Optimize the extension to prevent partition rebalancing from occurring.

Azure / azure-functions-eventhubs-extension

Event Hubs extension for Azure Functions

MIT License


Optimize the extension to prevent partition rebalancing from occurring. #67

Closed soninaren closed 2 years ago

soninaren commented 4 years ago

As per the documentation on storage partitioning, the partition server on which a blob is processed is determined by the partition key generated for the blob.

“Each blob has a partition key comprised of the full blob name (account+container+blob). The partition key is used to partition blob data into ranges. The ranges are then load-balanced across Blob storage. Range-based partitioning means that naming conventions that use lexical ordering (for example, mypayroll, myperformance, myemployees, etc.) or timestamps (log20160101, log20160102, log20160102, etc.) are more likely to result in the partitions being co-located on the same partition server.”

The container names for the event hub and the function host start with the same prefix, so it is possible that all the blob operations happen on the same partition server, increasing the chance of a partition rebalancing operation.
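To illustrate the concern, here is a minimal Python sketch (not the extension's actual code) that measures how much of a lexical prefix two container names share, and shows one commonly suggested mitigation: prepending a short deterministic hash so the names no longer sort next to each other. The container names and the `hashed_name` and `common_prefix_len` helpers are assumptions made for illustration, and how Blob storage actually assigns ranges to partition servers is internal to the service.

```python
import hashlib


def common_prefix_len(a: str, b: str) -> int:
    """Return the number of leading characters the two names share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def hashed_name(name: str, digits: int = 4) -> str:
    """Prepend a short deterministic hash (hypothetical scheme) to a name.

    The result stays a valid container name: lowercase hex digits,
    a hyphen, then the original name.
    """
    prefix = hashlib.sha256(name.encode("utf-8")).hexdigest()[:digits]
    return f"{prefix}-{name}"


# Illustrative container names that start with the same prefix (assumed).
names = ["azure-webjobs-eventhub", "azure-webjobs-hosts"]

shared = common_prefix_len(names[0], names[1])
hashed = [hashed_name(n) for n in names]
shared_hashed = common_prefix_len(hashed[0], hashed[1])

print(f"original names share {shared} leading characters")
print(f"hashed names {hashed} share {shared_hashed} leading characters")
```

With a shared prefix, both names fall close together in a lexically ordered key space; the hash prefix spreads them out at the cost of less readable names. Whether this removes the rebalancing seen here would need to be verified against the real workload.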

dtangren2 commented 3 years ago

This is an important fix for us. This particular issue has impacted production reliability for an architecture we have that relies on EventHub. We raised a ticket with Microsoft support, and found through review and trial/error that this patch is our only viable solution, other than a potential replatforming away from EventHub. Hoping this change can be added to a release quickly.

cachai2 commented 2 years ago

@soninaren can you elaborate on the exact ask? We've improved checkpointing for the Event Hubs extension in version 5.0+. Can you please try this and see if it improves upon the required scenarios?

soninaren commented 2 years ago

@cachai2, I believe this came from a CRI. The function app had been subjected to several partition rebalancing operations. Upon further investigation, I found that this was caused by the way the function app generated the container + blob name.

This is further explained by the

The container names for the event hub and the function host start with the same prefix, so it is possible that all the blob operations happen on the same partition server, increasing the chance of a partition rebalancing operation.

These are the only details I remember, as this issue is quite old. I suggest investigating further and closing this if it is believed to have already been addressed.

alrod commented 2 years ago

Closing as there is not enough info.