MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

ER circuit metric data loss #122242

Closed yapfeng0719 closed 6 days ago

yapfeng0719 commented 1 week ago

There is a 2-3 minute loss of metric data when the backend ER metric instances are rebooted for maintenance. This triggers alerts for users. Please consider adding logic to reduce the metric loss time or to make the failover smoother.



TPavanBalaji commented 1 week ago

@yapfeng0719 Thanks for your feedback! We will investigate and update as appropriate.

ManoharLakkoju-MSFT commented 1 week ago

Hi @yapfeng0719, thank you for bringing this to my attention. It is true that 2-3 minutes of metric data can be lost when the backend ER metric instance is rebooted for maintenance, and that this gap can trigger alerts for users.

There is no way to eliminate this data loss completely, but there are some steps you can take to reduce its impact. One option is to use multiple metric instances so that at least one instance is always available to collect data. You can also configure your alerts with a delay before they trigger, which gives the metric instance time to come back online. Another option is to use Azure Monitor to collect and store your ExpressRoute metrics, which can help reduce the impact of metric data loss during maintenance.

I hope this helps! Let me know if you have any other questions.
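To illustrate the alert-delay suggestion above, here is a minimal sketch in plain Python. It is not Azure Monitor's evaluation logic and does not call any Azure API. It assumes a hypothetical alert that samples the metric once per minute, treats a missing sample as a breach, and fires only after a configurable number of consecutive breaching minutes. The function name, parameters, threshold, and sample values are all made up for the example.

```python
from typing import Optional, Sequence


def should_alert(
    samples: Sequence[Optional[float]],
    threshold: float,
    min_consecutive: int,
) -> bool:
    """Return True if `min_consecutive` breaching samples occur in a row.

    Each sample is one per-minute metric value; None marks a minute with
    no data. A minute counts as breaching when its value is missing or
    below `threshold` (an assumption made for this illustration).
    """
    run = 0
    for value in samples:
        breached = value is None or value < threshold
        run = run + 1 if breached else 0
        if run >= min_consecutive:
            return True
    return False


# Hypothetical per-minute availability values (percent) with a 3-minute gap,
# standing in for the metric loss during a maintenance reboot.
series = [100.0, 100.0, None, None, None, 100.0, 100.0]

# No delay: the first missing minute fires the alert.
print(should_alert(series, threshold=99.0, min_consecutive=1))  # True

# Requiring 5 consecutive breaching minutes rides out the 3-minute gap.
print(should_alert(series, threshold=99.0, min_consecutive=5))  # False
```

The trade-off is that a longer delay also postpones alerts for real outages by the same number of minutes, so the window should be only slightly longer than the expected gap.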

ManoharLakkoju-MSFT commented 6 days ago

@yapfeng0719 We are going to close this thread as resolved. If you have any further questions about the documentation, please tag me in your reply and we will be happy to continue the conversation.