dotnet / dnceng

.NET Engineering Services
MIT License
23 stars 16 forks source link

Production - [Alerting] Autoscale: Minutes to scale-up from zero machine alert #3227

Open dotnet-eng-status[bot] opened 4 weeks ago

dotnet-eng-status[bot] commented 4 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

@dotnet/dnceng, @dotnet/prodconsvcs, please investigate

Automation information below, do not change Grafana-Automated-Alert-Id-54aa0d7e647e46ff9e880bf6ae532b99

Release Note Category

dotnet-eng-status[bot] commented 4 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 4 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 4 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 3 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 3 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 3 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 3 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 3 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 3 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 3 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:broken_heart: Metric state changed to alerting

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule

dotnet-eng-status[bot] commented 2 weeks ago

:green_heart: Metric state changed to ok

Scale up issue: A queue has been waiting for a machine to scale up for more than 45 minutes, there are no machines in this queue, which could cause a lot of work to get stuck.

Wiki link for investigation and mitigation steps here

Metric Graph

Go to rule