MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

Should I care about oom_per_second on system pools? #86127

Closed josere closed 2 years ago

josere commented 2 years ago

When calculating the oom_per_second metric, should I care about system pools? Shouldn't that be Microsoft's job?

We probably want to filter out non-user pools:

SELECT pool_id,
       name AS resource_pool_name,
       IIF(name LIKE 'SloSharedPool%' OR name LIKE 'UserPool%', 'user', 'system') AS resource_pool_type,
       SUM(CAST(delta_out_of_memory_count AS decimal))/ (SUM(duration_ms) / 1000.) AS oom_per_second
FROM sys.dm_resource_governor_resource_pools_history_ex
WHERE name LIKE 'SloSharedPool%' OR name LIKE 'UserPool%'
GROUP BY pool_id, name
ORDER BY pool_id;


dimitri-furman commented 2 years ago

Hi @josere, yes, generally speaking Microsoft owns memory troubleshooting for system pools. However, there are some scenarios where you may want to be aware that a system pool is out of memory. In some cases, solving the OOM will require customer action. For example, the SloHkPool system pool manages memory for In-Memory OLTP (Hekaton). If this pool runs out of memory, it likely means that your memory-optimized tables are too large for the compute size, and you will need to either scale up or delete some data. Similarly, if the InMemQueryStorePool is out of memory, you may need to change the Query Store capture policy to capture fewer queries, or parameterize queries to reduce Query Store memory consumption, or scale up to have more memory in this pool.
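As a rough sketch of a middle ground, the proposed filter could be widened to keep the two system pools named above alongside the user pools. The `SloHkPool%` and `InMemQueryStorePool%` prefix patterns are an assumption based on the pool names in this comment and should be checked against the names your database actually reports; the `NULLIF` guard against a zero total duration is also an addition that is not in the original query.

```sql
-- OOM rate for user pools plus the system pools where customer action may be needed.
-- Assumption: the system pool names start with 'SloHkPool' and 'InMemQueryStorePool'.
SELECT pool_id,
       name AS resource_pool_name,
       IIF(name LIKE 'SloSharedPool%' OR name LIKE 'UserPool%', 'user', 'system') AS resource_pool_type,
       -- NULLIF returns NULL instead of raising a divide-by-zero error
       SUM(CAST(delta_out_of_memory_count AS decimal)) / NULLIF(SUM(duration_ms) / 1000., 0) AS oom_per_second
FROM sys.dm_resource_governor_resource_pools_history_ex
WHERE name LIKE 'SloSharedPool%'
   OR name LIKE 'UserPool%'
   OR name LIKE 'SloHkPool%'
   OR name LIKE 'InMemQueryStorePool%'
GROUP BY pool_id, name
ORDER BY pool_id;
```

Here the resource_pool_type column stays useful, because the result mixes user and system pools.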

That said, if you want to focus on the user pool only in your alerting, you can certainly modify the query as proposed (and remove the resource_pool_type column from the SELECT clause).
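For reference, a minimal sketch of that user-pool-only variant: the query proposed in the issue with the resource_pool_type column dropped, since the WHERE clause already restricts the result to user pools and the column would always read 'user'.

```sql
-- OOM rate per user pool only; system pools are filtered out by the WHERE clause.
SELECT pool_id,
       name AS resource_pool_name,
       SUM(CAST(delta_out_of_memory_count AS decimal)) / (SUM(duration_ms) / 1000.) AS oom_per_second
FROM sys.dm_resource_governor_resource_pools_history_ex
WHERE name LIKE 'SloSharedPool%' OR name LIKE 'UserPool%'
GROUP BY pool_id, name
ORDER BY pool_id;
```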

AnuragSharma-MSFT commented 2 years ago

@dimitri-furman Thank you for the detailed response on the issue.

@josere Please let us know if you have any further questions.

AnuragSharma-MSFT commented 2 years ago

@josere Awaiting your response on this.