MicrosoftDocs / azure-docs

Open source documentation of Microsoft Azure
https://docs.microsoft.com/azure
Creative Commons Attribution 4.0 International

Min and max resource percent do not seem to actually control CPU allocation. #82599

Closed · psyblade12 closed this issue 2 years ago

psyblade12 commented 3 years ago

The document says:

Workload groups provide a mechanism to define the min and max amount of resources that are allocated per request with the REQUEST_MIN_RESOURCE_GRANT_PERCENT and REQUEST_MAX_RESOURCE_GRANT_PERCENT parameters in the CREATE WORKLOAD GROUP syntax. Resources in this case are CPU and memory. Configuring these values dictates how much resources and what level of concurrency can be achieved on the system.
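
For reference, the syntax the quoted passage refers to looks roughly like this (a minimal sketch only; the group name and numbers are made up):

```sql
-- Hypothetical workload group; the name and values are only illustrative.
CREATE WORKLOAD GROUP wgDataLoads
WITH
(   MIN_PERCENTAGE_RESOURCE            = 0
  , CAP_PERCENTAGE_RESOURCE            = 100
  , REQUEST_MIN_RESOURCE_GRANT_PERCENT = 25   -- minimum grant per request
  , REQUEST_MAX_RESOURCE_GRANT_PERCENT = 50   -- maximum grant per request
);
```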

But based on my tests on DW100c and DW500c, CPU utilization is a fixed number regardless of the min/max resource grant percent values.

So am I doing something wrong, or is it really the case that workload management has no control over CPU?


I've added a picture showing the test of min and max grant percent and the resulting CPU usage. The settings appear to have no effect on CPU allocation: CPU usage always sits at around 70% regardless of the min/max resource grant percent values.

image



himanshusinha-msft commented 3 years ago

Thanks for the ask. We are reviewing it and will reply back as soon as we can.

ronortloff commented 2 years ago

@psyblade12 could you paste a picture of the workload management configuration so we can see the effective values of the configuration?
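
(As an alternative to a screenshot, the effective values can also be pulled from a DMV; a rough sketch, with the column names assumed from memory:)

```sql
-- Sketch only: read the effective workload group settings from the DMV.
-- Column names are assumed; adjust if they differ in your pool.
SELECT name,
       effective_min_percentage_resource,
       effective_cap_percentage_resource,
       effective_request_min_resource_grant_percent,
       effective_request_max_resource_grant_percent
FROM   sys.dm_workload_management_workload_groups_stats;
```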

psyblade12 commented 2 years ago

Here is the current configuration. When I did the test above, I set the min and max resource per request to fixed values ranging from 5% to 100%, and the cap resources to 100%. In every case, CPU usage sat at 70%.

image

I can see that in other places in the docs the term "CPU" is used to mean "concurrency", so I suspect that in this case "CPU" also means concurrency capacity, not CPU clock/core allocation.

image

ronortloff commented 2 years ago

@SudhirRaparla, fyi.

Thanks for the details @psyblade12. CPU is governed at the workload group level, not the individual request level. We will update documentation accordingly. For example: Effective Cap Resources % set to 75% and REQUEST_MIN_RESOURCE_GRANT_PERCENT set to 25% means a query can address 25% of memory and 75% of CPU. Three requests in this workload group would share 75% of CPU, and each would get 25% of memory dedicated to its query.

If you'd like to test this out, you can set the CAP for a workload group equal to the REQUEST_MIN_RESOURCE_GRANT_PERCENT. Not sure if this is practical for your scenario, but that will give you CPU governance based on what you've shown.
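
As a sketch of that suggestion (the group name and values are made up; assuming a 10% per-request grant):

```sql
-- Illustrative only: cap the group's total resources at the same value as the
-- per-request grant, so a single request is effectively CPU-governed as well.
CREATE WORKLOAD GROUP wgCappedTo10
WITH
(   MIN_PERCENTAGE_RESOURCE            = 0
  , CAP_PERCENTAGE_RESOURCE            = 10   -- group-level cap (governs CPU)
  , REQUEST_MIN_RESOURCE_GRANT_PERCENT = 10   -- per-request memory grant
  , REQUEST_MAX_RESOURCE_GRANT_PERCENT = 10
);
```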

psyblade12 commented 2 years ago

Thank you for your reply. I will test this tomorrow when I am at work.

I really hope I can control CPU utilization, not just memory. Workload management is the main reason I appreciate Synapse over Spark, but so far I haven't managed to make it work.

Even if it only works at the workload group level, that is enough for me.

psyblade12 commented 2 years ago

I just tested and it doesn't seem to work the way you describe.

Here I have two test cases. In the first, I set the min, max, and cap all to only 10%.
image

But the chart clearly shows that CPU usage is around 70%.
image


In the second test case, I set the min, max, and cap all to 100%.
image

But the chart clearly shows that CPU usage is still around 70%.

image

The resource_allocation_percentage values shown here are still correct: 10% and 100%.
image

Also, I want to note that the actual elapsed time for the same query with different grant percentages is identical. My test query is a non-equi join, filter, then count, so it is CPU oriented.

Am I doing anything wrong here?

ronortloff commented 2 years ago

Docs have been updated for this.

ronortloff commented 2 years ago

Sorry, I missed the last comment... Can you confirm the timelines line up between the graphs and the DMV output?

psyblade12 commented 2 years ago

Hey, please don't close the issue. I am still unable to control CPU allocation; it doesn't work the way you describe.

I don't really understand what you mean by "confirm the timelines line up", but the DMV shows the two queries.

The query whose resource_allocation_percentage in the DMV equals 10% is the first test case, in which I set the min, max, and cap all to only 10%, and you can see that CPU usage is still 70%. The query whose resource_allocation_percentage equals 100% is the second test case, in which I set the min, max, and cap all to 100%, and CPU usage is again 70%.

The actual elapsed time for the same query is identical in both test cases.

I don't know if it is just me, but we have two separate SQL pool instances and both show the same behavior: CPU usage is uncontrollable.

What also surprises me is that if I use smallrc or another system-defined resource class, I can ramp CPU usage up to 100%, but if I use user-defined ones, CPU allocation is only around 70%. smallrc really does have cap resources of 100%. I suspect the behavior you described only works with the system-defined ones, but the problem is that the system-defined ones only have cap resources of 100%, so I can't test lower values with them.

ronortloff commented 2 years ago

Regarding: "timelines line up between the graphs and the DMV output"

You shared DMV output for sys.dm_pdw_exec_requests, which showed resource_allocation_percentage, but no time to correlate to the portal graph of CPU usage. I cannot tell if the DMV output was collected at the same time as the portal CPU usage. That is what I meant. Please share start/end times from the DMV and a zoomed-in window, with time, for the portal-reported CPU usage.
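
Something along these lines should give the timestamps to correlate (a sketch; the label filter is hypothetical and assumes the query was tagged via OPTION (LABEL = ...)):

```sql
-- Sketch: capture timing alongside the allocation percentage so the DMV rows
-- can be lined up against the portal CPU chart (DMV times are reported in UTC).
SELECT request_id,
       [status],
       group_name,
       resource_allocation_percentage,
       submit_time,
       start_time,
       end_time,
       total_elapsed_time
FROM   sys.dm_pdw_exec_requests
WHERE  [label] = 'cpu_test'   -- hypothetical label used for the test query
ORDER  BY submit_time DESC;
```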

Please make sure nothing else is running. If you're able to run the test query in a loop to get more than a few samples, that will help too.

WilliamDAssafMSFT commented 2 years ago

please-open

ronortloff commented 2 years ago

Hi @psyblade12, this issue is open. Please see my previous comment.

psyblade12 commented 2 years ago

Then I will redo my test. This time I will post a picture of the DMV result with start and end times.

Here is the first test, in which the min, max, and cap are all set to 10%.

image

But CPU usage is around 60%.
image

The DMV shows a 10% resource allocation percentage, but the actual CPU usage is much higher than that. Note that the start time and end time appear to be UTC; the chart above is GMT+7.

image

Now, in the second test, I set the min, max, and cap all to 100%.

image

And CPU usage is still around 60%.
image

Here is the DMV:
image

Note: as you can see in the DMV, only one query is running.

ronortloff commented 2 years ago

Dug a little deeper on this... We only limit CPU utilization under contention. Since you're running this in isolation, it will be able to use the available CPU. Under contention, the max it can use is 10%.

psyblade12 commented 2 years ago

By "contention", do you mean running two concurrent queries from different workload groups? For example, if the first workload group has 90% resource allocation and the second has 10%, will the query running in the first workload group be around 9 times faster than the same query running in the second?

I remember testing this case before, and what I found is that two identical queries running in different workload groups have the same elapsed time. (That means the CPU time is shared equally and doesn't seem to be allocated according to the defined rules, for example 9-1.) I will re-test this case when I have time; it is hard to find a time to test the system without disturbing my colleagues.

What also confuses me is what exact percentage of CPU a single query can take. From the tests I can see that a single query only gets 60-70% of CPU. Why not higher, or why not lower? Is there an exact rule for that? 100% CPU can only be reached if we run two or more queries concurrently. So, for example, if I want to concentrate all CPU effort on a single very large query, how can I do that?

Please describe these SQL pool behaviors more clearly in the documentation. It is really hard for me to explain and prove these problems to colleagues and higher-level executives, because what the documentation says doesn't match reality, so I am often put in the situation of having to prove that it is a system limitation and not my fault for doing something wrong...

ronortloff commented 2 years ago

There are a lot of factors that go into CPU consumption. It can depend on the query shape or system state. The system could be doing a lot of IO where CPU has to wait. Certain activities in SQL cannot run effectively in parallel. That can have an impact too. Backtracking a little... what scenario are you trying to validate? Are you trying to limit resource consumption? Are you trying to improve performance? Something else?

psyblade12 commented 2 years ago

Yes, I understand that, so when I designed the test I made it a simple non-equi join between two single-column random-number tables, followed by a count of the rows that join. I believe this test case is completely CPU oriented.
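
Roughly, the test looks like this (a sketch only; the table names and label are placeholders, and the real tables just hold one column of random numbers):

```sql
-- Sketch of the CPU-bound test: a non-equi join on single-column tables of
-- random numbers, a simple filter, then a count.
SELECT COUNT(*)
FROM dbo.rand_a AS a
JOIN dbo.rand_b AS b
    ON a.n > b.n           -- non-equi predicate => nested-loop style, CPU heavy
WHERE a.n % 2 = 0          -- simple filter, as described above
OPTION (LABEL = 'cpu_test');
```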

What I am trying to do is, of course, control CPU allocation for queries. More CPU should be allocated to important queries and less to unimportant ones. Or, put more simply: a more important query should run faster than a less important one.

What I expect is that, if the same queries are running concurrently, the query coming from the more important workload group should get more CPU and run faster than the query coming from the less important workload group. And if the important query, with 90% of resources, is 9 times bigger than the less important query with 10% of resources, the two concurrent queries should complete at the same time.

That is what I want. So if the Synapse docs state that it can do that, then it should do that. Whether by hard-limiting CPU for each query even when there is no other concurrent query (as it does with RAM, putting min/max grant percent directly on the queries sent to worker nodes), or by reallocating CPU to the defined portions when there is contention, or by any other mechanism, it is all fine; users will just be much happier if the docs have a clearer note about how it works.

Or if Synapse cannot (like the previous resource class system, where the docs clearly state it can only limit RAM, not CPU), then just write that in the docs too, so users know and can move on to other tasks.

ronortloff commented 2 years ago

Thanks for the feedback. We will get the docs updated to make this more clear. A couple of things to note: CPU is a much more fluid resource than memory. Context switching and scheduling on CPUs are lightweight compared to moving data in and out of memory.
Workload groups provide a mechanism to contain CPU and dedicate memory to workloads (not to apply importance to run-time resource allocation). Again, I'll make sure this is clearer in the doc. Side note: Workload Importance influences scheduling and access to locks.
Regarding your example with the "9 times bigger" query... if you configure the other workload group with a cap of 10% and keep it busy with queries, the 9-times-bigger query will get 90% (assuming it needs the CPU and isn't capped lower). Your experiment with a join may involve data movement (due to a distribution-incompatible join). If that is the case, it has to do IO (network and disk) to move data. You could remove the join or make it distribution compatible (replicated table?), add a few more aggregates, and see if that burns more CPU.
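
For example, something along these lines would make one side of the join replicated (a sketch; the table names are placeholders):

```sql
-- Illustrative CTAS: rebuild one side of the join as a replicated table so the
-- join is distribution compatible and needs no data movement.
CREATE TABLE dbo.rand_b_replicated
WITH (DISTRIBUTION = REPLICATE, HEAP)
AS
SELECT n FROM dbo.rand_b;
```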

psyblade12 commented 2 years ago

I just tested and it seems that it really works under contention!

I ran the same two queries concurrently, one with 10% resources and one with 90%. The query is CPU oriented, with a non-equi join (which will surely trigger the CPU-consuming nested-loop join algorithm, whose time complexity is O(n^2)). All the tables used in the test contain only one column of random numbers, and one of the tables is already replicated, so this query will never have problems with IO or data movement. The only thing that matters for this test query is CPU.

image

I can see that the one with "higher resource" (90%) is significantly faster than the one without (only 10%).
image

image

The reason it is not 9 times faster is perhaps that, after the query with "higher_resource" finishes, the CPU is redistributed to the remaining query. I can see this in the chart.

image

The behavior is much clearer now! I have tested this case 5 times with some minor differences in behavior, but all the results look the same.

There are still things I want to ask:

If the above questions can be made clearer, I would highly appreciate it.

Also, please document these behaviors more clearly and in as much detail as possible. I greatly appreciate the workload management feature of Synapse and consider it a selling point of Synapse over Spark/Databricks, where isolation is usually achieved by spawning completely separate clusters. But because the Synapse documentation wasn't so clear, I have really struggled to achieve the isolation I want.

ronortloff commented 2 years ago

doc is updated.

ronortloff commented 2 years ago

please-close