Open scottbright22 opened 3 months ago
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @dpwatrous @wiboris.
I am also affected by this issue, but I am using the Python SDK to manage Batch resources.
I am creating autopools using a modified version of the pending tasks scaling formula from the Azure Batch docs:
startingNumberOfVMs = 1;
maxNumberofVMs = {max_nodes};
pendingTaskSamplePercent = $PendingTasks.GetSamplePercent(180 * TimeInterval_Second);
pendingTaskSamples = pendingTaskSamplePercent < 70 ? startingNumberOfVMs : avg($PendingTasks.GetSample(180 * TimeInterval_Second));
$slotsPerVM = {task_slots_per_node};
$TargetLowPriorityNodes=min(maxNumberofVMs, (pendingTaskSamples / $slotsPerVM) + 1);
$NodeDeallocationOption = taskcompletion;
This formula works with regular node pools in the same Batch account.
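For anyone trying to reproduce this, here is a minimal sketch of how the `{max_nodes}` and `{task_slots_per_node}` placeholders might be filled in before the formula is submitted. The helper name `render_formula` and the use of `str.format` are my assumptions for illustration; the original code that builds the formula is not shown here.

```python
# Hypothetical helper: renders the autoscale formula from the template above.
# The template text is copied from the formula in this comment; the function
# name and validation are illustrative assumptions, not part of any SDK.
FORMULA_TEMPLATE = """\
startingNumberOfVMs = 1;
maxNumberofVMs = {max_nodes};
pendingTaskSamplePercent = $PendingTasks.GetSamplePercent(180 * TimeInterval_Second);
pendingTaskSamples = pendingTaskSamplePercent < 70 ? startingNumberOfVMs : avg($PendingTasks.GetSample(180 * TimeInterval_Second));
$slotsPerVM = {task_slots_per_node};
$TargetLowPriorityNodes=min(maxNumberofVMs, (pendingTaskSamples / $slotsPerVM) + 1);
$NodeDeallocationOption = taskcompletion;
"""


def render_formula(max_nodes: int, task_slots_per_node: int) -> str:
    """Return the autoscale formula with both placeholders substituted."""
    if max_nodes < 1 or task_slots_per_node < 1:
        raise ValueError("max_nodes and task_slots_per_node must be >= 1")
    return FORMULA_TEMPLATE.format(
        max_nodes=max_nodes,
        task_slots_per_node=task_slots_per_node,
    )


if __name__ == "__main__":
    # Values taken from the PoolAutoScaleEvent quoted later in this comment.
    print(render_formula(max_nodes=200, task_slots_per_node=7))
```

The rendered string is what would then be passed as the pool's autoscale formula, whether the pool is a regular pool or an autopool specification.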
Autopools using this formula fail to scale beyond 1 node in response to the accumulation of pending tasks, however. I have attempted to trigger a pool resize event by using Azure Batch Explorer to modify the maxNumberofVMs value in the scaling formula, but that results in the following vague Internal Error message:
Request ID: f68d2884-2543-4cb1-9388-de5c7839d089
Similarly, using Azure Batch Explorer to modify the resize configuration to a fixed size results in the license-related error:
Request ID: 6917061e-0fa5-4595-b2c3-293466ee8387
In Diagnostic Logs I can see that the scaling formula is being evaluated. For example, here is the reported result in a PoolAutoScaleEvent for a test autopool that uses the scaling formula above:
$TargetDedicatedNodes=0;
$TargetLowPriorityNodes=5;
$NodeDeallocationOption=taskcompletion;
$slotsPerVM=7;
maxNumberofVMs=200;
pendingTaskSamplePercent=100;
pendingTaskSamples=34.5;
startingNumberOfVMs=1
The $TargetLowPriorityNodes field is set to 5, but the node pool is not resized to match this value. There are no PoolResizeStartEvents in the Diagnostic Logs related to these PoolAutoScaleEvents.
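As a sanity check that the logged target is what the formula should produce, the result can be parsed and recomputed offline. This is only a sketch: the function names are hypothetical, the input is the result text above joined onto one line, and truncating the fractional node count is an assumption on my part that happens to match the logged value of 5.

```python
import math

# The PoolAutoScaleEvent result quoted above, joined into a single string.
LOGGED_RESULT = (
    "$TargetDedicatedNodes=0;$TargetLowPriorityNodes=5;"
    "$NodeDeallocationOption=taskcompletion;$slotsPerVM=7;"
    "maxNumberofVMs=200;pendingTaskSamplePercent=100;"
    "pendingTaskSamples=34.5;startingNumberOfVMs=1"
)


def parse_autoscale_result(result: str) -> dict:
    """Split 'name=value;name=value' text into a dict of strings."""
    values = {}
    for part in result.split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        values[name.strip()] = value.strip()
    return values


def recompute_low_priority_target(values: dict) -> int:
    """Re-evaluate min(maxNumberofVMs, pendingTaskSamples / $slotsPerVM + 1).

    Assumption: the fractional result is truncated to a whole node count.
    """
    max_vms = float(values["maxNumberofVMs"])
    samples = float(values["pendingTaskSamples"])
    slots = float(values["$slotsPerVM"])
    return math.floor(min(max_vms, samples / slots + 1))


if __name__ == "__main__":
    parsed = parse_autoscale_result(LOGGED_RESULT)
    print("logged target:    ", parsed["$TargetLowPriorityNodes"])
    print("recomputed target:", recompute_low_priority_target(parsed))
```

With the logged inputs, 34.5 / 7 + 1 is roughly 5.93, so the evaluation itself looks correct under that assumption; the problem appears to be that the computed target is never applied to the autopool.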
Is this issue related to this notice somehow?
Batch pools can currently be created using Marketplace VM images containing pre-installed graphics and rendering applications that have pay-for-use application licensing. These VM images and the pay-for-use licensing will not be available for use starting 29 February 2024.
FWIW, I have not encountered the scaling issue in regular node pools that use the same VM configuration:
Non-AutoPool jobs did not have the issue for us either. We opened an Azure issue and our agent tells us a fix is going in any day now...
Library name and version
Microsoft.Azure.Batch 16.2.0
Describe the bug
Pools created programmatically from an AutoPool job fail to scale up and give the error: “Application license is blocked as the support for application license is retired as of 02/29/2024”
Pools not created as an autopool work as expected.
No licenses or applications are used by our application.
Both Fixed and AutoScale Pools behave the same.
Expected behavior
The pool scales up and adds a node.
Actual behavior
The pool fails to add a node. Attempting the resize in the portal gives "Application license is blocked as the support for application license is retired as of 02/29/2024"
Reproduction Steps
Environment
.NET Framework 4.8, Visual Studio 2022 (version 17.10.3)