Azure / RDS-Templates

ARM Templates for Remote Desktop Services deployments
MIT License
LogicApp version of scaling script runs forever #337

Closed TerminallyChilI closed 4 years ago

TerminallyChilI commented 4 years ago

Hi,

After having no luck with the scheduled-task PowerShell version of the scaling script, I have tried the LogicApp version.

This version works much better; however, I've come across two issues.

First, the logic app "Run" never seems to end. I've had runs going for about 10 minutes before manually cancelling them. The runbook completes successfully (for scale-up jobs; more on that later), but the LogicApp "run" never completes.

Can I set a timeout value to cancel the job after a certain amount of time?

Secondly, for scale-down jobs I am seeing the runbook run indefinitely. I checked the output, and it is getting stuck on the "Counting the current sessions on the host XYZ" step. See the output below:

Starting WVD tenant hosts scale optimization: Current Date Time is: 02/04/2020 20:55:58
It is Off-peak hours
Starting to scale down WVD session hosts ...
Processing hostpool SOC-W10-HOSTS
Checking session host: SOC-WVD-1.StateOwnersCorp.local of sessions:1 and status:Available
Checking session host: SOC-WVD-0.StateOwnersCorp.local of sessions:4 and status:Available
Counting the current sessions on the host SOC-WVD-1.StateOwnersCorp.local :1

This only happens if there are still user sessions open on the session host. If the session host is completely empty, the scale-down runbook job succeeds. (But then problem one occurs, where the logic app "Run" never completes.)

viswanadham-k commented 4 years ago

Hi @TerminallyChilI, thank you for the feedback. Please find the responses to both issues below.

Issue 1: By design, the Azure logic app cannot get the output from the webhook, so the run status keeps showing as running. However, the trigger history status is Succeeded and the webhook trigger fired; please see the screenshot below for reference.

image

Issue 2: One of the conditions was blocking during off-peak hours. I have updated the code. Please follow the navigation below on your end to change the line of code, then test it.

Azure Portal --> Home --> WVDAutoScaleResourceGroup --> WVDAutoScaleRunbook (WVDAutoScaleAutomationAccount/WVDAutoScaleRunbook) --> click Edit, find line 1254, and replace it with the code below --> click Save --> click Publish.

if ($SessionHostInfo.UpdateState -eq "Succeeded" -and $SessionHostInfo.Status -eq "NoHeartbeat") {

If you are still facing issues, please let me know. Share your official mailing address with us and we will connect and assist you in resolving the issue.

Thanks,

Sawal72 commented 4 years ago

Hello, can you tell me more about issue 1? Is there a fix for the runs which keep running?

image

ChristianMontoya commented 4 years ago

@viswanadham-k : can you comment further?

Digiroka commented 4 years ago

I have the same issue. 'Run History' shows all logicapp runs as 'running'. Automation account shows jobs as complete, as does runbook output.

Sawal72 commented 4 years ago

Digiroka has the same "problem". The Logic App "Run History" shows all jobs as 'running'. The automation account shows all the jobs which have run as 'completed'. See the screenshot in my first post.

viswanadham-k commented 4 years ago

Hi @Digiroka, @Sawal72, thank you for the feedback. There is no fix at present; the behavior is by design, because we are unable to get output from the webhook.

Issue 1: By design, the Azure logic app cannot get the output from the webhook, so the run status keeps showing as running. However, the trigger history status is Succeeded and the webhook trigger fired; please see the screenshot below for reference.

image

Thanks,

TerminallyChilI commented 4 years ago

Hey @viswanadham-k,

Apologies for my delayed response. For me, line 1254 in the runbook is "$IsVMStopped = $true"

I noticed that line 1253 is "if ($RoleInstance.PowerState -eq "VM deallocated") {" It's like this across all of the tenants I'm running the logic app on.

Should I replace this line (1253) instead?

image

I understand that there is a charge associated with runbook time. We've had a few days where it ran indefinitely and incurred unexpected charges, so we are eager to resolve this.

You have mentioned that the LogicApp will continue to run by design. That is OK, but can you please confirm that this (the LogicApp's indefinite runtime) won't incur any charge?

Thanks for your help so far, much appreciated.

Thanks, Nick

TerminallyChilI commented 4 years ago

@viswanadham-k Any update? Apologies for the hassle. I've found that my hostpool had no hosts accepting new sessions today, and that the logicapp had the hung jobs yesterday.

I think they're related and am eager to resolve this.

viswanadham-k commented 4 years ago

Hi @TerminallyChilI, please find my answers inline.

Re: I noticed that line 1253 is "if ($RoleInstance.PowerState -eq "VM deallocated") {"
Here the script checks whether the VM has been successfully stopped in Azure. (This is the best approach to validate, because sometimes Azure VMs stop only partially and are not fully deallocated.) If you don't need to validate that the VM is stopped, comment out lines 1250 to 1257.

Re: I've found that my hostpool had no hosts accepting new sessions today
Can you please confirm whether those session hosts have "allow new session" set to false? Don't manually leave the allow new session value set to false while the session hosts are in a running state.

Re: logicapp had the hung jobs yesterday
The logic app will execute for 24 hours, based on the recurrence time.

Thanks,
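
The check described above (waiting until the VM reports "VM deallocated" rather than assuming the stop request succeeded) can be bounded so that it cannot loop forever. The following is a minimal Python sketch of that idea, not the runbook's code: `get_power_state` is a hypothetical callable that returns the VM's current power-state string, and the timeout and poll interval are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_until_deallocated(
    get_power_state: Callable[[], str],
    timeout_seconds: float = 600.0,
    poll_seconds: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the VM reports "VM deallocated" or the deadline passes.

    Returns True if the VM was seen fully deallocated, False on timeout.
    A VM that is merely "VM stopped" does not count as deallocated.
    """
    deadline = clock() + timeout_seconds
    while True:
        if get_power_state() == "VM deallocated":
            return True
        if clock() >= deadline:
            # Give up instead of blocking the job indefinitely.
            return False
        sleep(poll_seconds)
```

The caller decides what to do on `False`, for example logging the host and moving on to the next one, so a single VM that never finishes deallocating does not keep the whole job running.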

accelle17 commented 4 years ago

@viswanadham-k

"logicapp had the hung jobs yesterday" / "Logic app will execute 24 hours based on recurrence time."

It has been 24 hours since it started, but it is still running:

image

alaurie commented 4 years ago

OK, I have found the resolution to this issue in the comments on the MS docs for the WVD scaling deployment. Replace the Webhook action in the logic app designer with a plain HTTP call that has the same body as the webhook it's replacing. These triggers succeed, and the runbook works just fine as well.
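
As a rough illustration of what the plain HTTP action amounts to, here is a minimal Python sketch: a single POST to the Automation webhook URL carrying the same JSON body, with no wait for a callback. It only shows the shape of the request. The URL and the `HostPoolName` field are placeholders (assumptions), not values taken from the template.

```python
import json
import urllib.request


def build_runbook_request(webhook_url: str, body: dict) -> urllib.request.Request:
    """Build a plain HTTP POST carrying the same JSON body the webhook action sent."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_runbook(webhook_url: str, body: dict, timeout_seconds: float = 30.0) -> int:
    """Send the request and return the HTTP status code of the response."""
    request = build_runbook_request(webhook_url, body)
    with urllib.request.urlopen(request, timeout=timeout_seconds) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder values (assumptions); copy the real URL and body
    # from the existing webhook action in the logic app.
    example_url = "https://example.invalid/webhooks?token=PLACEHOLDER"
    example_body = {"HostPoolName": "SOC-W10-HOSTS"}
    req = build_runbook_request(example_url, example_body)
    print(req.get_method(), req.full_url)
```

Because the call returns as soon as the request is accepted, the logic app run can finish instead of waiting on a callback that never arrives.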

If you need to cancel all the logic app runs that are currently overrunning, use the script below, as the cmdlets don't page properly and so don't return all of the currently running jobs. https://github.com/Azure/logicapps/blob/master/scripts/cancel-all-runs/cancel-runs.ps1
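
The paging problem can be sketched as follows: the run-listing REST endpoint returns results one page at a time, each page carrying a `nextLink` to the next one, so a caller has to keep following it until it is absent. This Python sketch is not the linked script. `fetch_json` and `post` are hypothetical callables that perform authenticated requests, and the URL shape and `api-version` reflect my understanding of the Logic Apps REST API, so check them against the current reference before relying on them.

```python
from typing import Callable, Iterator
from urllib.parse import quote

API_VERSION = "2016-06-01"  # assumption: Logic Apps REST API version
ARM_HOST = "https://management.azure.com"


def runs_url(subscription_id: str, resource_group: str, workflow_name: str) -> str:
    """URL of the first page of runs, filtered to those still running."""
    base = (
        f"{ARM_HOST}/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Logic/workflows/{workflow_name}/runs"
    )
    status_filter = quote("status eq 'Running'")
    return f"{base}?api-version={API_VERSION}&$filter={status_filter}"


def iter_running_runs(first_url: str, fetch_json: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every run, following nextLink until the last page."""
    url = first_url
    while url:
        page = fetch_json(url)
        yield from page.get("value", [])
        url = page.get("nextLink")  # absent on the last page


def cancel_all_running(
    first_url: str,
    fetch_json: Callable[[str], dict],
    post: Callable[[str], object],
) -> int:
    """Collect all running runs first, then cancel each one. Returns the count."""
    runs = list(iter_running_runs(first_url, fetch_json))
    for run in runs:
        # run["id"] is the run's resource ID, ending in /runs/<runName>.
        post(f"{ARM_HOST}{run['id']}/cancel?api-version={API_VERSION}")
    return len(runs)
```

Collecting the full list before cancelling keeps the page boundaries stable while the loop is still reading them.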

gexamb commented 4 years ago

@alaurie Thanks! Good catch. Script worked like a charm too. I had 2 days worth of jobs running in queue.

Meyers93 commented 4 years ago

The change to just "HTTP" was not successful for me. Still long running jobs and the task to shut down a VM never ends...

rubenz91 commented 4 years ago

Did the script work for you? I haven't found a way to get it to work; it always fails.

nakranimohit0 commented 4 years ago

Closing this as a duplicate of #431

gjhardie commented 4 years ago

Excellent solution, many thanks. Please find attached the updated ARM template, azLogicAppCreation.json.zip, for the Logic App to work with createazurelogicapp.ps1. Obviously, line 256 needs updating to reference the template from wherever one decides to store it.