microsoft / PartsUnlimited

.Net Core + SQL Azure app for DevOps Scenarios
https://microsoft.github.io/PartsUnlimited/
MIT License

Azure Automation Runbook Deployments ~ Please Help #183

Open HChil opened 5 years ago

HChil commented 5 years ago

Hi there! So, in DEVOPS200.2x, "Infrastructure as Code", Lab 1, Task 4, Step 7, we are instructed to start a runbook in the test pane and wait for the "completed" message to be displayed. However, when I attempt to do that, I get an error saying: "The runbook job start was attempted three times, but it failed to start each time". According to: https://docs.microsoft.com/en-us/azure/automation/troubleshoot/runbooks#job-attempted-3-times, this error can be fixed by: splitting the amount of memory used between 2 jobs, updating the modules, or using a "Hybrid Runbook Worker".

I have attempted these solutions to the best of my ability, but am still very much a novice. Since the assignment instructions do not cover how to fix these kinds of errors, I suspect that said errors are outside of the scope of the class. Does anyone know how to fix this error? Future assignments depend on completing this one first so I'm rather stuck until this gets figured out.

Thank you all! I'll keep researching and post solutions, if I find them.

HChil commented 5 years ago

I figured out a solution.

The problem: AzureRM 5.5.0 is structured in a way that seems to be deprecated and is not compatible with current versions of Azure. The newer versions are structured differently, but automatically pull in the most up-to-date versions of their modules. For the lab materials to work, AzureRM needs to be in the new structure but use the old modules.

The solution: I automatically deployed the newest version of AzureRM (6.13.1), navigated to Modules, deleted the AzureRM.Network and AzureRM.Profile modules that came with that version, and then manually added AzureRM.Network (5.3.0) and AzureRM.Profile (4.4.0). After I ran the runbook in the test pane, it managed to create all of the virtual resources it was supposed to. Kind of obvious, now that I think about it, but I didn't know it was doable until I spent today trying to figure it out.
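One way to confirm which module versions a runbook actually sees is to compare them against the versions listed above. The following is only a sketch: `Test-ModulePin` is a hypothetical helper (not part of the lab materials), and it assumes that `Get-Module -ListAvailable` inside the Automation sandbox reflects the modules imported into the account.

```powershell
# Versions taken from the comment above.
$expected = @{
    'AzureRM.Network' = [version]'5.3.0'
    'AzureRM.Profile' = [version]'4.4.0'
}

# Hypothetical helper: reports, per module, whether exactly the expected version is present.
function Test-ModulePin {
    param(
        [hashtable] $Expected,
        [object[]]  $Available   # objects exposing Name and Version properties
    )
    foreach ($name in $Expected.Keys) {
        # Collect every version found for this module name.
        $found = @($Available |
            Where-Object { $_.Name -eq $name } |
            ForEach-Object { [version]"$($_.Version)" })
        [pscustomobject]@{
            Module   = $name
            Expected = $Expected[$name]
            Found    = $found -join ', '
            Matches  = ($found.Count -eq 1 -and $found[0] -eq $Expected[$name])
        }
    }
}

# Assumption: this lists the modules visible to the runbook.
$available = Get-Module -ListAvailable -Name @($expected.Keys)
Test-ModulePin -Expected $expected -Available $available
```

A row with `Matches` set to `False` would suggest that a different version (or more than one version) is being picked up, which may be worth checking before re-running the lab runbook.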

I'm closing this issue, though the Lab owners might want to look into a better solution than "manually rollback specific sections of the update".

EDIT:

So I found this great workaround, managed to complete the first assignment, and was thrilled with my ingenuity. But the next assignment requires the environment created in the first assignment to be brought back into existence. I ran my "solution" again but was met with the same failures I had seen before that "solution" "worked" the first time.

In short: The error is still in play, I still am searching for an effective solution, and I could still use help.

Thank you

stuartleaver commented 5 years ago

@HChil I have just come across this same issue and have been doing some debugging.

The first issue I had was that I used a hyphen in my Resource Group name. Going by where it was falling over, I assumed the failure happened when the Storage Account was being created. So I tried to create one manually, using the same concatenation of variables for the name as the script uses, and noticed that hyphens are not allowed. So I edited the Resource Group variable in the Automation Account.
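For anyone who wants to keep a hyphenated Resource Group name, a small guard like the one below could normalise the derived name before the Storage Account is created. This is a sketch only: `ConvertTo-StorageAccountName` and its parameters are hypothetical and do not appear in the lab script. It relies on the Azure naming rule that storage account names are 3 to 24 characters long and contain only lowercase letters and digits.

```powershell
# Hypothetical helper, not part of the lab script.
function ConvertTo-StorageAccountName {
    param(
        [Parameter(Mandatory)] [string] $BaseName,
        [string] $Suffix = ''
    )
    # Storage account names allow only lowercase letters and digits.
    $clean = ($BaseName + $Suffix).ToLowerInvariant() -replace '[^a-z0-9]', ''
    # Names must be between 3 and 24 characters long.
    if ($clean.Length -gt 24) { $clean = $clean.Substring(0, 24) }
    if ($clean.Length -lt 3)  { throw "Name '$clean' is shorter than 3 characters." }
    return $clean
}

ConvertTo-StorageAccountName -BaseName 'My-Lab-RG' -Suffix 'stor'   # mylabrgstor
```

Note that truncating to 24 characters can cut off part of a suffix, so two different inputs may collapse to the same name; renaming the Resource Group, as described above, avoids that entirely.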

On the next run, the Storage Account was created, but I hit another error with the VhdUri when Set-AzureRmVMOSDisk was being run. As the error was around the Blob storage name, I took a guess that the Uri was too long and edited the four lines of code that get a random number so that the Uri would be shorter. So...

$randomNumber1 = Get-Random -Minimum 0 -Maximum 99999999

became (in four places)

$randomNumber1 = Get-Random -Minimum 0 -Maximum 999
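To see what that change does to the length of a generated name, here is a rough sketch. The prefix is a made-up placeholder rather than a name from the lab script, and `Get-LongestNameLength` is a hypothetical helper. The one fact it relies on is that `-Maximum` is exclusive in `Get-Random`, so the largest value returned is one less than the maximum.

```powershell
# Hypothetical helper: worst-case length of "<prefix><random number>".
function Get-LongestNameLength {
    param([string] $Prefix, [int] $Maximum)
    # -Maximum is exclusive, so the largest possible value is $Maximum - 1.
    return ("$Prefix$($Maximum - 1)").Length
}

$prefix = 'examplevhd'   # placeholder, 10 characters

foreach ($max in 99999999, 999) {
    $sample = "$prefix$(Get-Random -Minimum 0 -Maximum $max)"
    $length = Get-LongestNameLength -Prefix $prefix -Maximum $max
    "Maximum ${max}: up to $length characters (sample: $sample)"
}
```

With this placeholder the worst case drops from 18 characters to 13, so each of the four random numbers can contribute up to five fewer characters. The trade-off is a smaller pool of possible values, which makes name collisions more likely if the runbook is run many times.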

I also added -Force onto some commands to get rid of errors about not being able to ask for permission.

It then ran and created all the resources and allowed me to then complete the DSC lab.

If you didn't get any further, hopefully the above is of some help to you.