YannickRe / azuredevops-buildagents

Generate self-hosted build agents for Azure DevOps, just like Microsoft does.
MIT License

Updating the virtual machines consistently fails #36

Closed Arash-Sabet closed 1 year ago

Arash-Sabet commented 1 year ago

This issue is blocking us from creating our build agent. The screenshot below shows the stage at which the pipeline fails:

image

The error message reads as below:

2023-03-01T20:43:05.2351811Z ##[section]Starting: Update Virtual Machine Scale Set
2023-03-01T20:43:05.2485347Z ==============================================================================
2023-03-01T20:43:05.2485986Z Task : PowerShell
2023-03-01T20:43:05.2486263Z Description : Run a PowerShell script on Linux, macOS, or Windows
2023-03-01T20:43:05.2486594Z Version : 2.212.0
2023-03-01T20:43:05.2486850Z Author : Microsoft Corporation
2023-03-01T20:43:05.2487120Z Help : https://docs.microsoft.com/azure/devops/pipelines/tasks/utility/powershell
2023-03-01T20:43:05.2487507Z ==============================================================================
2023-03-01T20:43:06.4775122Z Generating script.
2023-03-01T20:43:06.4834636Z Formatted command: . 'C:\a_w\110\s\azuredevops-buildagents\scripts\update-vmss.ps1' -ClientId MASKED -ClientSecret MASKED -ResourceGroup My-Build-Agents -SubscriptionId MASKED -TenantId MASKED -VmssNames ubuntu2204buildagents -ManagedImageId "/subscriptions/MASKED/resourceGroups/My-Configuration/providers/Microsoft.Compute/images/ubuntu2204-29342"
2023-03-01T20:43:06.5261790Z ========================== Starting Command Output ===========================
2023-03-01T20:43:06.5515602Z ##[command]"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe" -NoLogo -NoProfile -NonInteractive -ExecutionPolicy Unrestricted -Command ". 'C:\a_w_temp\MASKED.ps1'"
2023-03-01T20:43:15.3183530Z ERROR: The Image type for a Virtual Machine Scale Set may not be changed.
2023-03-01T20:43:15.5248878Z Updated Virtual Machine Scale Set ManagedImageId: ubuntu2204buildagents - /subscriptions/MASKED/resourceGroups/My-Configuration/providers/Microsoft.Compute/images/ubuntu2204-29342
2023-03-01T20:43:15.7116832Z ##[error]PowerShell exited with code '1'.
2023-03-01T20:43:15.7536085Z ##[section]Finishing: Update Virtual Machine Scale Set

I just used the word MASKED to hide the actual values.

@YannickRe Could you please chime in and let me know how to solve this issue? What causes the VM scale set update to fail? Thanks.

SamBonavika commented 1 year ago

I ran into this issue because I created the VMSS instance using a "standard" Win2022 image. Per this article and the links listed within, there are certain limitations on what can be modified after the VMSS has been created.
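One way to check which kind of image an existing scale set was created from is to look at its image reference ID. The sketch below is only an illustration, not part of this repository: the function name `classify_image_id` is hypothetical, and the classification assumes that managed images and Compute Gallery images carry a resource ID under `Microsoft.Compute/images` or `Microsoft.Compute/galleries`, while platform (marketplace) images have no such ID.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: classify a VMSS image reference by its resource ID.
# Assumption: an empty or unrecognized ID means the scale set was built from a
# platform (marketplace) image or some other image source.
classify_image_id() {
  case "${1:-}" in
    */providers/Microsoft.Compute/images/*)    echo "managed-image" ;;
    */providers/Microsoft.Compute/galleries/*) echo "gallery-image" ;;
    *)                                         echo "platform-or-other" ;;
  esac
}

# Example with the (masked) image ID from the log above.
classify_image_id "/subscriptions/MASKED/resourceGroups/My-Configuration/providers/Microsoft.Compute/images/ubuntu2204-29342"

# Against a real scale set (requires `az login`), the ID could be fetched with:
#   az vmss show --resource-group My-Build-Agents --name ubuntu2204buildagents \
#     --query "virtualMachineProfile.storageProfile.imageReference.id" --output tsv
```

If the existing scale set reports `platform-or-other` while the pipeline tries to assign a managed image, that mismatch is a likely cause of the "Image type ... may not be changed" error.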

The only solution I found was to use the modified version of this code base put together by Erik de Bont, saving the created image to an Azure Compute Gallery and then creating a new VMSS using that image.

There's a major gotcha with Erik's code:

(I couldn't figure out how to create an "issue" in Erik's fork, so I figured I'd wait until Yannick accepted his PR and then create the issue in this repo.)

Once I had the image in the Gallery, I used an az vmss create operation to spin up a new VMSS referencing the Gallery image. I have not had a chance to verify whether updating the VMSS to a new version of the image will work, but I've got my fingers crossed! ;-}

YannickRe commented 1 year ago

With my code, the resulting Managed Image stays available. Destroy your VMSS, create a new one referencing the Managed Image that you just created, and the next run will complete successfully. This has been a known issue for a while. It's a chicken-and-egg problem: you need your own agent to run a pipeline this long, but you don't have the image yet, so the VMSS can't be updated with your generated image. Erik's fork will fix this, but I can't get to the PR review atm :(

Arash-Sabet commented 1 year ago

@YannickRe The VMSS instance that I have is Ubuntu 22.04 LTS Gen2. The VM size follows Microsoft's recommendation. I don't see a reason why it should fail! What am I supposed to do?

Arash-Sabet commented 1 year ago

@YannickRe Just an update: I realized that the managed image created by your pipeline has the properties shown in the following screenshot:

image

Does that mean I should destroy the VM scale set and recreate it as Ubuntu 22.04 LTS Gen 1? When will your pipeline start supporting Gen 2?

YannickRe commented 1 year ago

You should recreate the VMSS, but use the generated Managed Image. When you create the VMSS from an MS image, you can't replace or update it afterwards with the generated image. Create a new VMSS from the generated image and run the pipeline again; from then on it will be able to update the VMSS with each newly generated image.
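The recreate step could be sketched with the Azure CLI roughly as follows. This is a minimal sketch under stated assumptions, not the repository's own tooling: the resource group, scale set name, and image ID are taken from the masked log earlier in this thread, while the VM size, admin user name, instance count, and the remaining `az vmss create` flags are assumptions modeled on common settings for Azure DevOps scale set agents. The `run` and `recreate_vmss` helpers are hypothetical. By default the script only prints the commands; setting `RUN=1` executes them, which deletes the existing scale set.

```shell
#!/bin/sh
set -eu

RESOURCE_GROUP="My-Build-Agents"          # from the log in this thread
VMSS_NAME="ubuntu2204buildagents"         # from the log in this thread
IMAGE_ID="/subscriptions/MASKED/resourceGroups/My-Configuration/providers/Microsoft.Compute/images/ubuntu2204-29342"
VM_SKU="Standard_D2s_v3"                  # assumption: pick the size you need

# Hypothetical helper: print the command (dry run) unless RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    for arg in "$@"; do
      # Show empty arguments explicitly so `--load-balancer ""` stays visible.
      if [ -z "$arg" ]; then printf '"" '; else printf '%s ' "$arg"; fi
    done
    printf '\n'
  fi
}

recreate_vmss() {
  # Remove the scale set that was created from a Microsoft-provided image.
  run az vmss delete --resource-group "$RESOURCE_GROUP" --name "$VMSS_NAME"

  # Create a new scale set that references the generated managed image.
  run az vmss create \
    --resource-group "$RESOURCE_GROUP" \
    --name "$VMSS_NAME" \
    --image "$IMAGE_ID" \
    --vm-sku "$VM_SKU" \
    --instance-count 2 \
    --disable-overprovision \
    --upgrade-policy-mode manual \
    --single-placement-group false \
    --platform-fault-domain-count 1 \
    --load-balancer "" \
    --orchestration-mode Uniform \
    --admin-username azureuser \
    --generate-ssh-keys
}

recreate_vmss
```

Review the printed commands first, then rerun with `RUN=1` once the values match your environment. After the new scale set exists, the next pipeline run should be able to swap in newly generated images, since the image type no longer changes.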

Arash-Sabet commented 1 year ago

@YannickRe Unfortunately, the update failed even though I created a new VMSS using the generated image! I'm not sure what else to look into to solve this problem.

Arash-Sabet commented 1 year ago

@YannickRe It finally worked. The reason for the last failure was a wrong variable name. Thanks.