Azure / azure-rest-api-specs

The source for REST API specifications for Microsoft Azure.
MIT License

Cannot properly install Microsoft.Azure.ActiveDirectory.AADLoginForWindows extension on a VM #10234

Open ArcturusZhang opened 4 years ago

ArcturusZhang commented 4 years ago

Initially reported in https://github.com/terraform-providers/terraform-provider-azurerm/issues/7748

In the portal, the extension installs successfully on a Windows VM with the following configuration:

{
  "autoUpgradeMinorVersion": true,
  "forceUpdateTag": null,
  "id": "/subscriptions/...",
  "instanceView": null,
  "location": "westeurope",
  "name": "AADLoginForWindows",
  "protectedSettings": null,
  "provisioningState": "Succeeded",
  "publisher": "Microsoft.Azure.ActiveDirectory",
  "resourceGroup": "rg-d-we1-wherescape-poc",
  "settings": null,
  "tags": null,
  "type": "Microsoft.Compute/virtualMachines/extensions",
  "typeHandlerVersion": "0.3",
  "virtualMachineExtensionType": "AADLoginForWindows"
}

but the same configuration does not work with Terraform (which is equivalent to calling the REST API directly):

resource "azurerm_virtual_machine_extension" "aad" {
  name                       = "aad-${local.resource_name_suffix}"
  publisher                  = "Microsoft.Azure.ActiveDirectory"
  type                       = "AADLoginForWindows"
  type_handler_version       = "0.3"
  auto_upgrade_minor_version = true
  virtual_machine_id         = azurerm_windows_virtual_machine.scheduler.id

  tags = local.tags
}

and the service returns this error:

Error: Code="VMExtensionHandlerNonTransientError" Message="The handler for VM extension type 'Microsoft.Azure.ActiveDirectory.AADLoginForWindows' has reported terminal failure for VM extension 'aad-d-we1-wherescape-poc' with error message: 'Install failed for plugin (name: Microsoft.Azure.ActiveDirectory.AADLoginForWindows, version 0.4.1.1) with exception Command C:\\Packages\\Plugins\\Microsoft.Azure.ActiveDirectory.AADLoginForWindows\\0.4.1.1\\AADLoginForWindowsHandler.exe of Microsoft.Azure.ActiveDirectory.AADLoginForWindows has exited with Exit code: -2145648639'.\r\n    \r\n'Install handler failed for the extension. More information on troubleshooting is available at https://aka.ms/vmextensionwindowstroubleshoot'"

An extension installed through Terraform or the REST API provisions successfully when typeHandlerVersion is changed to 1.0, but it does not work properly afterwards.
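The negative exit code in the error above is a signed 32-bit number. Windows error values are often quoted in hexadecimal, which tends to be easier to search for. A minimal Python sketch of the conversion follows; the helper name `exit_code_to_hex` is made up for illustration and is not part of any Azure tooling, and the sketch makes no claim about what the resulting value means.

```python
def exit_code_to_hex(exit_code: int) -> str:
    """Render a signed 32-bit process exit code as an unsigned hex string."""
    # Masking with 0xFFFFFFFF reinterprets the two's-complement bit pattern
    # as an unsigned 32-bit integer.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


# Exit code copied from the error message in this issue.
print(exit_code_to_hex(-2145648639))  # prints 0x801C0001
```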

ghost commented 4 years ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @Drewm3, @axayjo.

Drewm3 commented 4 years ago

@amjads1, could you look into this issue?

magnus-longva-bouvet commented 4 years ago

I'm having a similar issue. I can install the extension via az vm extension set --publisher Microsoft.Azure.ActiveDirectory --name AADLoginForWindows on Azure CLI 2.9.0, but not through ARM template deployment. Upon ARM template deployment, I get this error:

"message": "The handler for VM extension type 'Microsoft.Azure.ActiveDirectory.AADLoginForWindows' has reported terminal failure for VM extension 'Microsoft.Azure.ActiveDirectory.AADLoginForWindows' with error message: 'Install failed for plugin (name: Microsoft.Azure.ActiveDirectory.AADLoginForWindows, version 0.4.1.1) with exception Command C:\\Packages\\Plugins\\Microsoft.Azure.ActiveDirectory.AADLoginForWindows\\0.4.1.1\\AADLoginForWindowsHandler.exe of Microsoft.Azure.ActiveDirectory.AADLoginForWindows has exited with Exit code: -2145648639'.\r\n \r\n'Install handler failed for the extension. More information on troubleshooting is available at https://aka.ms/vmextensionwindowstroubleshoot'"

amjads1 commented 4 years ago

Looking into this issue and hoping to have an update asap.

amjads1 commented 4 years ago

I have reached out to the extension owners and am awaiting their response.

amjads1 commented 4 years ago

Adding @srrstepe, who is working on this issue. Can you please share more details and the information you are looking for from the user?

srrstepe commented 4 years ago

Currently there is an issue in the Azure Device Registration service where device registration requests fail if the resource name is longer than 15 characters. This issue has been fixed and the fix is currently being deployed.

Looking at the resource group name “rg-d-we1-wherescape-poc”, I am assuming the resource name would have been longer than 15 characters as well. To confirm whether the issue seen by @ArcturusZhang is the same as the one mentioned above, we need the following logs:

  • Get the public scripts here: https://1drv.ms/u/s!AkyTjQ17vtfagYkZ6VJzPg78e3o7PQ
  • RDP/log in to a VM instance using the default admin user credentials. (The user can enable the admin account on the VM from the portal -> vm -> operation -> run command -> Enable Admin Account.)
  • Copy the public scripts downloaded from the above link.
  • Open an admin command prompt and run start_ngc_tracing_public.cmd.
  • Stop the logging script by executing stop_ngc_tracing_public.cmd.
  • Please zip and send us the logs under %SYSTEMDRIVE%\TraceDJPP* for analysis.
  • Also please collect the logs in the following folders:
      o %systemDrive%\WindowsAzure\Logs
      o %systemDrive%\WindowsAzure\CollectGuestLogsTemp

amjads1 commented 4 years ago

Thanks for the update @srrstepe !

@ArcturusZhang - Can you please help us in getting the above requested details so that we can confirm and close on this issue asap?

ArcturusZhang commented 4 years ago

Hi @amjads1 and @srrstepe, thanks for the investigation! However, I only migrated this issue here; I am not the original author. I have asked the original author to respond to your request.

amjads1 commented 4 years ago

Thanks for the response @ArcturusZhang . Please update with user comments on the requested information.

hugo-paredes commented 4 years ago

Hi @ArcturusZhang and @amjads1

I've followed the steps indicated in here and attached the logs.

Hope it helps. Hugo

CollectGuestLogsTemp.zip Logs.zip TraceDJPP.zip

amjads1 commented 4 years ago

Thanks for sharing the logs @hugo-paredes !

@srrstepe - Can you please look into the logs and confirm if the reported issue is same as the known issue where ADRS is making a change to allow VM resource names up to 64 chars – up from 15?

amjads1 commented 4 years ago

@srrstepe - Can you confirm the issue and its status based on the logs provided by @hugo-paredes ?

srrstepe commented 4 years ago

I looked at the logs provided by @hugo-paredes . I can confirm that the issue they were seeing is in fact the issue mentioned above, i.e. a resource name longer than 15 characters. Azure resource Id: /subscriptions/9faac4cb-9dc7-4672-85f6-44027ead7635/resourceGroups/rg-l-we1-fakegdp-main/providers/Microsoft.Compute/virtualMachines/vm-l-we1-fakegdp-op-vm
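For anyone who wants to check their own VM against the 15-character limit described in this thread, here is a rough Python sketch. The function names and the `MAX_NAME_LENGTH` constant are hypothetical, and the subscription ID is replaced by a placeholder; only the resource ID layout and the limit itself come from the comments above.

```python
MAX_NAME_LENGTH = 15  # limit described in this thread; assumed, not read from any API


def vm_name_from_resource_id(resource_id: str) -> str:
    """Return the segment that follows 'virtualMachines' in an Azure resource ID."""
    parts = resource_id.strip("/").split("/")
    lowered = [part.lower() for part in parts]
    index = lowered.index("virtualmachines")  # raises ValueError if absent
    return parts[index + 1]


def exceeds_name_limit(name: str, limit: int = MAX_NAME_LENGTH) -> bool:
    """True when the name is longer than the limit."""
    return len(name) > limit


resource_id = (
    "/subscriptions/<subscription-id>/resourceGroups/rg-l-we1-fakegdp-main"
    "/providers/Microsoft.Compute/virtualMachines/vm-l-we1-fakegdp-op-vm"
)
name = vm_name_from_resource_id(resource_id)
print(name, len(name), exceeds_name_limit(name))  # prints vm-l-we1-fakegdp-op-vm 22 True
```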

This issue has been fixed by the ADRS team. I see from the logs that the user was indeed able to successfully install the extension on the VM and that the device is AAD joined. Handler Status: [{"status":{"code":0,"formattedMessage":{"lang":"en-US","message":"Successfully joined machine to AAD."},"name":"Microsoft.Azure.ActiveDirectory.AADLoginForWindows","operation":"AADJoin","status":"success","substatus":null},"timestampUTC":"\/Date(1597229907302)\/","version":"1"}]
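The handler status above is plain JSON with a `/Date(milliseconds)/` style timestamp, so it can be read with a few lines of Python. This is only a sketch: `parse_handler_status` is a made-up helper, and it assumes the status always has the shape shown in this comment.

```python
import json
import re
from datetime import datetime, timedelta, timezone

# Handler status copied from the comment above.
HANDLER_STATUS = r'''[{"status":{"code":0,"formattedMessage":{"lang":"en-US","message":"Successfully joined machine to AAD."},"name":"Microsoft.Azure.ActiveDirectory.AADLoginForWindows","operation":"AADJoin","status":"success","substatus":null},"timestampUTC":"\/Date(1597229907302)\/","version":"1"}]'''


def parse_handler_status(raw: str) -> dict:
    """Extract operation, status, message and a UTC timestamp from the first entry."""
    entry = json.loads(raw)[0]
    status = entry["status"]
    match = re.fullmatch(r"/Date\((\d+)\)/", entry["timestampUTC"])
    if match is None:
        raise ValueError("unexpected timestamp format")
    millis = int(match.group(1))
    # Integer arithmetic avoids floating-point rounding of the milliseconds.
    timestamp = datetime.fromtimestamp(millis // 1000, tz=timezone.utc) + timedelta(
        milliseconds=millis % 1000
    )
    return {
        "operation": status["operation"],
        "status": status["status"],
        "message": status["formattedMessage"]["message"],
        "timestamp": timestamp,
    }


result = parse_handler_status(HANDLER_STATUS)
print(result["operation"], result["status"], result["timestamp"].isoformat())
# prints AADJoin success 2020-08-12T10:58:27.302000+00:00
```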

amjads1 commented 4 years ago

Thanks for the confirmation and updates @srrstepe . Appreciate the help!

@ArcturusZhang , @hugo-paredes - Based on the updates from @srrstepe , it looks like the issue is resolved and you should now be able to install the Microsoft.Azure.ActiveDirectory.AADLoginForWindows extension without any errors. Can you please confirm from your side as well so that we can close this issue?

amjads1 commented 4 years ago

@ArcturusZhang , @hugo-paredes - Based on the above resolution and confirmation from the AAD team, this issue should be resolved now. I will go ahead and close this issue but feel free to re-open if the issue is not yet resolved. Thank you!

hugo-paredes commented 4 years ago

Hi @amjads1

I've just tried to log in to a VM via AAD and I seem to be able to authenticate. However, I cannot log in and I get this message:

image

Is there any other setting I need to specify to allow this?

Greetings. Hugo

ArcturusZhang commented 4 years ago

It seems there are still issues, so I am reopening this issue.

amjads1 commented 4 years ago

Updates from @srrstepe -

The user might see this error if Conditional Access policies configured in the customer’s tenant require MFA for every resource. If this is the case, did they try logging in using Next Generation Credentials (NGC)? If the customer does not have NGC, then they should consider excluding the “Azure Windows VM Sign-In” app from the CA policy.

If none of this works, please ask the customer to collect logs using the following instructions:

  1. RDP/log in to the Azure VM using the default admin user credentials.
  2. Go to https://github.com/CSS-Windows/WindowsDiag/blob/master/ADS/AUTH/Auth.zip
  3. Download the ZIP file to the client and extract it.
  4. Rename start-auth.txt and stop-auth.txt to .bat files.
  5. Create a folder "MSLogs" and move start-auth.bat and stop-auth.bat into it.
  6. Open an admin command prompt and execute start-auth.bat.
  7. Close the previous RDP session and try to RDP/log in to the Azure VM using the AAD credentials.
  8. Close the RDP session again and RDP/log in using the default admin credentials.
  9. Execute stop-auth.bat from the admin prompt.
  10. Collect the logs created under the "MSLogs" folder.
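Steps 4 and 5 above (rename the two .txt files to .bat and put them in an "MSLogs" folder) could be scripted roughly as follows. This is a sketch under assumptions: `prepare_auth_scripts` is a hypothetical helper, and both paths in the usage comment are placeholders for wherever the ZIP was extracted and wherever the logs folder should live.

```python
import shutil
from pathlib import Path


def prepare_auth_scripts(extracted_dir: Path, logs_dir: Path) -> list:
    """Rename start-auth.txt/stop-auth.txt to .bat and move them into logs_dir."""
    logs_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for stem in ("start-auth", "stop-auth"):
        source = extracted_dir / f"{stem}.txt"
        target = logs_dir / f"{stem}.bat"
        # shutil.move renames and relocates the file in one call.
        shutil.move(str(source), str(target))
        moved.append(target)
    return moved


# Example call with placeholder paths:
# prepare_auth_scripts(Path(r"C:\Temp\Auth"), Path(r"C:\MSLogs"))
```

The remaining steps (running the .bat files from an admin prompt around the failing AAD login) still have to be done by hand.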

@hugo-paredes - Can you try the steps highlighted above and update if this resolves your issue?

ghost commented 4 years ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @adamedx.

Drewm3 commented 4 years ago

Moving this over to AAD as this issue looks to be in the AAD extension as compared to the extension framework itself.

hugo-paredes commented 3 years ago

Hi. Is there any update to this issue? Thanks

akingscote commented 3 years ago

@hugo-paredes is this the same for yourself? I'm having the same problem unless I disable MFA, which I do not want to do.

hugo-paredes commented 3 years ago

@akingscote I don't think so.

When I create a VM using the Azure Portal and enable the AAD login, I'm able to log in to the VM without any problem. But when I do it via the az cli, it doesn't work.

azMantas commented 3 years ago

I am also fighting with this issue and am not able to deploy the AAD login extension for Windows with an ARM template, even though I am using the same image and copied all the extension settings from resources.portal.com. All good with Linux VMs. Any updates?
P.S. The resource group name is only 3 characters long, and the VM name is also just a few letters.

CFinley22 commented 2 years ago

any updates?

basvandesande commented 1 year ago

The error still exists. I am trying to deploy the VM and the extension using Bicep.

vvasyliu commented 6 months ago

I have the same problem with the extension on Windows Datacenter 2019. I tried to run the command on the VM manually:

PS C:\Packages\Plugins\Microsoft.Azure.ActiveDirectory.AADLoginForWindows\1.3.0.0> .\AADLoginForWindowsHandler.exe install

and got:

[Error]: Could not find if this OS version supports Azure VM Secure join. Exception: System.IO.FileNotFoundException: C:\windows\system32\dsreg.dll

I think this is the root cause. But according to the requirements, Windows Datacenter 2019 is supported:

Windows Server 2019 Datacenter and later

Copy files from healthy windows to the vm system32\

  • dsreg.dll
  • dsreg.dll.mui
  • dsregcmd.exe
  • dsregcmd.exe.mui
  • dsregtask.dll
  • dsregtask.dll.mui

After that it works. image Enjoy!
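Before copying anything, it may help to see which of the dsreg files named in this thread are actually missing on the VM. A small Python sketch, with assumptions called out: `missing_dsreg_files` is a made-up helper, the file list is taken from this thread rather than from any official requirement, and it only looks directly in the folder it is given (on some installs the .mui files may sit in a language subfolder instead, so treat those results with care).

```python
import os
from pathlib import Path

# File names as listed in this thread; not an official list.
DSREG_FILES = [
    "dsreg.dll",
    "dsreg.dll.mui",
    "dsregcmd.exe",
    "dsregcmd.exe.mui",
    "dsregtask.dll",
    "dsregtask.dll.mui",
]


def missing_dsreg_files(system32: Path) -> list:
    """Return the names from DSREG_FILES that are not regular files in system32."""
    return [name for name in DSREG_FILES if not (system32 / name).is_file()]


if __name__ == "__main__":
    # SystemRoot is normally set on Windows; C:\Windows is the assumed fallback.
    system32 = Path(os.environ.get("SystemRoot", r"C:\Windows")) / "System32"
    missing = missing_dsreg_files(system32)
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All listed dsreg files are present.")
```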

825i commented 4 months ago

I am trying to install this extension:

Microsoft.Azure.ActiveDirectory.AADLoginForWindows

of version: 2.2.0

on

  source_image_reference {
    offer     = "WindowsServer"
    publisher = "MicrosoftWindowsServer"
    sku       = "2022-datacenter-azure-edition-core"
    version   = "latest"
  }

using Terraform.

However I am getting the following error:

Install failed for plugin (name: Microsoft.Azure.ActiveDirectory.AADLoginForWindows, version 2.2.0.0) with exception Command C:\Packages\Plugins\Microsoft.Azure.ActiveDirectory.AADLoginForWindows\2.2.0.0\AADLoginForWindowsHandler.exe of Microsoft.Azure.ActiveDirectory.AADLoginForWindows has exited with Exit code: 1

I can't find any information that this extension would not be supported on this Windows SKU. When trying to debug the error I found this issue tracker.

I did see that there's the resource name limit, but I assume that this has been patched in the last 4 years? Yet I can't see any confirmation from these replies that it has. My resource name is definitely over 15 characters.

Can someone please confirm that the character limit is still an issue or provide an alternative path for me to debug? I'm not currently able to reduce the resource group name because it follows a specific format of "<productname>-<dev>-<username>-<randsuffix>-<resourcetype>" and I can't edit this policy.

I've opened an issue here, as this seems to potentially be a Terraform issue: https://github.com/hashicorp/terraform-provider-azurerm/issues/25810

EDIT: Nope, this is definitely Azure's fault. Not Terraform's.

825i commented 4 months ago

Copy files from healthy windows to the vm system32\

  • dsreg.dll
  • dsreg.dll.mui
  • dsregcmd.exe
  • dsregcmd.exe.mui
  • dsregtask.dll
  • dsregtask.dll.mui

After that it works image Enjoy!

I just tried your example here (minus copying the files, as I have nowhere else to copy them from).

I got the exact same error as you did:

2024-05-02T12:12:45.9183018Z    [Information]:  10.0.20348.1070 (WinBuild.160101.0800)
2024-05-02T12:12:45.9183018Z    [Information]:  Getting Dsregcmd capabilities.
2024-05-02T12:12:45.9183018Z    [Error]:        Could not find if this OS version supports Azure VM Secure join.

This seems to be a Microsoft-side problem entirely. This SKU should be completely supported: 2022-datacenter-azure-edition-core

825i commented 4 months ago

Ok I managed to fix this entirely by using:

2022-datacenter-azure-edition-hotpatch

It seems that although core is supposedly supported, it simply isn't. I had no problem doing this on a different SKU, automatically via Terraform. Microsoft simply needs to add the 2022-datacenter-azure-edition-core SKU to a list of unsupported SKUs for the AADLoginForWindows extension.

I would recommend others try a different SKU if they get this issue. Your SKU is likely just not supported despite whatever Microsoft documentation says/doesn't say.