chef-partners / azure-chef-extension

The development repository for the Chef Extension available through Azure
Apache License 2.0

Install fails if MSI installer in use when Azure Extension applied #328

Open trickyearlobe opened 4 years ago

trickyearlobe commented 4 years ago

If the Azure Chef Client extension is deployed to a machine along with another extension that needs the MSI installer, the Chef Client extension fails because the MSI installer is busy.

Waits and retries are not handled correctly in the extension. If a DINE (Deploy If Not Exists) policy triggers a retry, the client eventually installs, but the extension does not update its status in Azure.

ayushbhatt29 commented 4 years ago

We tried to reproduce the issue by deploying two MSI-based extensions simultaneously. The Azure Chef Client extension installed without failure once the first extension's installation had completed, and the status of both extensions was reported to the Azure portal as success.

Steps we took: we ran the command below from PowerShell to deploy the Azure Chef extension and, at the same time, deployed another extension through the Azure portal. Command:

az vm extension set --resource-group "ash-airgap-grp" --vm-name "ayu-win1" --name ChefClient --publisher Chef.Bootstrap.WindowsAzure --version 1210.13.4.1 --no-auto-upgrade true --protected-settings "{'validation_key': '', 'client_key': '', 'client_rb': 'D:\chef\chef-repo-ash\chef-repo\chef-repo\.chef\client.rb'}" --settings "{ 'bootstrap_options': { 'chef_server_url': '', 'chef_node_name': 'ayu-newtest', 'node_ssl_verify_mode': 'none' }, 'runlist': '[recipe[cbk1::default]]', 'CHEF_LICENSE':'accept', 'chef_package_url':'https://storageblobpkgash.blob.core.windows.net/packages/chef-client-16.4.41-1-x64.msi'}"

Result:

The Azure Chef Client extension was installed after the other extension finished installing on the VM. Both were registered in the Azure portal, and the node was also created in Chef Manage.


We may need some more details to reproduce the issue.

trickyearlobe commented 3 years ago

https://github.com/chef-partners/azure-chef-extension/pull/344 may help with diagnosing this issue

btm commented 3 years ago

@ayushbhatt29 This working for you is a matter of fortunate timing. Windows Installer only allows a single instance of InstallExecuteSequence to run at a time, and other installations fail instead of waiting.

There are a number of ways to poll if Windows Installer is currently active to reduce (but not eliminate) the likelihood of hitting the race condition. One is looking for the HKLM\Software\Microsoft\Windows\CurrentVersion\Installer\InProgress registry key (keep the wow64 registry redirector in mind). A more crude one would be to look for a currently executing msiexec process.

We should add a check for one of these and wait for completion. At the very least, we can log when this happens. However, there is a real risk that waiting will push us past the limited time the Azure extension framework allows, so the install fails anyway.
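To make the trade-off above concrete, here is a minimal Python sketch of waiting with a deadline. It is not the extension's code: `wait_for_installer_idle`, the injected `is_busy` probe, and the default timeout and interval are all hypothetical, and the probe could be any of the detection methods mentioned in this thread.

```python
import time
from typing import Callable


def wait_for_installer_idle(
    is_busy: Callable[[], bool],
    timeout_s: float = 600.0,  # assumed budget; the real Azure limit is not stated in this thread
    interval_s: float = 5.0,   # assumed polling interval
    log: Callable[[str], None] = print,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_busy() until it returns False or the time budget runs out.

    Returns True if the installer looked idle before the deadline and False
    otherwise. A True result only narrows the race window: another installer
    can still start between the last probe and our own msiexec call.
    """
    deadline = clock() + timeout_s
    while is_busy():
        remaining = deadline - clock()
        if remaining <= 0:
            log("Windows Installer still busy; giving up waiting")
            return False
        log(f"Windows Installer busy; retrying ({remaining:.0f}s left)")
        # Never sleep past the deadline.
        sleep(min(interval_s, remaining))
    return True
```

Logging on every busy probe gives the diagnostic output asked for, and returning False at the deadline lets the caller fail with a clear message rather than being cut off by the framework.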

trickyearlobe commented 3 years ago

The customer that originally reported this to me used #344 to identify that there was an MSI fight going on.

The workaround they are using in their Azure DINE (Deploy If Not Exists) policy is to

DINE policy dependencies are not always possible, so we should still fix the retry mechanism so that it correctly reports the deployed status to the Azure framework.

Inconsistent reporting of deployment status when the install/bootstrap is retried is still a problem for them when the deployment fails for other reasons (Chef server offline, network problems, etc.).

ayushbhatt29 commented 3 years ago

Thanks @btm and @trickyearlobe, we will work on it.

TerrieB1 commented 3 years ago

From Richard Nixon: The request to add MSI installer logging was actually implemented, and Aftab managed to diagnose that there was indeed an "MSI fight" going on. UBS have now made AzureChefExtension dependent on the clashing Extension which causes them to install serially (meaning they no longer fight with each other).

We do still need to fix the problem that if the install has to be retried (typically Azure reruns it after 90 mins), the status is not correctly reported to the Azure framework, so the extension looks as if it hasn't deployed correctly.
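One way to picture the status problem: every attempt, including a retry, should end by writing its own outcome, so that a later success overwrites an earlier failure. The Python sketch below is illustrative only. `install_and_report`, `install`, and `report` are hypothetical stand-ins rather than functions from the extension, and the status strings are assumed to match what the Azure framework expects.

```python
from typing import Callable


def install_and_report(
    install: Callable[[], None],
    report: Callable[[str, str], None],
) -> bool:
    """Run one install attempt and always report how it ended.

    `install` performs the attempt and raises on failure; `report(status, message)`
    stands in for whatever writes the status that Azure reads. Both are hypothetical.
    """
    report("transitioning", "install started")
    try:
        install()
    except Exception as exc:
        # Record the failure so the framework does not see a stale state.
        report("error", f"install failed: {exc}")
        return False
    # A successful retry replaces any earlier "error" status.
    report("success", "install completed")
    return True
```

Calling this once per attempt means the last status written always describes the most recent attempt, whichever path it took.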

trickyearlobe commented 3 years ago

In addition, Aftab mentioned in https://getchef.zendesk.com/agent/tickets/27765 that the issue occurs in about 20% of cases when the following extensions are configured without the dependencies trick.

caroysMSFT commented 3 years ago

"There are a number of ways to poll if Windows Installer is currently active to reduce (but not eliminate) the likelihood of hitting the race condition. One is looking for the HKLM\Software\Microsoft\Windows\CurrentVersion\Installer\InProgress registry key (keep the wow64 registry redirector in mind). A more crude one would be to look for a currently executing msiexec process."

I believe what you are looking for is the _MSIExecute mutex:

https://docs.microsoft.com/en-us/windows/win32/msi/-msiexecute-mutex

Msiexec can be running, and not actually holding the mutex. This gets you as close as you can get to avoiding the race condition.

trickyearlobe commented 3 years ago

Thanks @caroysMSFT, that's helpful. @ayushbhatt29, looks like we just need to call QueryServiceStatusEx to get the status of the MSI Installer service to avoid an MSI fight.

We still need to make sure we correctly update the state info for Azure (important for reporting on deployment state in large estates).

caroysMSFT commented 3 years ago

QueryServiceStatusEx was the wrong takeaway from that article. You need to query the named system mutex "_MSIExecute" to see if someone is holding on to it.

This article is a good start
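As a rough illustration of querying the mutex, here is a Python sketch using ctypes. It is a sketch under assumptions, not the extension's implementation: the function names are hypothetical, the `Global\` prefix is assumed from the Windows Installer documentation for the mutex, and the process is assumed to have permission to open it. It opens the mutex and does a zero-timeout wait; a timeout means someone else is holding it.

```python
import ctypes
import sys
from typing import Optional

SYNCHRONIZE = 0x00100000     # access right needed to wait on a mutex
WAIT_OBJECT_0 = 0x00000000   # we acquired the mutex: nobody was holding it
WAIT_ABANDONED = 0x00000080  # we acquired it after its owner exited without releasing
WAIT_TIMEOUT = 0x00000102    # someone else is holding it
ERROR_FILE_NOT_FOUND = 2     # the named mutex does not exist


def held_by_other(wait_result: int) -> bool:
    """Interpret a zero-timeout WaitForSingleObject result on the mutex."""
    if wait_result == WAIT_TIMEOUT:
        return True
    if wait_result in (WAIT_OBJECT_0, WAIT_ABANDONED):
        return False
    raise OSError(f"unexpected wait result: {wait_result:#x}")


def msi_execute_mutex_held(name: str = r"Global\_MSIExecute") -> Optional[bool]:
    """Return True/False on Windows, or None where the check cannot run."""
    if sys.platform != "win32":
        return None
    from ctypes import wintypes

    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.OpenMutexW.argtypes = [wintypes.DWORD, wintypes.BOOL, wintypes.LPCWSTR]
    k32.OpenMutexW.restype = wintypes.HANDLE
    k32.WaitForSingleObject.argtypes = [wintypes.HANDLE, wintypes.DWORD]
    k32.WaitForSingleObject.restype = wintypes.DWORD
    k32.ReleaseMutex.argtypes = [wintypes.HANDLE]
    k32.ReleaseMutex.restype = wintypes.BOOL
    k32.CloseHandle.argtypes = [wintypes.HANDLE]
    k32.CloseHandle.restype = wintypes.BOOL

    handle = k32.OpenMutexW(SYNCHRONIZE, False, name)
    if not handle:
        if ctypes.get_last_error() == ERROR_FILE_NOT_FOUND:
            return False  # mutex not created, so no install is executing
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        busy = held_by_other(k32.WaitForSingleObject(handle, 0))
        if not busy:
            k32.ReleaseMutex(handle)  # give back the ownership the wait just took
        return busy
    finally:
        k32.CloseHandle(handle)
```

Even with this check, another installer can take the mutex right after it is released, so the result is best treated as a hint to wait rather than a guarantee.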

ayushbhatt29 commented 3 years ago

Thanks @caroysMSFT for the suggestion.

RoyShravani commented 2 years ago

We tried recreating the issue and introduced the MSI installer logging using

Test-Path HKLM:\Software\Microsoft\Windows\CurrentVersion\Installer\InProgress

However, we were unable to reproduce the issue at hand.

When we installed the Chef extension along with other extensions that require the MSI installer, the installation succeeded every time, and the MSI installer logging never recorded a clash.


I believe we would require some more details on how to recreate this issue.