chef-boneyard / push-jobs-cookbook

Development repository for Chef Cookbook push-jobs
https://supermarket.chef.io/cookbooks/push-jobs
Apache License 2.0

Rollback of MSI can occur on Windows even if the service is already installed #109

Closed DavidR91 closed 7 years ago

DavidR91 commented 7 years ago

Cookbook version

3.2.2

Chef-client version

12.16.42

Platform Details

Windows Server 2012 R2 (mixture of on-prem and Azure)

Scenario:

If the service is already installed, leaving the job in the runlist should present no issue: nothing should happen. 99% of the time, this is true.

Occasionally, though, the MSI fails with exit code 1603 for unknown reasons (I assume this may be caused by pending installs or updates running elsewhere on the system when the job runs):

Mixlib::ShellOut::ShellCommandFailed: chef_ingredient[push-jobs-client] (push-jobs::package line 44) had an error:
Mixlib::ShellOut::ShellCommandFailed: windows_package[push-jobs-client] (c:/chef/cache/cookbooks/chef-ingredient/libraries/default_handler.rb line 51) had an error:
Mixlib::ShellOut::ShellCommandFailed: Expected process to exit with [0], but received '1603'
---- Begin output of msiexec /qn /i "c:\chef\cache\push-jobs-client-2.1.4-1-x86.msi" ----
STDOUT:
STDERR:
---- End output of msiexec /qn /i "c:\chef\cache\push-jobs-client-2.1.4-1-x86.msi" ----
Ran msiexec /qn /i "c:\chef\cache\push-jobs-client-2.1.4-1-x86.msi" returned 1603

Hitting this error seems to cause the entire package to roll back, which uninstalls the service entirely. Needless to say, that is not very helpful.
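For triage, it may help to translate the msiexec exit code into its Windows Installer meaning, since the trace captures no STDOUT or STDERR. The sketch below is illustrative only: the `MSI_EXIT_CODES` table and the `describe_msi_exit` helper are hypothetical names, not part of the cookbook, and the table covers only a handful of well-known codes.

```ruby
# Hypothetical lookup table (not part of push-jobs or chef-ingredient).
# Only a few well-known Windows Installer exit codes are listed.
MSI_EXIT_CODES = {
  0    => 'ERROR_SUCCESS: completed successfully',
  1603 => 'ERROR_INSTALL_FAILURE: fatal error during installation',
  1618 => 'ERROR_INSTALL_ALREADY_RUNNING: another installation is already in progress',
  3010 => 'ERROR_SUCCESS_REBOOT_REQUIRED: succeeded, but a reboot is required'
}.freeze

# Returns a human-readable description for an msiexec exit code,
# falling back to a generic message for codes not in the table.
def describe_msi_exit(code)
  MSI_EXIT_CODES.fetch(code, "unrecognised msiexec exit code #{code}")
end

puts describe_msi_exit(1603)
```

1603 is a generic failure code, so the actual cause usually has to come from a verbose installer log (for example, one produced with msiexec's `/l*v` logging option).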

Steps to Reproduce:

I don't have a definitive method of producing a 1603, but if you trigger any kind of MSI error during the install, you can get a rollback.

Expected Result:

Failure of the install step without rollback if the service is already installed.

Should there be a more resilient 'is this already installed' check before the MSI is kicked off, so that the MSI is not even invoked if the client is already present?

Actual Result:

The install was rolled back unexpectedly, uninstalling the existing service.

smurawski commented 7 years ago

I'm sorry you are hitting this. Unfortunately, this is a challenge with MSIs in general. When the install fails and attempts to roll back, it tries to undo all the features that were configured, which can lead to a service being uninstalled.

DavidR91 commented 7 years ago

@smurawski I totally understand the lack of desire to change the MSI, as it is ridiculously difficult to get them to roll back properly.

Could a compromise be made instead? For example, the cookbook could provide optional out-of-the-box recipes that store state on the node recording whether the client is already installed, or a wrapper that does an is_installed-style check before running and skips the job entirely if the client is already installed.
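A minimal sketch of what such a check might look like, assuming the list of installed product names has already been gathered (on Windows, typically from the `DisplayName` values under the Uninstall registry keys). The `PushJobsGuard` module and the `PRODUCT_PATTERN` display-name pattern are hypothetical and would need to be checked against what the MSI actually registers.

```ruby
# Hypothetical guard (not part of the push-jobs cookbook).
# The installed-product list is passed in so the decision logic can be
# exercised without touching a real Windows registry.
module PushJobsGuard
  # Assumed display name; verify against the name the MSI registers.
  PRODUCT_PATTERN = /push jobs client/i

  # True if any installed product name looks like the push jobs client.
  def self.installed?(installed_names)
    installed_names.any? { |name| name.to_s.match?(PRODUCT_PATTERN) }
  end

  # True only when the MSI actually needs to be invoked.
  def self.install_needed?(installed_names)
    !installed?(installed_names)
  end
end

names = ['Chef Client v12.16.42', 'Push Jobs Client v2.1.4']
puts PushJobsGuard.install_needed?(names)
```

A wrapper recipe could then include the package recipe only when `install_needed?` returns true, so msiexec is never launched on a node that already has the client. The trade-off is that a name-only check of this kind would also skip upgrades unless it compared versions as well.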

I know such a thing is not considered great practice, but this problem is actually very common in our environment (~20 nodes, mixture of Win Server 2012 + Server 2016, some in Azure and some on-prem).

I'm reasonably sure that MSI misbehaviour (when it runs on a machine where the client is already present) is also the main cause of our unexplained service stoppages.

We're constantly hit by machines being unavailable for push. In almost all cases, either the service has been uninstalled during a rollback or it has been stopped automatically (without error).