joefitzgerald / packer-windows

Windows Packer Templates
MIT License

SSH Starts before Windows Updates finish #59

Closed kensykora closed 10 years ago

kensykora commented 10 years ago

I'm pretty sure this is an issue -- in the latest version, the secure delete and defragger were running while Windows updates were still installing. You updated OpenSSH in the latest version; does that now open up SSH as soon as the install finishes?

kensykora commented 10 years ago

I just ran it again and got a clean run... I'm not sure what triggered the issue. I was in headless mode. I'll try to reproduce it in headless mode again and report logs.

stonith commented 10 years ago

I'm also seeing inconsistent results from Windows updates. I may end up moving them to the provisioner if I can get that working more consistently.

joefitzgerald commented 10 years ago

Windows updates run where they do (and without SSH started) to permit reboots to occur; if you move that step to provisioners, Packer will fail in the provisioning step when you need to reboot.

The update to the OpenSSH version should not have caused it to start - it simply instructed the script (https://github.com/joefitzgerald/packer-windows/blob/master/scripts/openssh.ps1) to download a newer version. In looking at the script, there may be a race condition between the installer starting sshd and the script stopping it.

There does not appear to be a flag to prevent the installer from attempting to start sshd. Yet another case for completing a WinRM communicator for Packer.

kensykora commented 10 years ago

That's definitely it -- you're starting the process then immediately stopping it. In between those calls it's possible that packer attempts to make another SSH connection and starts doing its thing. Is there a reason you need to start it there? You already run the MSI installer, but why does it need to start? Could it be started after updates finish?

joefitzgerald commented 10 years ago

Packer isn't very ... resilient to restarts. The most common reason a restart might be required is Windows Updates, but they present a unique challenge. Because the set of updates changes over time, an unknown number (n) of restarts may be required, so you cannot use a series of provisioners (with pause_before to allow a reboot to occur after a prior provisioner completes) to reliably install all Windows updates.

Consequently, we do this in Autounattend.xml and suppress the start of OpenSSH so that Packer does not even begin provisioning until all reboots are complete (meaning all currently available Windows updates are installed). If Packer does manage to connect in the 1-5 seconds that the OpenSSH service is running, then the first provisioner may fail. That sucks, so we may need to move the OpenSSH install so that the updates PowerShell script triggers it.

Or, we could do as @sneal suggests and complete the WinRM communicator for Packer, but we will still need to ensure that we enable WinRM without starting it until updates are complete.

stonith commented 10 years ago

We might be able to (ugly) hack around this temporarily by putting "stop-process -force -processname sshd" in the win_updates.ps1 file and adding a pause_before to the first provisioner block. I'll see if I can get this to work. Ultimately, the WinRM communicator is ideal, although I believe there are restrictions on running Windows updates remotely.
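
A rough sketch of that stop-gap, assuming the daemon really does show up as a process named `sshd` (the name quoted above). The guard around the kill is my own addition, so the updates script does not error out when nothing is running:

```powershell
# Assumption: the OpenSSH daemon appears as a process named "sshd".
# Look it up quietly; Get-Process returns nothing if no such process exists.
$running = Get-Process -Name sshd -ErrorAction SilentlyContinue

if ($running) {
    # Force-kill every matching process so Packer cannot connect
    # while updates and reboots are still in progress.
    $running | Stop-Process -Force
}
```

This only narrows the window; it does not stop the installer from starting sshd in the first place.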

stonith commented 10 years ago

@joefitzgerald What specific issues did you run into with restarts? I've found that hard-killing OpenSSH with taskkill is required for the SSH connection to close gracefully on the Packer side, and that increasing start_retry_timeout to account for the reboots seems to work. Stopping the opensshd service, by contrast, causes the Packer provisioner to hang.

dylanmei commented 10 years ago

Oddly, I could repro with VMware but not VirtualBox. Rolling back to 6.4p1-1 fixes the problem for me. I'm not sure what relevant thing we gained by upgrading:

Add in a remove for the firewall rules during uninstall. Fixed firewall rules for WinXP. Updates to OpenSSL 1.0.1g-1 to address the Heartbleed vulnerability

joefitzgerald commented 10 years ago

@dylanmei seriously? We gained a heartbleed vulnerability fix :heart: :hurtrealbad: ...

:wink:

dylanmei commented 10 years ago

My limited understanding is that OpenSSH doesn't use TLS, just the key-generation bits of OpenSSL.

As I see it, the reboots don't matter -- OpenSSH should not be running when win-updates.ps1 kicks in. That's what the -Wait and the Stop-Service -Force are for. I dread diving back into that installer code to see what's changed. :frowning:

joefitzgerald commented 10 years ago

Yeah the issue isn't the ps1... It's the fact that the service auto starts in the installer and there is no way to install it without it starting. While we try to kill it immediately, it represents a race condition.

Fair comment on the OpenSSH / heartbleed thing. Still, who knows what someone will use the installed version for if we leave it in a vulnerable state.

StefanScherer commented 10 years ago

Just an idea. What about closing port 22 in windows firewall (adding a deny rule) before installing OpenSSH? After provisioning (while keeping the service running behind the FW) or after stopping the ssh service this firewall rule could be removed again.
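
A minimal sketch of how that could look, assuming the Windows Firewall is enabled and `netsh advfirewall` is available (Vista / Server 2008 and later). The rule name is a hypothetical label, and where exactly each half would run (Autounattend.xml, openssh.ps1, or the updates script) is left open:

```powershell
# Hypothetical rule name; any unique label without spaces works.
$ruleName = 'BlockSSHDuringSetup'

# Before installing OpenSSH: block inbound TCP 22.
# Block rules take precedence over allow rules, so an allow rule
# added by the installer should not reopen the port.
netsh advfirewall firewall add rule name=$ruleName dir=in action=block protocol=TCP localport=22

# ... install OpenSSH, run Windows updates, reboot as often as needed ...

# After updates are complete: remove the block so Packer can connect.
netsh advfirewall firewall delete rule name=$ruleName
```

With the port closed, it should not matter that the installer briefly starts sshd, since Packer cannot reach it until the rule is deleted.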

joefitzgerald commented 10 years ago

Great idea. I think it would work, too.

stonith commented 10 years ago

I was able to get the Windows updates to install properly with reboots in the provisioner section with the changes here: https://github.com/stonith/packer-windows/commits/master

In my situation I need to run the updates after the Chef run anyhow, as we install features/roles via Chef which then require updates.

dylanmei commented 10 years ago

@stonith I like this idea very much. I hope to try it out very soon.

joefitzgerald commented 10 years ago

Closing this in deference to #96.