hashicorp / packer

Packer is a tool for creating identical machine images for multiple platforms from a single source configuration.
http://www.packer.io
Other
15.11k stars 3.33k forks source link

First powershell script after installing powershell 5 and rebooting doesn't run #5064

Closed matt-richardson closed 7 years ago

matt-richardson commented 7 years ago

I've got a weird situation with packer 1.0.2, which makes it not run a script after powershell 5 is installed and the guest is rebooted. Subsequent files are executed okay.

In the output, it should have Installing Microsoft Windows SDK for Windows 10 10.0.26624, and also should actually install the windows-sdk-10.0 from chocolatey. However, it does not.

The log says there's 0 bytes returned on STDERR and STDOUT.

When I log onto the box, and run c:\windows\temp\script.ps1, it works fine - file has uploaded okay.

If my packer file has the same file listed twice it works:

      scripts": [
        "./scripts/install-win10-sdk.ps1"
        "./scripts/install-win10-sdk.ps1"
      ],

templates, scripts & logs: https://gist.github.com/matt-richardson/bee5b3fb7de5f5c0b78cc92884dd27e4

rickard-von-essen commented 7 years ago

Install without reboot and then use the windows-restart provisioner and then run the second script.

matt-richardson commented 7 years ago

That's what I'm doing? First script install-powershell5.ps1 returns an exit code of 3010, then I run the windows-restart provisioner, then even though the packer.json file says run install-win10-sdk.ps1, it runs install-powershell5.ps1 instead.

matt-richardson commented 7 years ago

@rickard-von-essen, have you been able to reproduce this?

rickard-von-essen commented 7 years ago

Haven't tried, most Windows things I leave to HashiCorp. But I try to help out with triaging a bit.

rickard-von-essen commented 7 years ago

(Fat fingers)

matt-richardson commented 7 years ago

Ahh, my mistake, I assumed you were part of the team.

taliesins commented 7 years ago

@matt-richardson I am assuming the windows-restart makes use of powershell to restart the box.

I think that is probably best to install powershell 5 via autounattend.xml

e.g. https://github.com/taliesins/packer-baseboxes/blob/master/windows/windows-10-amd64/Autounattend.xml#L356

matt-richardson commented 7 years ago

I've changed to use this workaround, but this workaround isn't much better than my workaround of listing the file twice... Still think there's some kind of bug there.

SwampDragons commented 7 years ago

I'll try to take a look and see if I can reproduce soon.

mattzimmerman commented 7 years ago

@matt-richardson I've been seeing some extremely weird behaviors as well, I'm on 1.0.3 but it appears to be a similar manifestation. My Windows AMI build pipeline is setup in an upstream baseline build, with downstream builds that directly ingest those AMIs as the new source_ami. With this setup (which effectively does a restart in the same way that you're seeing), I'll see an early action in a script (though not always the first) simply fail to execute, output zero logging, and continue to the next provisioner step.

Out of desperation/frustration having messed with this for several days with no progress, I hit upon double executing as you did. I guessed that perhaps simply waiting would resolve the issue, and lo and behold a Start-Sleep -s 60 inserted before the failing line of the script magically resolved all my issues.

I suspect it's due to modules not loading properly/quickly. Simple native PS commands that don't require a module call appear to work, but the first call that needs to load a module fails. I first discovered this attempting to Import-Module ServerManager, and then noticed that the first immediate Install-WindowsFeature call would silently exit the script.

matt-richardson commented 7 years ago

Ahh, thats interesting... I'm also now on 1.0.3 and still seeing it.

I've also got a (potentially) related issue where it randomly doesn't run a script. It uploads it, says it executes, but doesn't, and then moves onto the next one. I haven't been able to setup anything where it consistently reproduces so I haven't logged as yet. I'll see if adding a sleep into one of the scripts assists.

Whatever the issue is, it could be exacerbated by the "helpful" powershell feature where it still returns an exit code of 0, even if the script fails.

mattzimmerman commented 7 years ago

Even more interesting, I'm seeing some file provisioners just terminate mid-copy. I was able to repro this behavior late yesterday with an extremely simple test case using my Pester tests directory.

Including this provisioner:

    {
      "type": "file",
      "source": "./pester/",
      "destination": "C:\\temp\\pester"
    },

Terminates before copying a Pester.Utilities.ps1 file in the ./pester directory.

But if I immediately follow that script with an explicit copy of the missing file, like so:

    {
      "type": "file",
      "source": "./pester/Pester.Utilities.ps1",
      "destination": "C:\\temp\\pester\\Pester.Utilities.ps1"
    },

Then it works and the files are copied to the builder instance as I would expect.

matt-richardson commented 7 years ago

If you've managed to put together a simple repro, then definitely worth logging a issue with the details.

jcoconnor commented 7 years ago

Behaviour I'm seeing may be connected. This has started since I upgraded to Packer 1.0.3

I am seeing random instances of powershell scripts not being run after a restart. it appears to run normally on a re-run.

Typical sequence is:

  "provisioners": [
    {
      "type": "powershell",
      "inline": [
        "md -Path C:\\Packer\\puppet\\modules",
        "md -Path C:\\Packer\\Downloads",
        "md -Path C:\\Packer\\Downloads\\Cygwin",
        "md -Path C:\\Packer\\Init",
        "md -Path C:\\Packer\\Logs",
        "md -Path C:\\Packer\\Sysinternals"
      ]
    },
    {
      "type": "powershell",
      "inline": [
        "A:\\install-win-packages.ps1"
      ]
    },
    {
      "type": "windows-restart"
    },
    {
      "type": "powershell",
      "inline": [
        "A:\\install-cygwin.ps1"
      ],
      "environment_vars": [
        "QA_ROOT_PASSWD={{user `qa_root_passwd`}}"
      ]
    },
    {
      "type": "windows-restart"
    },
    {
      "type": "file",
      "source": "../../manifests/windows/",
      "destination": "C:\\Packer\\puppet\\modules"
    },
    {
      "type": "file",
      "source": "files/win-7-x86_64-std.pp",
      "destination": "C:\\Packer\\puppet\\win-7-x86_64-std.pp"
    },
    {
      "type": "powershell",
      "inline": [
        "A:\\puppet-configure.ps1 -PackerTemplateName {{user `template_name`}}-{{user `version`}} -PackerSHA '{{user `packer_sha`}}'"
      ]
    },

Other observation - seem to be collecting a lot of processes when running these scripts, e.g.:

jenkins   3028  2708  0 03:40 ?        00:00:00 /bin/bash /tmp/hudson3634282379285711294.sh
jenkins   3266  3028  0 03:40 ?        00:00:00 /opt/packer/packer build --force x86_64.vmware.vsphere.cygwin.json
jenkins   3276  3266  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer build --force x86_64.vmware.vsphere.cygwin.json
jenkins   3287  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-builder-vmware-vmx
jenkins   3300  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-powershell
jenkins   3316  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-powershell
jenkins   3330  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-windows-restart
jenkins   3344  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-powershell
jenkins   3357  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-windows-restart
jenkins   3372  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-file
jenkins   3386  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-file
jenkins   3400  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-powershell
jenkins   3414  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-windows-restart
jenkins   3428  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-powershell
jenkins   3441  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-windows-restart
jenkins   3456  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-powershell
jenkins   3469  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-windows-restart
jenkins   3482  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-file
jenkins   3496  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-file
jenkins   3511  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-provisioner-powershell
jenkins   3522  3276  0 03:40 ?        00:00:00 /opt/packer_1.0.3_linux_amd64/packer plugin packer-post-processor-vsphere
jenkins   3789     1 80 03:43 ?        00:00:17 /usr/lib/vmware/bin/vmware-vmx -s vmx.noUIBuildNumberCheck=TRUE -# product=1;name=VMware Workstation;version=10.0.6;buildnumber=2700073;licensename=VMware Workstation;licenseversion=10.0+; -@ duplex=3;instance=0;msgs=ui /srv/localvms/output/output-windows-8.1-x8
[johnoconnor@imaging-builder3-prod ~]$
jcoconnor commented 7 years ago

^^^ - This is running Powershell 2 - build here is a Windows-7 build.

jcoconnor commented 7 years ago

@mattzimmerman I've adopted your suggestion but used a 30 second sleep - seems to help overall build stability.

jcoconnor commented 7 years ago

@SwampDragons Have you had a chance to investigate this further. I've used the Start-Sleep command as suggested by @mattzimmerman but its a kind of ugly workaround.

Tx John

SwampDragons commented 7 years ago

I haven't yet, but various powershell bugs are going to be my main focus for the next couple of days. And thanks -- it may be an ugly workaround but it's a useful clue. I'll see if there's some native way for us to know when everything's been loaded properly.

jcoconnor commented 7 years ago

Thanks @SwampDragons Will let you know if I find any better workarounds too. Am holding off any further template scripting for a few days until I get our current images deployed (Patch Tuesday was last week after all) but should be doing further development next week.

matt-richardson commented 7 years ago

@SwampDragons - I'm happy to help test if you want. I'm pretty eager to get the powershell story in Packer more solid.

SwampDragons commented 7 years ago

Thanks! I've had a bear of a time getting a windows 7 vm up and running for testing in between other things, but I think I'm finally on my way. Once I have some changes I'll be sure to link PRs for you to test.

vijayinvites commented 7 years ago

@matt-richardson Add the following restart provisioner after installing WMF 5.0. We had similar issues. After the following change, it worked fine.

{ "type": "windows-restart", "restart_command": "shutdown -r -f -t 10 && sc config winrm start= delayed-auto && net stop winrm", "restart_timeout": "10m" }

matt-richardson commented 7 years ago

Thanks @vijayinvites. It's still a workaround though. This issue is so we can resolve at source, rather than all of us hitting it and having to come up with workarounds.

jcoconnor commented 7 years ago

I should add that I'm not using WMF5 across all of our builds but still seeing the failure. The two WMF5 builds that we do generate on Win-2008r2 and Win-2012r2 have the WMF5 MSU added during the boxstarter phase (i.e. before winrm comes into play).

SwampDragons commented 7 years ago

@matt-richardson How often was the repro case you gave in the original issue post failing on 1.0.2? I just ran it twice and the SDK appears to have installed both times.

matt-richardson commented 7 years ago

My memory is hazy, but looking back through the teamcity logs, it looks like it was failing every time. From what I wrote above, it appears I was able to get it to fail reliably on my mac as well.

Did it output Installing Microsoft Windows SDK for Windows 10 10.0.26624 to stdout?

Don't think it'll be relevant, but when running under teamcity, it was happening on ubuntu/images/ebs-ssd/ubuntu-xenial-16.04-*, which evaluated at that time to ami-37726254.

SwampDragons commented 7 years ago

Okay, I just reproduced by switching datacenters from the Australia one in the original repro case to the one closest to me. The time lag between me and Sydney must have been great enough that I avoided whatever race condition the sleep workaround is brute-force avoiding. And now that I have a repro I can finally get going on trying solutions.

SwampDragons commented 7 years ago

I have a hunch. If the packer config doesn't contain a "remote_path", we upload scripts to the same default location: "remote_path": "c:/Windows/Temp/script.ps1" . I'm thinking there's some cacheing going on that's rerunning the first powershell script instead of the second, even though we've ostensibly replaced "script.ps1" with the new script.

Lucky for us, this is testable via a workaround: Can you add a remote_path to each of your powershell provisioner clauses in the packer config and see if this stabilizes things? E.g. "remote_path": "c:/Windows/Temp/script_SDK.ps1" and "remote_path": "c:/Windows/Temp/script_POWERSHELL5.ps1" . If that resolves the problem, I'm going to change the default to be unique for each provisioner.

It seems to have fixed things for me but my internet connection is slow enough today thanks to eclipse tourism that it could just be that I've got the equivalent of a sleep. I'll start in on coding up the unique upload name while y'all test the config workaround.

jcoconnor commented 7 years ago

That's interesting @SwampDragons. I was thinking of doing this anyway as it allows better visibility of the script name running.

May get a chance to run the builds over the weekend.

jcoconnor commented 7 years ago

That workaround is in place - initial results look promising. This is probably completely unrelated, but randomly at the restart, a lot of XML is placed in the log file. Its mildly irritating but not a huge issue:

e.g.

13:24:32     windows-2008r2-wmf5-x86_64-vmware-vsphere.cygwin:
13:24:32     windows-2008r2-wmf5-x86_64-vmware-vsphere.cygwin: Syspep fix for WMF5/Windows 2008R2
13:24:32     windows-2008r2-wmf5-x86_64-vmware-vsphere.cygwin: The operation completed successfully.
13:24:32     windows-2008r2-wmf5-x86_64-vmware-vsphere.cygwin: <Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04"><Obj S="progress" RefId="0"><TN RefId="0"><T>System.Management.Automation.PSCustomObject</T><T>System.Object</T></TN><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj><Obj S="information" RefId="1"><TN RefId="1"><T>System.Management.Automation.InformationRecord</T><T>System.Object</T></TN><ToString>Other Stuff.......</ToString><Props><Obj N="MessageData" RefId="2"><TN RefId="2"><T>System.Management.Automation.HostInformationMessage</T><T>System.Object</T></TN><ToString>Other Stuff.......</ToString><Props><S N="Message">Other Stuff.......</S><B N="NoNewLine">false</B><S N="ForegroundColor">Gray</S><S N="BackgroundColor">Black</S></Props></Obj><S N="Source">A:\config-winsettings.ps1</S><DT N="TimeGenerated">2017-08-18T20:24:29.6875+00:00</DT><Obj N="Tags" RefId="3"><TN RefId="3"><T>System.Collections.Generic.List`1[[System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]</T><T>System.Object</T></TN><LST><S>PSHOST</S></LST></Obj><S N="User">WINPUPP-8D323HI\Administrator</S><S N="Computer">WINPUPP-8D323HI</S><U32 N="ProcessId">2704</U32><U32 N="NativeThreadId">2752</U32><U32 N="ManagedThreadId">7</U32></Props></Obj><Obj S="information" RefId="4"><TNRef RefId="1" /><ToString>Enable Bootlog</ToString><Props><Obj N="MessageData" RefId="5"><TNRef RefId="2" /><ToString>Enable Bootlog</ToString><Props><S N="Message">Enable Bootlog</S><B N="NoNewLine">false</B><S N="ForegroundColor">Gray</S><S N="BackgroundColor">Black</S></Props></Obj><S N="Source">A:\config-winsettings.ps1</S><DT N="TimeGenerated">2017-08-18T20:24:29.734375+00:00</DT><Obj N="Tags" RefId="6"><TNRef RefId="3" /><LST><S>PSHOST</S></LST></Obj><S N="User">WINPUPP-8D323HI\Administrator</S><S N="Computer">WINPUPP-8D323HI</S><U32 N="ProcessId">2704</U32><U32 N="NativeThreadId">2752</U32><U32 N="ManagedThreadId">7</U32></Props></Obj><Obj S="information" RefId="7"><TNRef RefId="1" /><ToString>Setting IE Home Page</ToString><Props><Obj N="MessageData" RefId="8"><TNRef RefId="2" /><ToString>Setting IE Home Page</ToString><Props><S N="Message">Setting IE Home Page</S><B N="NoNewLine">false</B><S N="ForegroundColor">Gray</S><S N="BackgroundColor">Black</S></Props></Obj><S N="Source">A:\config-winsettings.ps1</S><DT N="TimeGenerated">2017-08-18T20:24:30.046875+00:00</DT><Obj N="Tags" RefId="9"><TNRef RefId="3" /><LST><S>PSHOST</S></LST></Obj><S N="User">WINPUPP-8D323HI\Administrator</S><S N="Computer">WINPUPP-8D323HI</S><U32 N="ProcessId">2704</U32><U32 N="NativeThreadId">2752</U32><U32 N="ManagedThreadId">7</U32></Props></Obj><Obj S="information" RefId="10"><TNRef RefId="1" /><ToString>Setting Default User registry entry: Software\Microsoft\Internet Explorer\Main\Start Page</ToString><Props><Obj N="MessageData" RefId="11"><TNRef RefId="2" /><ToString>Setting Default User registry entry: Software\Microsoft\Internet Explorer\Main\Start Page</ToString><Props><S N="Message">Setting Default User registry entry: Software\Microsoft\Internet E
matt-richardson commented 7 years ago

Not looking convincing from my side:

2017/08/21 15:30:05 ui: ==> amazon-ebs: Provisioning with shell script: ./scripts/install-win10-sdk.ps1
2017/08/21 15:30:05 packer: 2017/08/21 15:30:05 Opening ./scripts/install-win10-sdk.ps1 for reading
2017/08/21 15:30:05 packer: 2017/08/21 15:30:05 Building command line for: if (Test-Path variable:global:ProgressPreference){$ProgressPreference='SilentlyContinue'};$env:PACKER_BUILDER_TYPE="amazon-ebs"; $env:PACKER_BUILD_NAME="amazon-ebs"; $env:admin_password="Secr3tP@ssword"; $env:admin_user="test-user"; &'c:/Windows/Temp/script_SDK.ps1';exit $LastExitCode
2017/08/21 15:30:05 packer: 2017/08/21 15:30:05 Uploading file to 'c:/Windows/Temp/script_SDK.ps1'
2017/08/21 15:30:05 [INFO] 281 bytes written for 'uploadData'
2017/08/21 15:30:05 packer: 2017/08/21 15:30:05 [INFO] 281 bytes written for 'uploadData'
2017/08/21 15:30:08 packer: #< CLIXML
2017/08/21 15:30:10 packer: <Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04"><Obj S="progress" RefId="0"><TN RefId="0"><T>System.Management.Automation.PSCustomObject</T><T>System.Object</T></TN><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj></Objs>2017/08/21 15:30:10 [INFO] starting remote command: powershell -executionpolicy bypass -encodedCommand aQBmACAAKABUAGUAcwB0AC0AUABhAHQAaAAgAHYAYQByAGkAYQBiAGwAZQA6AGcAbABvAGIAYQBsADoAUAByAG8AZwByAGUAcwBzAFAAcgBlAGYAZQByAGUAbgBjAGUAKQB7ACQAUAByAG8AZwByAGUAcwBzAFAAcgBlAGYAZQByAGUAbgBjAGUAPQAnAFMAaQBsAGUAbgB0AGwAeQBDAG8AbgB0AGkAbgB1AGUAJwB9ADsAJABlAG4AdgA6AFAAQQBDAEsARQBSAF8AQgBVAEkATABEAEUAUgBfAFQAWQBQAEUAPQAiAGEAbQBhAHoAbwBuAC0AZQBiAHMAIgA7ACAAJABlAG4AdgA6AFAAQQBDAEsARQBSAF8AQgBVAEkATABEAF8ATgBBAE0ARQA9ACIAYQBtAGEAegBvAG4ALQBlAGIAcwAiADsAIAAkAGUAbgB2ADoAYQBkAG0AaQBuAF8AcABhAHMAcwB3AG8AcgBkAD0AIgBQAGEAcwBzAHcAbwByAGQAMAAxACEAIgA7ACAAJABlAG4AdgA6AGEAZABtAGkAbgBfAHUAcwBlAHIAPQAiAHQAZQBzAHQALQB1AHMAZQByACIAOwAgACYAJwBjADoALwBXAGkAbgBkAG8AdwBzAC8AVABlAG0AcAAvAHMAYwByAGkAcAB0AF8AUwBEAEsALgBwAHMAMQAnADsAZQB4AGkAdAAgACQATABhAHMAdABFAHgAaQB0AEMAbwBkAGUA
2017/08/21 15:30:12 packer: 2017/08/21 15:30:12 [INFO] command 'powershell -executionpolicy bypass -encodedCommand aQBmACAAKABUAGUAcwB0AC0AUABhAHQAaAAgAHYAYQByAGkAYQBiAGwAZQA6AGcAbABvAGIAYQBsADoAUAByAG8AZwByAGUAcwBzAFAAcgBlAGYAZQByAGUAbgBjAGUAKQB7ACQAUAByAG8AZwByAGUAcwBzAFAAcgBlAGYAZQByAGUAbgBjAGUAPQAnAFMAaQBsAGUAbgB0AGwAeQBDAG8AbgB0AGkAbgB1AGUAJwB9ADsAJABlAG4AdgA6AFAAQQBDAEsARQBSAF8AQgBVAEkATABEAEUAUgBfAFQAWQBQAEUAPQAiAGEAbQBhAHoAbwBuAC0AZQBiAHMAIgA7ACAAJABlAG4AdgA6AFAAQQBDAEsARQBSAF8AQgBVAEkATABEAF8ATgBBAE0ARQA9ACIAYQBtAGEAegBvAG4ALQBlAGIAcwAiADsAIAAkAGUAbgB2ADoAYQBkAG0AaQBuAF8AcABhAHMAcwB3AG8AcgBkAD0AIgBQAGEAcwBzAHcAbwByAGQAMAAxACEAIgA7ACAAJABlAG4AdgA6AGEAZABtAGkAbgBfAHUAcwBlAHIAPQAiAHQAZQBzAHQALQB1AHMAZQByACIAOwAgACYAJwBjADoALwBXAGkAbgBkAG8AdwBzAC8AVABlAG0AcAAvAHMAYwByAGkAcAB0AF8AUwBEAEsALgBwAHMAMQAnADsAZQB4AGkAdAAgACQATABhAHMAdABFAHgAaQB0AEMAbwBkAGUA' exited with code: 0
2017/08/21 15:30:12 packer: 2017/08/21 15:30:12 [INFO] RPC endpoint: Communicator ended with: 0
2017/08/21 15:30:12 [INFO] 0 bytes written for 'stdout'
2017/08/21 15:30:12 [INFO] 0 bytes written for 'stderr'
2017/08/21 15:30:12 [INFO] RPC client: Communicator ended with: 0
2017/08/21 15:30:12 [INFO] RPC endpoint: Communicator ended with: 0
2017/08/21 15:30:12 packer: 2017/08/21 15:30:12 [INFO] 0 bytes written for 'stderr'
2017/08/21 15:30:12 packer: 2017/08/21 15:30:12 [INFO] 0 bytes written for 'stdout'
2017/08/21 15:30:12 packer: 2017/08/21 15:30:12 [INFO] RPC client: Communicator ended with: 0
2017/08/21 15:30:12 [INFO] (telemetry) ending powershell

Took 12 seconds from start to finish, and didn't write any output :(

jcoconnor commented 7 years ago

I ran our builds twice (16 image builds x 2) and all appear to have built successfully with the suggested workaround in place.

One observation - I was hoping that if I specified the path, it would be used for the temp file on the controlling system, or some indication that the specified path name was being run on the target system. E.g.

00:05:17 ==> windows-10-i386-vmware-vsphere.cygwin: Machine successfully restarted, moving on
00:05:17 ==> windows-10-i386-vmware-vsphere.cygwin: Provisioning with Powershell...
00:05:17 ==> windows-10-i386-vmware-vsphere.cygwin: Provisioning with shell script: /tmp/packer-powershell-provisioner412073830

Would be very useful if the target script name was actually referenced here:

    {
      "type": "powershell",
      "remote_path": "c:/Windows/Temp/run-install-win-packages.ps1",
      "inline": [
        "A:\\install-win-packages.ps1"
      ]
    },
SwampDragons commented 7 years ago

@jcoconnor That's a good bug/feature request for another issue (since this one seems to be uncovering several underlying problems, I'd like to separate them so we make sure they don't fall through the cracks); at a glance it appears that your observation is specific to inline scripts whereas the file provisioner does extract them to the desired path.

SwampDragons commented 7 years ago

@matt-richardson this is crazymaking. The naming thing was working for me fairly consistently and then suddenly started failing with an error that implies powershell doesn't recognize the file as a script, or can't find it:

amazon-ebs: <Objs Version="1.1.0.1" xmlns="http://schemas.microsoft.com/powershell/2004/04"><Obj S="progress" RefId="0"><TN RefId="0"><T>System.Management.Automation.PSCustomObject</T><T>System.Object</T></TN><MS><I64 N="SourceId">1</I64><PR N="Record"><AV>Preparing modules for first use.</AV><AI>0</AI><Nil /><PI>-1</PI><PC>-1</PC><T>Completed</T><SR>-1</SR><SD> </SD></PR></MS></Obj><S S="Error">&amp; : The term 'c:/Windows/Temp/script_SDK.ps1' is not recognized as the name of _x000D__x000A_</S><S S="Error">a cmdlet, function, script file, or operable program. Check the spelling of _x000D__x000A_</S><S S="Error">the name, or if a path was included, verify that the path is correct and try _x000D__x000A_</S><S S="Error">again._x000D__x000A_</S><S S="Error">At line:1 char:232_x000D__x000A_</S><S S="Error">+ ... env:admin_user="test-user"; &amp;'c:/Windows/Temp/script_SDK.ps1';exit $L ..._x000D__x000A_</S><S S="Error">+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~_x000D__x000A_</S><S S="Error"> + CategoryInfo : ObjectNotFound: (c:/Windows/Temp/script_SDK.ps1: _x000D__x000A_</S><S S="Error"> String) [], CommandNotFoundException_x000D__x000A_</S><S S="Error"> + FullyQualifiedErrorId : CommandNotFoundException_x000D__x000A_</S><S S="Error"> _x000D__x000A_</S></Objs>

Even after letting this percolate all weekend I'm still coming up short on ideas. I'm going to run a few more builds, drop in some notes, and pick up a new ticket while I chew over this some more.

SwampDragons commented 7 years ago

I'm wondering if the upload function is sometimes returning before the file is truly available. Not sure how to test for this other than adding a sleep; not sure how to avoid it a better way than a sleep. https://github.com/hashicorp/packer/blob/1de4be048321fa74b5f57b52b01cd267b8f7b6a5/provisioner/powershell/provisioner.go#L271

SwampDragons commented 7 years ago

I'm pretty sure the script name issue is legit, and that there's another issue which is cropping up behind it . Script name change is in PR 5272 above. Working on the windows CommandNotFoundException now.

matt-richardson commented 7 years ago

@SwampDragons

I'm wondering if the upload function is sometimes returning before the file is truly available. Not sure how to test for this other than adding a sleep; not sure how to avoid it a better way than a sleep.

I've seen a related issue where it sometimes uploads a 0 byte file (I think i mentioned it above - unfortunately couldn't get a solid repro so never logged an issue). I had a thought that it might be worth using a checksum or something to validate that the file uploaded correctly?

SwampDragons commented 7 years ago

I like the checksum idea.

jcoconnor commented 7 years ago

@SwampDragons Tx makes sense - #5275 logged

SwampDragons commented 7 years ago

Kay I've found the root cause; the windows restart provisioner checks that it can run a command via winrm but doesn't make sure that command actually works; we're still loading the powershell modules when the windows restart provisoner returns. I'm working the code to parse stderr to make sure we aren't getting error responses. I'll hopefully have a PR later today.

SwampDragons commented 7 years ago

Okay, I've finally got a fix. It's pushed into #5272 above.

jcoconnor commented 7 years ago

@SwampDragons Thanks for the update and good work. Is there a build available to test this with as would be useful to run it through our build process?

SwampDragons commented 7 years ago

Try packer.tar.gz

matt-richardson commented 7 years ago

@SwampDragons - I've tried it out, and found that it now convincingly runs the scripts (ie, the echo "##teamcity... line works), but on the first and third run, it bizarrely didn't actually complete the script - it exited with exit code 0 part way through. Logging onto the box, it didn't have chocolatey.exe running, nor did the chocolatey logs have anything in them. Very odd. The second run succeeded as expected though. I'll investigate further tomorrow.

SwampDragons commented 7 years ago

That's strange... did you not see this behavior before when you were using your double-run workaround?

matt-richardson commented 7 years ago

No - haven't seen this before, i think. Actually, given that we've had a lot of issues around powershell, I may have, but I probably assumed it was the "not all stdout being returned" issue instead.

Side note, ❤️ the work you're doing here - you're solving my last few pain points with packer - thanks!

SwampDragons commented 7 years ago

preens thanks :) . I think the way I wanna move forward here is to open a separate issue for the chocolatey crash, since I probably won't have a chance to get to it before 1.1.0 is released and the PR that addresses the initial problem in this issue has been merged. Once I've finished testing and merging all the PRs we have slated for 1.1.0 I'll see if I can sink more time into the remaining weirdness, but I'm going to bet that won't go live until at least 1.1.1

matt-richardson commented 7 years ago

I'm cool with that approach - I'll monitor/try and investigate further and log something if it appears to be causing pain. I think that 1.1.0 should solve the majority of my ami build failures I'm facing, so will be happy to see that released.

jcoconnor commented 7 years ago

Hi @SwampDragons Apologies I never got to run the test build in our env - was hit with a bunch of other tasks in the meantime.

I see #5272 has made it into 1.1.0 - will download and test this in the next few days.

Tx for the work on it.

John

ghost commented 4 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.