Closed stonith closed 9 years ago
Sent an email to the mls-software openssh maintainer to see if he/she has any ideas. I tried the same thing on cygwin sshd and the exit code is passed back properly.
I worked through this problem myself some time ago. The root cause is that the default shell, bash.exe, is not actually bash (the real shell is sh.exe). It is a small (IMO, broken) C program that starts either sh.exe or cmd.exe depending on whether command-line arguments were provided (it starts cmd.exe if there are no args). The intent seems to be to start cmd.exe as the interactive shell and use sh.exe for remote commands, so that when something like scp or sftp makes the connection, it gets sh.exe for compatibility, as those programs are hard-coded to run a remote command with Unix shell semantics.
You can view the source of the C program. I believe it is called switch.c and is in the OpenSSH bin directory. If you look at it, you will find that it uses "system" to spawn any commands specified on the command line and then always returns 0, so the exit status of the remote command is lost. (Another consequence of using "system" is that it adds yet another layer for which you have to escape backslashes.)
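To illustrate the difference, here is a minimal sketch of the pattern described above next to a variant that passes the status through. This is not the actual switch.c source; `run_and_discard` and `run_and_propagate` are hypothetical names, and the sketch assumes a POSIX-style C library (such as Cygwin's) where `system` returns a wait status that has to be decoded with the `<sys/wait.h>` macros.

```c
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical reconstruction of the flawed pattern: the command runs,
 * but its status is thrown away and the caller always sees 0. */
int run_and_discard(const char *command)
{
    int ignored = system(command);
    (void)ignored;
    return 0;
}

/* Same call, but the child's exit status is decoded and returned. */
int run_and_propagate(const char *command)
{
    int status = system(command);

    if (status == -1)
        return 127;                 /* the shell could not be started */
    if (WIFEXITED(status))
        return WEXITSTATUS(status); /* normal exit: pass the code through */
    return 128;                     /* abnormal termination, e.g. a signal */
}
```

With the first function, a command such as `exit 3` is reported as success; with the second, the caller sees 3. The 127 and 128 fallbacks are arbitrary choices made for this sketch.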
I have worked around the problem by installing our own bash.exe wrapper based upon a rewritten switch.c that uses "execv" instead of "system" to spawn the correct shell. If you are willing to always work from within bash, you could probably resolve the issue by copying sh.exe to bash.exe or by modifying etc/passwd to use sh.exe as the shell instead of bash.exe, although I have not tried this.
I am also happy to share the source for our version of switch.c if you would like to use it or customize it for your environment.
Interesting workaround from @paul-palmer, although I think the underlying matter is that Packer was built around SSH, and until we get native WinRM support in it, it's going to be difficult to work around these kinds of issues.
Hopefully in the next year or so we will start seeing improvements in this space.
For some reason I can't recall, I changed my shell to bash. If you look at the comments in scripts/openssh.ps1, /bin/sh is used specifically to get the exit codes.
@paul-palmer I am interested in the corrected switch.c and/or the final wrapper bash.exe. I added some vagrant-serverspec tests to build my Windows boxes as well. For this I had to keep the default bash.exe instead of sh.exe to make this plugin work. But I also found out that the exit code is always 0, so all tests pass. This plugin uses only SSH at the moment, but serverspec itself is able to use WinRM, so in the long term I may dig into that plugin's source and fix it to use WinRM.
But your fixed bash.exe could help me for now.
Here is the version of switch.c we use. We simply compile it and copy it over bash.exe after we install OpenSSH. As long as you strip symbols, it should be quite small: 7K or so. I didn't add any copyright restrictions (it is really just a small amount of glue code). Feel free to reuse or edit it however you wish.
/**
*** Inspired by Mark Bradshaw's (mark@networksimplicity.com) switch program.
*** This shell wrapper attempts to solve the same problems in a simpler way
*** that yields fewer surprises in normal usage and also returns the exit status
*** of the shell invoked. A proper exit status is required so that the packer
*** utility can build out Windows VMs reliably.
**/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h> /* execv, execvp */

const char* get_comspec(void)
{
    // By default, we start the Windows command processor
    const char* comspec = getenv("COMSPEC");

    // no COMSPEC? Build one for the default location
    if (comspec == 0)
    {
        const char* preamble = "COMSPEC=";
        const char* systemroot = getenv("SystemRoot");
        const char* path = "\\system32\\cmd.exe";
        char* comspec_env;
        size_t max_size;

        // No SystemRoot? Assume C:
        if (systemroot == 0)
            systemroot = "C:\\Windows";

        max_size = strlen(preamble) + strlen(systemroot) + strlen(path) + 1/*null*/;
        comspec_env = (char*)malloc(max_size);
        // Out of memory? Fall back to the default location
        if (comspec_env == 0)
            return "C:\\Windows\\system32\\cmd.exe";
        snprintf(comspec_env, max_size, "%s%s%s", preamble, systemroot, path);
        // putenv keeps a pointer to the string, so it must not be freed
        putenv(comspec_env);
        comspec = getenv("COMSPEC");
    }
    return comspec;
}

int main(int argc, char **argv)
{
    // need Bourne shell to support sftp and scp
    if (argc > 1 && argv[1][0] == '-')
    {
        argv[0] = "sh";
        // only returns (with -1) if sh.exe could not be started
        return execvp("sh.exe", argv);
    }

    // By default, we start the Windows command processor
    argv[0] = "cmd";
    return execv(get_comspec(), argv);
}
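One way to check that whatever ends up installed as the login shell really passes the exit status through is a small harness like the one below. It is only a sketch: `exit_status_of` is a hypothetical helper, not part of switch.c, and it assumes a POSIX environment (for example Cygwin) where `fork` and `waitpid` are available.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the given NULL-terminated argument vector and return
 * its exit status, or -1 if it could not be run or exited abnormally. */
int exit_status_of(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;              /* fork failed */
    if (pid == 0)
    {
        execvp(argv[0], argv);  /* only returns on failure */
        _exit(127);
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with an argument vector of `sh`, `-c`, `exit 7` should give 7 for a shell that propagates the status; a wrapper that always returns 0 would give 0 instead.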
@paul-palmer Thanks!
:+1:
Hi,
I'm trying to catch failed provisioner steps, but found that the OpenSSH sshd doesn't seem to return exit codes properly, so Packer never fails.
Windows target:
Linux target:
Anyone else running into the same issue?