audiolize / vagrant-softlayer

This is a Vagrant plugin that adds a SoftLayer provider to Vagrant, allowing Vagrant to control and provision SoftLayer CCI instances.
MIT License

Configuration of SL_WIN_LATEST_64 is a mystery #24

Closed lonniev closed 10 years ago

lonniev commented 10 years ago

The description of the Windows boxes at https://vagrantcloud.com/ju2wheels does not provide enough clues to deduce how to talk with the Windows VM once SL has instantiated it.

I would be helped if the description of the box at vagrantcloud stated what core services and authentication means are offered in the box, particularly if the box or the provider don't offer password-less sudo vagrant/vagrant and the insecure vagrant key pair tactics.

Can you give me a bit more newbie advice on how to vagrant up the Windows boxes at ju2wheels with the softlayer provider?

lonniev commented 10 years ago

The password reveal is available on the Device Lists page: it is hidden in the dropdown that is revealed by clicking the disabled-looking |> to the left of the device's name. Very counterintuitive.

It still will help others if the description on the boxes at vagrantcloud states which administration interfaces are baked into each box.

lonniev commented 10 years ago

The ju2wheels/SLWIN* boxes need a wee bit more prep work:

Currently, I can use winrm on the host (MacOS) side to vagrant up and provision the SL Windows instance and I can vagrant ssh into the vagrant account. What I don't have working is rsync from the host to the guest of the crucial ./ directories into /vagrant -- and as I write that I wonder where that unixy /vagrant directory will be created or should exist on the Windows guest.

This is the error of the moment:

Rsyncing folder: /Users/lonniev/Vagrants/softlayer-windows-jazzclm/ => /vagrant                                    
/Users/lonniev/.vagrant.d/gems/gems/vagrant-softlayer-0.3.2/lib/vagrant-softlayer/action/sync_folders.rb:89:in `block in call': uninitialized constant VagrantPlugins::SoftLayer::Errors::RsyncError (NameError)

(If I am reinventing wheels by trying to automate this, if there is an SL box that already has this all pre-provisioned, just point that out and I'll gladly give up this homework.)

lonniev commented 10 years ago

The cwRsync app on Windows uses the cygwin.dll for unixy I/O in a DOS world, so it is going to want paths like "/cygdrive/c/..." to access outside the home directory. The following line in the Vagrantfile resolves the above RsyncError and allows chef-solo to get all the cookbook and environment/role files it wants.

  config.vm.synced_folder ".", "/cygdrive/c/vagrant", type: "rsync", rsync__exclude: [ "Vagrantfile", ".git/" ]

ju2wheels commented 10 years ago

A few things on the boxes:

If you need some examples, let's reference existing veewee and vagrant-windows work to cannibalize into standalone PowerShell scripts that work across all the versions. I'll start looking into this as well.

lonniev commented 10 years ago

I agree that having to mix in unix apps (rsync, ssh) to a Windows box smells unnatural. I like your “post_install” suggestion—but I haven’t used or learned it yet. I’ll read up on what post_install might offer.

What about a collection of boxes that have the vagrant/vagrant account—with a “remove vagrant” recipe? That would make these Windows deployments more admin-friendly for those of us who just wish Windows would go away?

If I tweak up a generic SL box with cwRsync, sshd, a vagrant account, and the vagrant.pub in %USER%.ssh, what’s the best brief gist on how to snapshot that modified box into a new box on vagrantcloud?

On Sun, Aug 10, 2014 at 2:17 PM, Julio Lajara notifications@github.com wrote:

A few things on the boxes:

  • I'll update the default admin users to the stock OS templates, but if we do sshd/winrm I wouldn't add them as part of the stock OS template boxes; I would do them as post_install PowerShell scripts as part of contrib for each service, as doing both services seems redundant and customers will probably choose only one or the other.
  • I feel very reluctant to add the vagrant account and insecure key to cloud boxes by default, for security reasons. In my environment we block access to boxes with a firewall so they can't be reached until rules are explicitly opened, but if a customer's environment isn't set up this way, then doing this leaves a security gap, regardless of how short-lived, if it's publicly accessible. I would rather leave this up to the end user to do if that's what they want and leave the boxes with the stock SoftLayer admin users (root/Administrator).
  • Finding a secure way to add the ability to get into the Windows boxes with an ssh key or winrm would, however, be something worth generalizing so they can be used like the Linux boxes, but I'm not sure what the best option is there.
  • Not sure whether I will add that rsync as part of the box yet or just document it; I need to use it in action first to figure out whether that's use-case specific or not.

lonniev commented 10 years ago

My thought about the rsync failure was malarkey. The real issue was that rsync's ssh session wasn't authenticating. The line above actually just creates a directory "c:\cygdrive\c\vagrant" with a copy of the local current working directory within it. It is totally useless because the later chef specs for environments, roles, and cookbooks do the proper rsync into c:\users\vagrant.

cwRsync doesn't need "/cygdrive/c" as the root of a path to "C:\".

lonniev commented 10 years ago

Is it correct that the SL provider calls the post_install script (that can be pulled in from external URI) after installation of the SL image and before the provisioning phase (e.g. chef-solo) is run?

If so, I need to craft a script that adds the vagrant user, obtains and installs the rsync and sshd apps, copies vagrant's keys, and configures winrm, rsync, and sshd. That doesn't seem to be too impractical.

lonniev commented 10 years ago

Except for pulling in rsync and sshd, this is the gist of the sought script:

# Get a reference to the local machine's account database.
$computer = [ADSI]"WinNT://."

# Create the vagrant user with the password "vagrant".
$user = $computer.Create("User", "vagrant")
$user.SetPassword("vagrant")
$user.Put("FullName", "Vagrant User")
$user.SetInfo()

# Set ADS_UF_DONT_EXPIRE_PASSWD (0x10000) so the password never expires.
$user.UserFlags[0] = $user.UserFlags[0] -bor 0x10000
$user.SetInfo()

# Add the new user to the local Administrators group.
net localgroup Administrators /add "vagrant"

# Configure WinRM; -q suppresses the interactive confirmation prompt.
winrm quickconfig -q

# Allow long-running commands, and let a remote client (such as Vagrant)
# reach this machine's WinRM service with basic auth over plain HTTP.
# These are the Service-side settings; the Client-side ones only matter
# when this machine connects out to another host.
Set-Item WSMan:\localhost\MaxTimeoutms -Value "1800000"
Set-Item WSMan:\localhost\Service\AllowUnencrypted -Value $true
Set-Item WSMan:\localhost\Service\Auth\Basic -Value $true

Set-Service WinRM -StartupType "Automatic"
Start-Service WinRM

lonniev commented 10 years ago

You can see the entire script at https://github.com/lonniev/softlayer-windows-jazzclm/tree/master/post_install/windows

Apparently, post_install can handle either .bat or .ps1 files but can only run the powershell ones if the execution policy is set to unrestricted (or the admin goes through the trouble of signing the scripts with a cert). Therefore, to run the powershell script, I have to wrap it in a batch script that says, no, really, just run the powershell file.

I'm still debugging the powershell syntax and the sequencing of the script but it's a nearly there solution.

ju2wheels commented 10 years ago

Is that working without issues for you using WinRM against the vagrant account? I'm curious whether you have run into any of the issues described here, as I hit one of the errors shown here but haven't yet determined whether it's due to my environment's use of GPO.

lonniev commented 10 years ago

@ju2wheels is there a post_install protocol expectation that the script, if it exists, should conclude by requesting an OS restart? How does the SL API otherwise see the state change from not ready to running?

lonniev commented 10 years ago

When I set up WinRM on the instance by RDPing to the VM using the Administrator account and the randomized password, I can then return to the host and run "vagrant provision" using a winrm communicator. That works.
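
For anyone following along, the host side of that "vagrant provision" step can be sketched as a Vagrantfile fragment. This is only a minimal sketch: it assumes a Vagrant version with the built-in winrm communicator, and the SL_WIN_ADMIN_PASSWORD environment variable is a hypothetical name for wherever you keep the randomized Administrator password from the portal.

```ruby
# Vagrantfile fragment (a sketch, not the thread author's actual file).
Vagrant.configure("2") do |config|
  config.vm.box   = "ju2wheels/SL_WIN_LATEST_64"
  config.vm.guest = :windows

  # Talk to the guest over WinRM instead of ssh.
  config.vm.communicator = "winrm"
  config.winrm.username  = "Administrator"

  # Hypothetical variable name: export the portal's randomized password
  # under this name before running "vagrant provision".
  config.winrm.password = ENV.fetch("SL_WIN_ADMIN_PASSWORD")
end
```

Reading the password from the environment keeps it out of the Vagrantfile, which matters if the file is committed alongside the cookbooks.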

It is, however, not a proper solution because the goal is to run "vagrant up --provider softlayer" once and yield an up and running, fully provisioned server.

If I can get through the trial and error of getting the powershell post_install to work, I'll have the VM correctly evolved to have both ssh and winrm communications and a vagrant/vagrant password-free administrator account.

Then, I can return to the point of the exercise: provisioning the desired server apps for the Windows server.

I'm nearly there. The only delay is that I keep typing chaos-monkey typos into my scripts and then get to wait 20 minutes for the post_install(er) to time out. (How about making that timeout(1200) parameterized?)

lonniev commented 10 years ago

I'm going to conclude my post_install with an explicit "shutdown /r" command. I can't find documentation that says this is required but it sure seems to be what the host-side SL provider code is expecting. It won't hurt to try it in one of these 20-minute cycles. ;-)

ju2wheels commented 10 years ago

The timeout is parameterized in one of the recent releases, either dev or the pending pull request, but not in the current stable, I think.

ju2wheels commented 10 years ago

The generic script will have to install the Windows Management Framework update on pre-Win2008 R2 systems in order to avoid creating automation for, and dealing with the fallout from, the WinRM support matrix nightmare.

After looking at it some more, I'm also not going to split it out as originally thought, as it does make more sense to provide rsync functionality with WinRM even if the customer doesn't intend to use WinRM. Instead of having a standalone rsync agent and cygwin/sshd, we will just drop in a minimal sshd.

lonniev commented 10 years ago

Getting a Windows sshd onto the image during the post_install phase is a real chore: most of the sshd apps are lousy and either their download sites or their installers demand interactive entry. Bitvise WinSshd doesn’t require interactivity but one has to feed it a “here” file to get it to sync authorized_keys and to allow password-less logins. It's messy. MobaSSHD was promising because it bundles wget, chown, chmod, and rsync along with sshd. However, it requires a GUI installer. I futzed with WASP Select-Window | Select-Control | Send-Click but couldn’t get the timing right.

Also a pain is getting Windows to use an existing directory as the user’s home and profile path (two separate folders that are typically unioned). If you try to create a homedir with an existing .ssh within it, Windows puts the profile folder in ~user.DOMAIN.

If the box image is manually prepped with an sshd, rsync, and the vagrant user (all installed and created with Windows GUI apps), then the resulting vagrant box is much, much easier to use.

Let me know in my morning (about 6 hours from now) if you spin a new ju2wheels/SL_WIN_LATEST_64 with vagrant, sshd, and rsync in it.

Thanks.

On Tue, Aug 12, 2014 at 9:41 PM, Julio Lajara notifications@github.com wrote:

The generic script will have to install the Windows Management Framework update http://support.microsoft.com/kb/968930 on pre-Win2012 systems in order to avoid creating automation for, and dealing with the fallout from, the WinRM support matrix nightmare http://technet.microsoft.com/en-us/library/ff520073(WS.10).aspx.

After looking at it some more, I'm also not going to split it out as originally thought, as it does make more sense to provide rsync functionality with WinRM even if the customer doesn't intend to use WinRM. Instead of having a standalone rsync agent and cygwin/sshd, we will just drop in a minimal sshd.

lonniev commented 10 years ago

Just found a trick to force the OS to create the user directory along with the profile path without having to exit the post_install and have that user login.

That is: http://timrayburn.net/blog/start-a-process-as-another-user-in-powershell/

One can “net user vagrant vagrant /add” and then use the above tactic to create the user directory. Afterwards, the .ssh directory can be created and populated with the .ssh/authorized_keys/vagrant public key.

What a pain. ;-)

ju2wheels commented 10 years ago

The api_timeout is related to the calls made directly to the API; the one you are probably more interested in raising is provision_timeout (default 20 minutes), which has already been committed to the develop branch. My provision script didn't seem to need a reboot, but yes, it's definitely taking on the order of 45 min to 1 hr to build a Windows machine (the portal reports an estimated time to complete of 71 min for Win 2012 STD w/4 GB RAM for me), so you will have to increase that timeout.
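
As a rough illustration, raising that timeout in a Vagrantfile might look like the fragment below. It is a sketch under assumptions: it requires a plugin build that includes provision_timeout (per this thread, the develop branch at the time), the api_key and username option names follow the plugin README as I understand it and should be checked against your installed version, and the SL_USERNAME / SL_API_KEY variable names and the 5400-second value are illustrative choices only.

```ruby
# Vagrantfile fragment (a sketch; verify option names against your plugin version).
Vagrant.configure("2") do |config|
  config.vm.box = "ju2wheels/SL_WIN_LATEST_64"

  config.vm.provider :softlayer do |sl|
    # Hypothetical environment variable names for the SoftLayer credentials.
    sl.username = ENV.fetch("SL_USERNAME")
    sl.api_key  = ENV.fetch("SL_API_KEY")

    # Windows builds were reported above at 45 to 71 minutes, so allow
    # 90 minutes (in seconds) instead of the 20-minute default.
    sl.provision_timeout = 90 * 60
  end
end
```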

lonniev commented 10 years ago

ok. yes, it would be the provision_timeout.

I wonder if we could modify that wait loop to spit out any intermediate state changes that are exposed through the SL api? 60+ minutes is a long time to stare at a shell script wondering if it is hung or is just waiting because it hasn’t remarked since it last said, “this might take a Few minutes”. ;-)

ju2wheels commented 10 years ago

It will currently output every 10 seconds that it's not done yet if logging is set to debug, but not the actual state steps shown on the portal.

lonniev commented 10 years ago

Starvation or forced feeding. ;-) Not only would all the unnecessary logging appear, but I would get some 360+ updates while waiting. I would say that once every "few" minutes a message like "still waiting for the Running state, state is currently Foo. Having waited xx minutes, I'll give up in 20-xx minutes from now" would be soothing.
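
The kind of throttled wait loop described here could be sketched in Ruby roughly as follows. This is not the plugin's actual code: wait_for_state and all of its parameters are hypothetical names, and the state probe, clock, and sleep are passed in as lambdas so the loop can be tried without a real SoftLayer API call.

```ruby
# Poll until the instance reports the wanted state. Poll often, but print a
# progress line at most once per `report_every` seconds.
# Returns true on success, false once `timeout` seconds have elapsed.
def wait_for_state(wanted:, timeout:, probe:, poll_every: 10, report_every: 180,
                   clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                   sleeper: ->(seconds) { sleep(seconds) }, out: $stdout)
  started = clock.call
  last_report = started

  loop do
    state = probe.call
    return true if state == wanted

    now = clock.call
    elapsed = now - started
    return false if elapsed >= timeout

    if now - last_report >= report_every
      out.puts format("Still waiting for %s; state is currently %s. " \
                      "Waited %d min, giving up in %d min.",
                      wanted, state, elapsed / 60, (timeout - elapsed) / 60)
      last_report = now
    end

    sleeper.call(poll_every)
  end
end
```

With a 10-second poll and a 180-second report interval, a 60-minute build would produce about 20 lines rather than 360.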

lonniev commented 10 years ago

Trying to offer ssh as a communicator for vagrant provisioning is a goose chase: after getting an sshd server in place, then an rsync command, it then wants a bash shell. That bash shell has to return $?==0 for printf $SSH_AUTH_SOCK, and so it goes. It is presuming that ssh present implies a proper unix environment.

I will leave my post_install with enough of ssh present that one can vagrant ssh into the cmd shell. That is useful by itself.

I then fixed a (nother) typo in the script on the winrm configuration. I now have winrm working as a communicator.

So I have reached the goal of taking a provided ju2wheels box for SL and Windows and post installing enough ssh and winrm configuration for doing business provisioning of the VM with chef-solo. Whew.

ju2wheels commented 10 years ago

@lonniev have you experienced an abnormally high rate of post-provision hook script failures (it fails to download the post_install script) at random, like every 2 to 3 provisions, and then it works fine on rebuild?

lonniev commented 10 years ago

The last few days have been rough like that. However, I am trying to debug why powershell works in one environment and then not when called remotely and I’m off in the jungle of weird cmdlets, policies, and registry hacks. Along the way, I keep making unix-not-windows typos. So, I blame most of the strange behavior on my ignorance.

I use “iwr” (Invoke-WebRequest), PowerShell's wget-like cmdlet, to pull web resources into the box. Yes, several times yesterday the iwrs would time out and then suddenly work on a retry.
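
Since a plain retry was enough to get past those timeouts, the general pattern is retry with exponential backoff. The sketch below shows it in Ruby purely as an illustration of the idea; the guest-side scripts in this thread are PowerShell and batch, and with_retries and its parameters are hypothetical names, not part of any script or plugin mentioned here.

```ruby
# Run the block, retrying on any StandardError with exponential backoff.
# Waits base_delay, then 2x, then 4x ... seconds between attempts and
# re-raises the last error once `attempts` tries have been used up.
def with_retries(attempts: 4, base_delay: 2, sleeper: ->(seconds) { sleep(seconds) })
  tries = 0
  begin
    tries += 1
    yield tries
  rescue StandardError
    raise if tries >= attempts

    sleeper.call(base_delay * (2**(tries - 1)))
    retry
  end
end

# Example use (commented out so the sketch makes no network calls):
#   require "net/http"
#   body = with_retries { Net::HTTP.get(URI("https://example.com/some_script.ps1")) }
```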

ju2wheels commented 10 years ago

I've come up with a way to get us passwordless ssh but these long build times are slowing dev ;-( . Aiming for sometime next week to have generalized scripts, at least for WinRM, that work across all the Win versions.

FYI use of iwr is not portable. Check out the change in this gist: https://gist.github.com/ju2wheels/d4d4a767c535977b231c

The current plan (#29):

  1. Provide generalized scripts and instructions on creating a custom post_install to select the components wanted for the following services:
  2. Windows Management Framework Normalization (brings older Win variants up to WinRM 2.0/PowerShell 2.0; will be required for WinRM enablement to simplify automation due to the number of versions)
  3. WinRM 2.0 w/HTTP (optional flags for AD cert-based HTTPS and self-signed HTTPS; I have scripts and an idea but am not sure yet if self-signed will work in the end)
  4. Cygwin (setup of Cygwin with cyg-apt for post-build package management, plus optional flags for Cygwin Ports enablement and added package enablement)
  5. vagrant-softlayer will be enhanced with an option to append selected SSH keys to the API user_data, and a post-provision script will take this and configure Cygwin ssh for the Admin user only.
  6. Provide the scripts for creating the vagrant user for a standard vagrant box but do not include them in the default post_install scripts; users will have to create their own and pull them in themselves and assume responsibility for shooting themselves in the foot security-wise.
  7. The above creates a "pluggable" framework for post_install based on your bat script.
  8. It allows for the addition of alternative process scripts to be pluggable as well (i.e. pulling scripts from vagrant-softlayer followed by custom stuff like pulling internal scripts from a private network to change the admin password).

In the end this should allow us a flexible means to do passwordless ssh and reset of WinRM password to something non random allowing better out of the box usage of ssh and WinRM.

lonniev commented 10 years ago

I like the concept—except for (4). Cygwin is very useful once it is in place but it has a nasty installer. If you have a way to make including and maintaining it easy, ok. Having it there makes life easier for those of us used to and preferring unix admin.

I was considering setting aside the SL provider to work with a local VirtualBox Windows image to resolve the unexpected challenges with doing relatively minor things (like “sudo vagrant mkdir -p ~/.ssh/authorized_keys”) in bat, powershell, remotely, with Windows security policies.

The overhead of waiting for SL to bring up a new image kills the trial-and-error process.

ju2wheels commented 10 years ago

That's why cyg-apt is included, to make package installation easier. I haven't used it myself, but it's effectively an apt-like interface.

http://stackoverflow.com/questions/9751845/apt-get-for-cygwin