How to support Microsoft workloads in plumbery?

bernard357 commented 8 years ago

At the moment plumbery is able to establish ssh connections with Linux nodes, and to use cloud-init to contextualise them. We need a similar capability for workloads based on the Windows operating system. How to connect with Windows nodes remotely? How to act on these nodes?

First of all, there is a software equivalent to cloud-init for Windows, named cloudbase-init: http://cloudbase-init.readthedocs.org/en/latest/

This could be used in a fittings plan with a new directive, e.g., cloudbase-init:, and plumbery would create a file to be sent to remote nodes. The question is then how such a file could be transmitted to remote nodes.
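To make the idea concrete, here is a minimal sketch of how plumbery might turn such a directive into a user data file. It assumes the directive carries a list of PowerShell lines, which is only one possible design; the function name and the sample commands are hypothetical. It relies on the cloudbase-init convention that the first line of the user data selects the interpreter, with `#ps1_sysnative` selecting native PowerShell.

```python
def render_userdata(powershell_lines):
    """Return cloudbase-init user data that runs the given PowerShell lines.

    cloudbase-init picks the interpreter from the first line of the user
    data; "#ps1_sysnative" asks for the native PowerShell.
    """
    body = "\n".join(line.rstrip() for line in powershell_lines)
    return "#ps1_sysnative\n" + body + "\n"


# Hypothetical content of a "cloudbase-init:" directive in a fittings plan.
directive = [
    "Rename-Computer -NewName 'dc01'",
    "Install-WindowsFeature AD-Domain-Services",
]
userdata = render_userdata(directive)
print(userdata)
```

How the resulting file reaches the node is a separate question, as noted above.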

WinRM and Powershell could be an option for this: https://dunniganp.wordpress.com/2014/06/13/using-winrm-and-powershell-to-write-files-on-windows/
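As a rough sketch of that approach, the file content can be embedded in a PowerShell one-liner and sent as a single remote command. The helper names and the target path below are hypothetical, and the transport that actually runs the command (WinRM or anything else) is left out. The only PowerShell behaviour relied on is that `-EncodedCommand` expects Base64 over UTF-16LE text.

```python
import base64


def encode_powershell(script):
    """Encode a script for `powershell.exe -EncodedCommand` (Base64 of UTF-16LE)."""
    return base64.b64encode(script.encode("utf-16-le")).decode("ascii")


def build_write_file_command(remote_path, content):
    """Return a command line that recreates `content` at `remote_path`."""
    payload = base64.b64encode(content.encode("utf-8")).decode("ascii")
    # Single quotes inside a PowerShell single-quoted string are doubled.
    quoted_path = remote_path.replace("'", "''")
    script = (
        "[IO.File]::WriteAllBytes('{0}', "
        "[Convert]::FromBase64String('{1}'))".format(quoted_path, payload)
    )
    return "powershell.exe -NoProfile -EncodedCommand " + encode_powershell(script)


# Hypothetical target path and content, for illustration only.
command = build_write_file_command(
    r"C:\plumbery\userdata.ps1", "Write-Host 'hello'\n"
)
print(command)
```

Command lines have length limits, so larger files would probably need to be sent in several chunks.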

This is early thinking, so any contribution on the topic is welcome.

bernard357 commented 8 years ago

Another interesting post: https://cloudbase.it/windows-without-passwords-in-openstack/

bernard357 commented 8 years ago

There is a concrete use case for the orchestration of multiple Windows servers in France. One domain controller, one file server, one application server, to be deployed and configured after deployment. Could this be the appropriate trigger for some Windows expert to jump in? Thanks in advance

tonybaloney commented 8 years ago

OK, so I've done my research.

The ultimate goal is to have a proper configuration-management agent on the machine, like a salt minion or a chef client.

Our default images could have WinRM, but WinRM (in Microsoft's genius) doesn't allow Remote Management by default. That's right: Windows Remote Management does not allow remote management by default.

So a better option is to use RDP as a remote execution engine, since we know it works reliably cross-platform. BUT, we want this thing to run inside a Docker container, which will be based on a Linux subsystem. There is an executable called winexe, so I found some code to wrap around that in Python and published a package to PyPI.

https://pypi.python.org/pypi?name=pywinexe&version=1.0.0&:action=display

So the eventual (once I finish) implementation will be:

Offer generic Windows commands inside the node configuration (including powershell)
Offer a polisher that bootstraps Salt or Chef
Support Salt states or Chef recipes within the fittings file

CC @bernard357 @asimkhawaja @tintoy

tintoy commented 8 years ago

@tonybaloney hey let's talk about this tomorrow, I have some ideas regarding WinRM...

They do aim for secure-by-default but enablement of remote access does survive sysprep.

tintoy commented 8 years ago

(at least I'm pretty sure it does)

tintoy commented 8 years ago

Just did a quick experiment:

1. Created a new VM from default w2k12 image.
2. Verified that I can't connect to the VM using Powershell remoting (i.e. Enter-PSSession).
3. Signed into VM and ran Enable-PSRemoting -Force.
4. Verified that I can connect to the VM using Powershell remoting (i.e. Enter-PSSession).
5. Cloned the VM to create a new image.
6. Deployed the new image to create a new VM.
7. Verified that I can connect to the new VM using Powershell remoting (i.e. Enter-PSSession).

So it looks like that does persist after cloning / sysprep.

tonybaloney commented 8 years ago

I came to similar conclusions. But the problem is that you need to automate that change in the first place, so by that stage it's just as easy to install a proper remote execution agent. I don't want to leave a security hole on every machine we deploy.

tintoy commented 8 years ago

@tonybaloney Actually, I did some further testing. PS remoting is already enabled in the base w2k12 image! So it's possible right out of the gate (no additional configuration required, just connect using New-PSSession).
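From a Linux client, a quick way to repeat this kind of check is to probe the default WinRM ports (5985 for HTTP, 5986 for HTTPS). This is only a sketch: an open port shows that a listener is reachable, not that authentication or a remoting session will succeed, and the function names are made up for illustration.

```python
import socket

WINRM_HTTP_PORT = 5985   # default WinRM listener over HTTP
WINRM_HTTPS_PORT = 5986  # default WinRM listener over HTTPS


def is_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def winrm_listeners(host, probe=is_port_open):
    """Report which default WinRM ports accept TCP connections on host."""
    return {
        "http": probe(host, WINRM_HTTP_PORT),
        "https": probe(host, WINRM_HTTPS_PORT),
    }
```

Calling `winrm_listeners()` with the address of a freshly deployed node would show whether either listener is reachable through the firewall.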

tintoy commented 8 years ago

BTW, I have to ask - how is a remote execution engine more secure than Powershell remoting? I mean, they have a whole team of people to ensure that it's secure, no? ;-)

tonybaloney commented 8 years ago

They do. I was reviewing the scripts that Ansible and SaltStack use to bootstrap machines; they make a series of changes whose repercussions I don't understand, but they look bad! RDP is 96-bit TLS by default. WinRM can use HTTPS, but you then get into the realm of certificate management, trust, rotation, yuk. We can probably support both approaches, but I need to refactor some code first.

anthonylangsworth commented 8 years ago

Let's talk a bit about the security settings here. I am not sure what you mean by 96-bit TLS. We should enable HTTPS by default for WinRM, even if we use self-signed certificates. We need to be secure by default.

tonybaloney commented 8 years ago

This is how Ansible does it. I don't know enough about Windows security policy to judge if this is good or bad, but it seems overly complex: https://github.com/ansible/ansible/blob/devel/examples/scripts/ConfigureRemotingForAnsible.ps1

tintoy commented 8 years ago

Hmm - sounds like they have problems handling CNG certificates...

tonybaloney commented 8 years ago

Another example (from Microsoft) https://blogs.technet.microsoft.com/heyscriptingguy/2015/10/27/using-winrm-on-linux/ That sounds bad - "winrm set winrm/config/service @{AllowUnencrypted="true"}"

anthonylangsworth commented 8 years ago

Regarding the Ansible script, it configures WinRM to use basic authentication over SSL with a self-signed SSL server certificate. Ideally we would want mutual certificate authentication between server and client, but what they have is OK. Regarding the SSL server certificate, it uses a 1024-bit RSA key, which is too small. It should be 2048 bits or greater. It also omits the hash algorithm, meaning it defaults to the now deprecated SHA-1, if I recall correctly. It should use SHA-256 or better. The script makes some assumptions around the computer name used in the certificate subject, but that is understandable. It also repeats the code to create a new certificate around lines 129-152 and 159-186, but that is a code hygiene issue and not a security issue. That said, it is a lot better than similar configuration scripts I have seen.
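The two certificate thresholds above can be written down as a small check. This is a sketch only: it takes the key size and hash name as plain parameters rather than parsing a real certificate, and the function name is made up for illustration.

```python
MIN_RSA_BITS = 2048
WEAK_HASHES = {"md5", "sha1"}


def certificate_findings(rsa_bits, hash_name):
    """Return a list of problems with the given certificate parameters."""
    findings = []
    if rsa_bits < MIN_RSA_BITS:
        findings.append(
            "RSA key of %d bits is below %d" % (rsa_bits, MIN_RSA_BITS)
        )
    # Accept spellings such as "SHA1", "sha-1" and "SHA-1".
    if hash_name.lower().replace("-", "") in WEAK_HASHES:
        findings.append("%s signatures are deprecated" % hash_name)
    return findings


# The parameters described above: a 1024-bit key with a SHA-1 signature.
for finding in certificate_findings(1024, "SHA-1"):
    print(finding)
```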

Regarding setting AllowUnencrypted="true", this is bad. See https://blogs.msdn.microsoft.com/powershell/2015/10/27/compromising-yourself-with-winrms-allowunencrypted-true/ for a better explanation than I have time to write.

tonybaloney commented 8 years ago

code speaks a 1000 words https://github.com/DimensionDataCBUSydney/plumbery/commit/712d425d55de4924249ee20a60d8c3914b541091

I want to replace the Ansible line with our own best-practice PowerShell script. But you probably see what I'm trying to do..

Try to connect over WinRM; if it doesn't work (which it probably won't), use winexe (a Linux version of psexec) to connect and enable it.
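Stripped of the transport details, that flow looks roughly like the sketch below. This is not the code from the commit: the two callables are placeholders for whatever actually opens a WinRM session and whatever runs the enabling script through the other channel.

```python
def ensure_winrm(node, try_winrm, enable_via_fallback):
    """Return the name of the path that made WinRM usable on node.

    try_winrm(node) -> bool: True if a WinRM session could be opened.
    enable_via_fallback(node): enables WinRM through another channel.
    """
    if try_winrm(node):
        return "winrm"
    enable_via_fallback(node)
    if try_winrm(node):
        return "fallback"
    raise RuntimeError("WinRM still unreachable on %s after fallback" % node)


# Stand-ins for illustration: WinRM is off until the fallback enables it.
state = {"enabled": False}
result = ensure_winrm(
    "node-1",
    try_winrm=lambda node: state["enabled"],
    enable_via_fallback=lambda node: state.update(enabled=True),
)
print(result)
```

Keeping the two channels behind plain callables means the fallback can be swapped for something else later without touching this logic.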

anthonylangsworth commented 8 years ago

The challenge with using winexe, assuming it works like psexec, is that it sends credentials in clear text, although there is a work around. See https://digital-forensics.sans.org/blog/2010/06/01/protecting-admin-passwords-remote-response-forensics/ for more information.

cwkendall commented 8 years ago

I've done a little bit of tinkering on this in the last few days to provision win2012 images using plumbery.
I ended up modifying the windows polisher in 1.0.0 to use impacket (https://github.com/CoreSecurity/impacket/) instead of winexe. The main advantages are that Impacket is pure Python and is available on PyPI.

Impacket uses NTLM auth on top of SMB (port 445 or 139) to open up a remote shell, and I've been using it to bootstrap the WinRM code that Tony prepared earlier. However, I'd like to move away from using unencrypted WinRM. Looking at how cloudbase-init (https://github.com/openstack/cloudbase-init) does it, the preferred solution would be to pre-provision a certificate over HTTPS via the MCP API at deployment time, which would mean that WinRM could be pre-configured with HTTPS support.

Until that time, I was thinking SMBClient over encrypted SMB3 might be the way to go, but I haven't tested it yet. Can anyone think of issues with doing it that way?

SMBClient also works quite similarly to SSH, which means that the 'prepare:' block used for SSH deployments could be adapted for Windows hosts.

I also noticed that the ScriptDeployment and FileDeployment interfaces are actually libcloud primitives. Would it make more sense to add the Windows deployment capabilities to libcloud (e.g. using SMBClient rather than SSH)?

tonybaloney commented 8 years ago

Sweet. That gets rid of the winexe IPv6 issue!

wrt the certificate, I had issues with certificates on the image that I was testing with (they were broken out of the box until you open IE by hand, weird).

I really hate WinRM: it's overly complex, it's (seemingly) unreliable from any platform other than PowerShell, and the reliance on SSL certs is stupid when a simple transport layer encryption would have sufficed (like RDP).

The WinRM config tool is also really unreliable. You create this dictionary on the machine and edit keys using a command line, and it doesn't support PowerShell escaping, so you need to run it via cmd.exe.

Adding SMBClient to libcloud could certainly be a good idea, but most people don't deploy Windows using automation, one can't imagine why...

anthonylangsworth commented 8 years ago

The downside is requiring the CIFS (tcp/445) or SMB (tcp/139) ports to be Internet accessible, which is generally a bad idea. However, short of requiring access to a CA to issue TLS server certificates for WinRM, I cannot think of a better solution.

tonybaloney commented 8 years ago

We can use Adam's "sesame" project to port IPv6 access temporarily so at least the client doesn't need to NAT anything. I can't see any other option (other than changing our base images to have WinRM set up properly by default).

anthonylangsworth commented 8 years ago

Good idea. Opening the ports temporarily to the plumbery server's IP address (only) for a short time is acceptable from a security perspective.

tintoy commented 8 years ago

I'll have to extend it to do IPv6 but that shouldn't be hard.

bernard357 commented 7 years ago

guys, any concrete contribution or PR on this one? Thanks

tintoy commented 7 years ago

Sorry, I haven't had time to look into this (and may not before January), but I do know how to do it. We can use libcloud to:

1. Disable the network domain's default "disallow all externally-originating ipv6" rule.
2. Create a firewall rule permitting access from the client's IPv6 address (whatever that may be, as long as it's internet-routable).
3. Do what needs to be done.
4. Revert 1 and 2.
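The four steps above map naturally onto a context manager, so that the revert happens even when the work in the middle fails. The sketch below is deliberately not written against libcloud: the `firewall` object and its three methods are hypothetical stand-ins for whatever driver calls end up being used, and the address comes from the IPv6 documentation range.

```python
from contextlib import contextmanager


@contextmanager
def temporary_ipv6_access(firewall, client_ipv6):
    """Open IPv6 access for one client, and always close it again."""
    firewall.set_default_ipv6_block(enabled=False)      # step 1
    rule = None
    try:
        rule = firewall.allow_from(client_ipv6)         # step 2
        yield rule                                      # step 3 runs here
    finally:
        if rule is not None:
            firewall.remove(rule)                       # revert step 2
        firewall.set_default_ipv6_block(enabled=True)   # revert step 1


class RecordingFirewall:
    """Stand-in that records calls; a real one would wrap the cloud API."""

    def __init__(self):
        self.calls = []

    def set_default_ipv6_block(self, enabled):
        self.calls.append(("default_block", enabled))

    def allow_from(self, address):
        self.calls.append(("allow", address))
        return "rule-for-" + address

    def remove(self, rule):
        self.calls.append(("remove", rule))


firewall = RecordingFirewall()
with temporary_ipv6_access(firewall, "2001:db8::10"):
    firewall.calls.append(("work", None))  # e.g. bootstrap the Windows node
print(firewall.calls)
```

The `finally` clause is the point of the design: a failed bootstrap should not leave the network domain open.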

bernard357 commented 7 years ago

thanks @tintoy - this will be changed to [wip] when you can start working on it :-)

NTTLimitedRD / plumbery

How to support Microsoft workloads in plumbery? #18