hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

401 Error with WinRM Connection with NTLM #17829

Open justdan96 opened 6 years ago

justdan96 commented 6 years ago

Terraform Version

v0.11.6

Terraform Configuration Files

TBC

Debug Output

https://gist.github.com/justdan96/5d5ee795ce6236bfc035e2986dcaf6aa

Crash Output

Expected Behavior

Connection should be made using NTLM authentication to the Windows box and the script should be run on the remote server.

Actual Behavior

Receive error "http response error: 401" when trying to authenticate.

Steps to Reproduce

  1. terraform init
  2. terraform apply

Additional Context

The workstation and server are joined to different domains. The terraform script just runs a Powershell script on the remote machine. I am having issues redacting files and sending them from my work PC so I will update once I have figured those out. I tried to reproduce the same sort of issue by running a small Ruby script, but that worked fine. I am attaching the redacted Wireshark outputs for comparison.

[Screenshots: ntlm_ruby, ntlm_terraform (redacted Wireshark captures)]

References

NA

justdan96 commented 6 years ago

Here are the Terraform and Ruby scripts used:

Terraform:

resource "null_resource" "remoting" {
  provisioner "remote-exec" {
    inline = [
      "powershell -file D:\\test.ps1 tf-testing123 test terraform",
    ]

    connection {
      type     = "winrm"
      user     = "The.User.Name"
      password = "PASSWORD"
      host     = "MYSERVERWEB0192.mydomain.mycompancorp.com"
      https    = "false"
      insecure = "true"
      use_ntlm = "true"
    }
  }
}

Ruby:

require 'winrm'
opts = {
  endpoint: 'http://MYSERVERWEB0192.mydomain.mycompancorp.com:5985/wsman',
  transport: :negotiate,
  user: 'MYDOMAIN\\The.User.Name',
  password: 'PASSWORD'
}
conn = WinRM::Connection.new(opts)
conn.shell(:cmd) do |shell|
  output = shell.run('powershell -file D:\test.ps1 tf-testing123 test ruby') do |stdout, stderr|
    STDOUT.print stdout
    STDERR.print stderr
  end
  puts "The script exited with exit code #{output.exitcode}"
end

The second screenshot above shows the requests failing. I will post more details once I can redact them properly.

justdan96 commented 6 years ago

Okay so I think I have figured out what the problem is. There are 2 issues:

  1. Our organisation has group policy settings so that AllowUnencrypted=False. The WinRM library Terraform uses does not support encryption over HTTP. The reason Ruby works is that the Ruby WinRM library supports SPNEGO encryption over HTTP.

  2. Even if an HTTPS listener is enabled for WinRM, only connections with a local account are possible. This is because Azure/go-ntlmssp, which masterzen/winrm uses, does not currently support the domain being passed in. I have submitted a pull request against Azure/go-ntlmssp to try to get this added to the main repo.

I can confirm that use of justdan96/go-ntlmssp resolves this issue.

Should the documentation be updated to include the steps for setting up a machine for WinRM and mentioning that SPNEGO encryption over HTTP isn't supported?

Dogers commented 6 years ago

@justdan96 I've managed to connect to a 2016 server with a domain admin account over HTTPS. It appears that if the server is on a domain, the domain is prepended to the username, so it seems that local users are no longer possible? If you try something like ".\username" it gets sent as "domain\.\username" in the traces I've run. If it's off the domain, then the computer name is prepended instead of the domain. HTTP is failing altogether however - I always get 401 there.

Wish MS would hurry up and release SSH for PowerShell..!

justdan96 commented 6 years ago

@Dogers did you test with the updated dependencies from https://github.com/hashicorp/terraform/pull/17887 ?

Dogers commented 6 years ago

@justdan96 no, just TF 0.11.7 and the AWS plugin 1.14 (I think it is). I've got it joining the domain after provisioning with the local admin, then (at the moment, need to expand it to handle the software that'll be there) leaving the domain before being destroyed.

lapfrank12 commented 6 years ago

hey @justdan96, I've compiled TF with your #17887 hoping to get WinRM with NTLM to work over an HTTPS listener, but I'm still getting the 401 error. Did I misunderstand your comments above stating that it worked after you did that?

justdan96 commented 6 years ago

Yeah I got it to work correctly. I generated a self signed cert for WinRM. I had to set the Terraform connection configuration to HTTPS, set the HTTPS port, set insecure and use_ntlm to true and specify username as DOMAIN\USERNAME.

You may have issues with the security policy settings on the server, so it is worthwhile checking if there are any funky settings there.
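
For reference, the connection settings described in this comment might look roughly like the sketch below. The host name, credentials and script path are placeholders carried over from the earlier example, and port 5986 assumes the default WinRM HTTPS listener port. It has not been run against a real server.

```hcl
# Sketch only: host, credentials and script path are placeholders.
resource "null_resource" "remoting" {
  provisioner "remote-exec" {
    inline = [
      "powershell -file D:\\test.ps1",
    ]

    connection {
      type     = "winrm"
      host     = "MYSERVERWEB0192.mydomain.mycompancorp.com"
      port     = 5986                       # assumed: default WinRM HTTPS listener port
      user     = "MYDOMAIN\\The.User.Name"  # DOMAIN\USERNAME, backslash escaped for HCL
      password = "PASSWORD"
      https    = true
      insecure = true                       # skip certificate validation (self-signed cert)
      use_ntlm = true
    }
  }
}
```

Note that `insecure = true` only relaxes certificate checking; the traffic still goes over TLS.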

lapfrank12 commented 6 years ago

Thanks justdan96. I'll give it another try. I was using the exact same options. I don't have much Windows background, so I'm not sure exactly which security settings to look at. Do you remember by any chance which ones you had issues with? Event Viewer is not giving me any clues or errors at all.

fckbo commented 5 years ago

Hi guys,

I've had exactly the same problem since I joined my Windows 2016 servers to a domain during provisioning (the servers are VMs provisioned on vSphere 6.5): I cannot execute remote PowerShell scripts on them. Here is the section of my template that tries to connect to execute scripts:

  connection {
    type     = "winrm"
    user     = "MYDOMAIN\\Administrator"
    password = "DOMAIN_ADMIN_PWD"
    host     = "${var.ADM_IP}"  # IPv4 address of the virtual machine (=MYSERVER_IP)
    insecure = true
    https    = false
    use_ntlm = true
    timeout  = "1m"
  }

I also have the same problem when using:

       user = "Administrator"
       password = "LOCAL_ADMIN_PWD"

I'm getting a 401, even though testing WinRM commands from a Win2016 machine that is not a member of the domain seems to work fine, for example executing a remote 'dir' command using:

winrs -r:http://MYSERVER_IP:5985  -u:MYDOMAIN\Administrator -p:DOMAIN_ADMIN_PWD dir
or
winrs -r:MYSERVER_IP  -u:MYDOMAIN\Administrator -p:DOMAIN_ADMIN_PWD dir
or
winrs -r:MYSERVER_IP  -u:Administrator -p:LOCAL_ADMIN_PWD dir

scott1138 commented 5 years ago

I'm seeing the same thing. When I check the Windows Security log I can see the login attempt failing because the username is "SERVER_NAME\USER_NAME" and the Account Domain shows "DOMAIN_NAME". Prior to the domain join the Account Domain was "SERVER_NAME". I can connect directly using Test-WinRM using a PSCredential object that has just "USER_NAME" defined. Any thoughts on this?

bodgit commented 5 years ago

So I think I'm being bitten by this bug in some form. Initially I have basic auth enabled and only an HTTP listener (enabling HTTPS will no doubt open a can of worms around certificate validation). Terraform initially runs using the AWS-provided credentials and uses them to join the machine to the domain. Terraform then tries to use domain credentials to do the remainder of the provisioning, and at that point I always get a 401 error.

I've tested using a basic script utilising the Python WinRM client and I can get that to log in using the same domain credentials, so I have a set of packet captures that show when it works and when it doesn't. It looks like the lack of SPNEGO support that @justdan96 mentioned in https://github.com/hashicorp/terraform/issues/17829#issuecomment-381646616, as I can see the POSTs to /wsman setting the Content-Type header to multipart/encrypted;protocol="application/HTTP-SPNEGO-session-encrypted";boundary="Encrypted Boundary" once credentials have been negotiated. I don't think I'm using Kerberos or CredSSP here, as I don't have any of those Python dependencies installed, so I think it's just using NTLM SSP, which would be enough to inch things forward.

I'm trying to work out exactly which Go library needs fixing to support this, is it Azure/go-ntlmssp or masterzen/winrm?

scott1138 commented 5 years ago

I was able to get around the issue by using basic auth. It seems when NTLM is enabled it tries to use the native domain of the remote server and won't supply a specific domain name. When using basic it seems to always use the computer name. Some improvements around WinRM authentication would be nice.
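
A minimal sketch of what a basic-auth connection over HTTP might look like. This is not scott1138's actual configuration: the resource name, host and credentials are placeholders, and it assumes the server's WinRM service permits Basic authentication and unencrypted traffic.

```hcl
# Sketch only: resource name, host and credentials are placeholders.
resource "null_resource" "basic_auth_example" {
  provisioner "remote-exec" {
    inline = [
      "echo hello",
    ]

    connection {
      type     = "winrm"
      host     = "SERVER_NAME"
      port     = 5985         # assumed: default WinRM HTTP listener port
      user     = "USER_NAME"  # no domain prefix
      password = "PASSWORD"
      https    = false
      use_ntlm = false        # false selects basic authentication
    }
  }
}
```

With `https = false`, the credentials and commands travel unencrypted, so this is only a reasonable trade-off on a trusted network.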

bodgit commented 5 years ago

It looks like the fix proposed in #17887 is already available in Terraform version 0.12.0 onwards. If I enable WinRM over HTTPS then I can now log in using a domain account with the following:

    connection {
      type     = "winrm"
      host     = "${element(module.server.private_ip, count.index)}"
      port     = 5986
      user     = "EXAMPLE\\account"
      password = "secret"
      use_ntlm = true
      insecure = true
      https    = true
    }

However I now have a bunch of unrelated fallout related to 0.11.0 -> 0.12.0.

irynadiudiuk commented 5 years ago

> I was able to get around the issue by using basic auth. It seems when NTLM is enabled it tries to use the native domain of the remote server and won't supply a specific domain name. When using basic it seems to always use the computer name. Some improvements around WinRM authentication would be nice.

Could you please show your working code? My instance is also in the domain. I tried the following on Terraform 0.11.8 and it still doesn't work.

    connection {
      type     = "${var.connection_type}"
      user     = "${var.user}"
      timeout  = "15m"
      https    = "false"
      insecure = "true"
      use_ntlm = "true"
      password = "${rsadecrypt(self.password_data, file("/../../my.pem"))}"
    }

davejahahn commented 5 years ago

I'm experiencing this same issue, with 0.12.5

resource "null_resource" "exec_test" {
  provisioner "remote-exec" {
    connection {
      type     = "winrm"
      host     = "MYHOST"
      port     = 5986
      user     = "MYDOMAIN\MYUSER"
      password = "MYPASSWORD"
      use_ntlm = true
      insecure = true
      https    = true
    }

    inline = [
      "echo hello!",
    ]
  }
}

I can connect fine with Invoke-Command from the same machine that terraform is running this on against the said host and execute commands.

I've tried every permutation of use_ntlm, insecure, and https also.

bodgit commented 5 years ago

@davejahahn Have you escaped the \ correctly in the user field? Otherwise that looks pretty much like what I'm successfully using.

davejahahn commented 5 years ago

@bodgit yes, it is escaped, e.g. user = "domain\\myuser". The extra backslash was dropped when I posted it.

davejahahn commented 5 years ago

@bodgit curious what version of Terraform you are using. Since I can access the host remotely via PowerShell, I know it's not blocked ports or PowerShell security configuration, so if it is working for you, I'm wondering what the difference might be. I also tried against both 2016 and 2012R2, from a different machine on a different subnet, etc., with the same result. The debug log doesn't show anything useful, or at least I can't find anything.

bodgit commented 5 years ago

@davejahahn I used 0.12.1 which was current at the time. What is the output of winrm get winrm/config on one of your hosts?

davejahahn commented 5 years ago

@bodgit I tried an older version and had the same problem, so not sure that is it. What it outputs is this, over and over (with MYHOST and USER swapped out for this post):

module.exec_test.null_resource.exec_test (remote-exec): Connecting to remote host via WinRM...
module.exec_test.null_resource.exec_test (remote-exec): Host: MYHOST
module.exec_test.null_resource.exec_test (remote-exec): Port: 5986
module.exec_test.null_resource.exec_test (remote-exec): User: DOMAIN\USER
module.exec_test.null_resource.exec_test (remote-exec): Password: true
module.exec_test.null_resource.exec_test (remote-exec): HTTPS: false
module.exec_test.null_resource.exec_test (remote-exec): Insecure: false
module.exec_test.null_resource.exec_test (remote-exec): NTLM: true
module.exec_test.null_resource.exec_test (remote-exec): CACert: false
module.exec_test.null_resource.exec_test: Still creating... [10s elapsed]

bodgit commented 5 years ago

> module.exec_test.null_resource.exec_test (remote-exec): Connecting to remote host via WinRM...
> module.exec_test.null_resource.exec_test (remote-exec): Host: MYHOST
> module.exec_test.null_resource.exec_test (remote-exec): Port: 5986
> module.exec_test.null_resource.exec_test (remote-exec): User: DOMAIN\USER
> module.exec_test.null_resource.exec_test (remote-exec): Password: true
> module.exec_test.null_resource.exec_test (remote-exec): HTTPS: false
> module.exec_test.null_resource.exec_test (remote-exec): Insecure: false
> module.exec_test.null_resource.exec_test (remote-exec): NTLM: true
> module.exec_test.null_resource.exec_test (remote-exec): CACert: false
> module.exec_test.null_resource.exec_test: Still creating... [10s elapsed]

That suggests you're trying to do HTTP (HTTPS: false) to the HTTPS port (Port: 5986) which won't work at all.
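
One way to keep the port and the https flag from drifting apart is to derive the port from a single boolean. A rough sketch in Terraform 0.12 syntax follows. The variable name winrm_https is made up for illustration, the host and credentials are the placeholders from the post above, and 5985/5986 are assumed to be the default WinRM HTTP/HTTPS listener ports.

```hcl
# Hypothetical variable: one switch controls both the scheme and the port.
variable "winrm_https" {
  type    = bool
  default = true
}

resource "null_resource" "exec_test" {
  provisioner "remote-exec" {
    connection {
      type     = "winrm"
      host     = "MYHOST"
      user     = "MYDOMAIN\\MYUSER"
      password = "MYPASSWORD"
      use_ntlm = true
      insecure = true
      https    = var.winrm_https
      port     = var.winrm_https ? 5986 : 5985  # HTTPS listener vs HTTP listener
    }

    inline = [
      "echo hello!",
    ]
  }
}
```

With this shape, the "Port: 5986 / HTTPS: false" combination from the log cannot occur by accident.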

davejahahn commented 5 years ago

@bodgit that makes sense, and I have tried every combination of the three settings (port, https, and insecure); the one in the post was just the last one I used.

Just to make sure I didn't miss something, I just tried again with these combinations:

port  insecure  https
5985  true      true
5985  true      false
5985  false     true
5985  false     false
5986  true      true
5986  true      false
5986  false     true
5986  false     false

They all behave exactly the same: it appears that it doesn't connect.

Kind of stumped. The only thing I can think of is how it connects by default when you do an Invoke-Command.

When I run on the same box that I am running the Terraform command, using the same credential and endpoint server, it works, e.g.

$cred = Get-Credential
Invoke-Command -ComputerName "MYSERVER" -ScriptBlock { echo "hello!" } -Credential $cred
hello!

davejahahn commented 5 years ago

Looking at the documentation for Invoke-Command, the defaults (which are working as above) are:

Credential: Default
Port: 5985
UseSSL: false

So if it is working with defaults in PowerShell (Port = 5985, useSSL = false, Credential = Default), and it doesn't work in Terraform, it must be the authentication which is causing me grief.

The options for Authentication with Invoke-Command are: Default, Basic, Credssp, Digest, Kerberos, Negotiate, and NegotiateWithImplicitCredential

Terraform documentation says use_ntlm = false uses Basic authentication, so I know I'm not using that since I have it set to true. Invoking with PowerShell, Default, Kerberos, and Negotiate all work fine.

What doesn't work is: Digest, Credssp, and NegotiateWithImplicitCredential. So I guess what it comes down to is: when use_ntlm = true, which of these is it using? I will try getting each to work from PowerShell on the box, then see whether it works in Terraform too.

bodgit commented 5 years ago

The problem is NTLM has lots of negotiated options, almost all of which are likely supported by PowerShell (and maybe other WinRM clients written in other languages), but very few are supported by the Go library used. This means the server can reject your session for reasons other than the username/password being simply wrong.

This is why I asked for the output of winrm get winrm/config on one of your servers, there might be something obvious there.

davejahahn commented 5 years ago

@bodgit sorry, I totally misunderstood what you were asking for. Unfortunately I can't post the output of a server configuration publicly due to security requirements at our organization.

Do you know which negotiated options are supported by the Go library that is used? I kind of came to the same conclusion that it is not as robust.

I appreciate your feedback!!

TechnicallyJoe commented 5 years ago

@davejahahn Did you get anywhere with this? - I'm experiencing the exact issue you are.

Update:

With a null_resource I tried executing the provisioner again (so I could test), and it turns out that I needed to run these 2 commands:

winrm set winrm/config/service/auth '@{Basic="true"}'
winrm set winrm/config/service '@{AllowUnencrypted="true"}'

Then I could connect successfully. Now I just need to figure out how to do this when the VM is created. Any ideas are welcome!

Update 2:

So, I just found this module: https://github.com/innovationnorway/terraform-azurerm-vm-run-command

It appears to allow me to run commands on a VM. Underneath, it seems to use the Azure VM extension, which I'm sure you can also use. It uses the Azure Agent, if I'm not mistaken.

I rolled back my Basic and allowunencrypted settings before trying it out and it still worked :)

Hope it helps!

andrew-sumner commented 2 years ago

Are we any closer to a solution for using WinRM from Terraform over HTTP when the target server is domain joined and configured with AllowUnencrypted="false"?