hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

remote-exec ssh Not supporting UserKnownHostsFile / StrictHostKeyChecking #16938

Closed omerlin closed 6 years ago

omerlin commented 6 years ago

Terraform Version

[opc@instance1 oci]$ terraform -v
Terraform v0.11.1

Terraform Configuration Files

resource "null_resource" "remote-exec" {
    depends_on = ["oci_core_instance.CAS","oci_core_volume_attachment.VOLCASATT"]

    provisioner "remote-exec" {
      connection {
        type = "ssh"
        agent = false
        timeout = "30s"
        host = "${data.oci_core_vnic.InstanceVnic.private_ip_address}"
        user = "opc"
        private_key = "${file(var.ssh_private_key)}"
      }
      inline = [
        "touch ~/IMadeAFile.Right.Here",
        "sudo iscsiadm -m node -o new -T ${oci_core_volume_attachment.VOLCASATT.iqn} -p ${oci_core_volume_attachment.VOLCASATT.ipv4}:${oci_core_volume_attachment.VOLCASATT.port}",
        "sudo iscsiadm -m node -o update -T ${oci_core_volume_attachment.VOLCASATT.iqn} -n node.startup -v automatic",
        "echo sudo iscsiadm -m node -T ${oci_core_volume_attachment.VOLCASATT.iqn} -p ${oci_core_volume_attachment.VOLCASATT.ipv4}:${oci_core_volume_attachment.VOLCASATT.port} -l >> ~/.bashrc"
      ]
    }
}

Debug Output

2017-12-18T13:31:13.798Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:13 connecting to TCP connection for SSH
2017-12-18T13:31:16.806Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:16 connection error: dial tcp 10.0.0.11:22: getsockopt: connection refused
2017-12-18T13:31:16.806Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:16 [WARN] retryable error: dial tcp 10.0.0.11:22: getsockopt: connection refused
2017-12-18T13:31:16.806Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:16 [INFO] sleeping for 1s
2017/12/18 13:31:17 [TRACE] dag/walk: vertex "root", waiting for: "provider.null (close)"
2017/12/18 13:31:17 [TRACE] dag/walk: vertex "provider.null (close)", waiting for: "module.cassandra1.null_resource.remote-exec"
2017/12/18 13:31:17 [TRACE] dag/walk: vertex "provisioner.remote-exec (close)", waiting for: "module.cassandra1.null_resource.remote-exec"
2017/12/18 13:31:17 [TRACE] dag/walk: vertex "meta.count-boundary (count boundary fixup)", waiting for: "module.cassandra1.null_resource.remote-exec"
2017-12-18T13:31:17.807Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:17 connecting to TCP connection for SSH
2017-12-18T13:31:17.807Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:17 connection error: dial tcp 10.0.0.11:22: getsockopt: connection refused
2017-12-18T13:31:17.807Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:17 [WARN] retryable error: dial tcp 10.0.0.11:22: getsockopt: connection refused

Crash Output

Expected Behavior

The remote-exec provisioner fails. However, running the "same command" manually works:

ssh -i /tmp/oci_odc_admin 10.0.0.11 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no

Without the options that disable host key checking, ssh prompts for confirmation because this is the first connection to the host. I suspect the problem is that this prompt is not handled in the Go ssh code (communicator.go in the ssh package).

Actual Behavior

Steps to Reproduce

terraform init
terraform apply

Important Factoids

References

jbardin commented 6 years ago

Hi @omerlin,

Terraform doesn't use the openssh binaries, so the openssh config files and options aren't something we can support in full. However, at least in this example I don't think the failure has anything to do with UserKnownHostsFile or StrictHostKeyChecking, because the error is simply connection refused.

Can you verify the host and port that ssh connects to from the cli? My first guess is that there is either something host specific in the ssh config changing the port, or a proxy/bastion configuration of some sort.

omerlin commented 6 years ago

To be honest, I knew Terraform does not support these extra ssh options, even though Ansible has such an option. Even if the Go ssh code is not based on the ssh client, the sshd server expects a host confirmation at first connection. I agree that the message "Connection refused" looks more like a symptom of a bad host or port. But I have run the test many times, and each time I ran the ssh CLI test by copying and pasting from the Terraform logs. There is no bastion or proxy on this brand-new VLAN, only a private network in the range 10.0.0.0/16.

I think I have an alternative, which is to use a userdata bootstrap script, but I filed this issue anyway because I think remote-exec is an important Terraform feature.

omerlin commented 6 years ago

https://groups.google.com/forum/m/#!msg/golang-nuts/1zfmRLpfn_w/W0ySGaliBQAJ

This seems similar to my problem.

jbardin commented 6 years ago

I'm not sure what you mean by "the sshd server is expecting a host confirmation"; do you have some host-based auth also configured on the server?

Even if there is something that might reject the host in the server config, getsockopt: connection refused means you couldn't create a TCP connection, which means that we never got to the ssh protocol at all. I agree that the golang-nuts post sounds very similar, and may have the same root cause, but neither of these cases is failing due to ssh StrictHostKeyChecking, because there is no TCP connection.

Can you run your ssh example with debug output to see exactly what it's doing?

danvaida commented 6 years ago

@jbardin would you suggest opening a separate issue for supporting options such as StrictHostKeyChecking?

jbardin commented 6 years ago

Closing this as there's been no update in a while. If there is still an issue, feel free to reply, and we can open it again.

While it's not related to the symptoms described here, it is related to the issue title -- the next release will support ssh host key verification via #17354

mboudreau commented 6 years ago

@jbardin I ran into a similar issue: the known_hosts file on my computer already had an entry for a box I had connected to in the past, and its IP had since been recycled. When I ran terraform apply, it just kept 'creating' at the file provisioning stage, waiting until it fully timed out or was interrupted.

First, I'd say the messaging isn't great: I wasted an hour figuring out why I couldn't apply to this instance, because it just seemed stuck. Second, it would be great to have an option to explicitly ignore strict host key checking.

jbardin commented 6 years ago

@mboudreau,

Terraform does not use the system ssh binaries and does not read the known_hosts file. If terraform is unable to connect to the remote host, it's unrelated to that file or any configuration options in your ssh config.

mboudreau commented 6 years ago

@jbardin yep, you're right, I jumped the gun on that. I still can't figure out why rsync seems to just hang and then disconnect when doing file provisioning. I can't even get it working with rsync directly, which is weird...

ghost commented 4 years ago

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.