Closed omerlin closed 6 years ago
Hi @omerlin,
Terraform doesn't use the openssh binaries, so the openssh config files and options aren't something we can support in full. However, at least in this example I don't think the failure has anything to do with UserKnownHostsFile or StrictHostKeyChecking, because the error is simply connection refused.
Can you verify the host and port that ssh connects to from the cli? My first guess is that there is either something host specific in the ssh config changing the port, or a proxy/bastion configuration of some sort.
To be honest, I knew Terraform does not support these extra ssh options, even though Ansible has such an option. Even if the Go ssh code is not based on the ssh client, the sshd server expects a host confirmation at first connection. I agree that the message "connection refused" is more a symptom of a bad host or port, but I have run the test many times and always did the ssh cli test with a copy-paste from the Terraform logs. There is no bastion or proxy on this brand new VLAN, only a private network in the range 10.0.0.0/16.
I think I have an alternative, which is to use a userdata bootstrap script, but I have filed this issue anyway because I think remote-exec is an important Terraform feature.
https://groups.google.com/forum/m/#!msg/golang-nuts/1zfmRLpfn_w/W0ySGaliBQAJ Seems similar to my problem
I'm not sure what you mean by "the sshd server is expecting a host confirmation"; do you have some host-based auth also configured on the server?
Even if there is something in the server config that might reject the host, getsockopt: connection refused means you couldn't create a TCP connection, which means that we never got to the ssh protocol at all. I agree that the golang-nuts post sounds very similar, and may have the same root cause, but neither of these cases is failing due to ssh StrictHostKeyChecking, because there is no TCP connection.
Can you run your ssh example with debug output to see exactly what it's doing?
@jbardin would you suggest opening a separate issue for supporting options such as StrictHostKeyChecking?
Closing this as there's been no update in a while. If there is still an issue, feel free to reply, and we can open it again.
While it's not related to the symptoms described here, it is related to the issue title -- the next release will support ssh host key verification via #17354
@jbardin I got a similar issue to this, where the known_hosts on my computer already had an entry for a box that I had connected to in the past and whose IP got recycled. When I tried to do a terraform apply, it just kept 'creating' at the file provisioning stage and waited until it fully timed out or an interrupt happened.
First, I'd say that the messaging isn't great since I wasted an hour just figuring out why I couldn't apply to this instance, it just seemed stuck, and second, it would be great to have an option where you can specifically mention to ignore strict host checking.
@mboudreau,
Terraform does not use the system ssh binaries and does not read the known_hosts file. If terraform is unable to connect to the remote host, it's unrelated to that file or any configuration options in your ssh config.
@jbardin yep you're right, I jumped the gun too quickly on that. I still can't figure out why rsync seems to just hang and then disconnect when doing file provisioning. I can't even get it working with rsync directly, which is weird...
Terraform Version
Terraform Configuration Files
Debug Output
2017-12-18T13:31:13.798Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:13 connecting to TCP connection for SSH
2017-12-18T13:31:16.806Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:16 connection error: dial tcp 10.0.0.11:22: getsockopt: connection refused
2017-12-18T13:31:16.806Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:16 [WARN] retryable error: dial tcp 10.0.0.11:22: getsockopt: connection refused
2017-12-18T13:31:16.806Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:16 [INFO] sleeping for 1s
2017/12/18 13:31:17 [TRACE] dag/walk: vertex "root", waiting for: "provider.null (close)"
2017/12/18 13:31:17 [TRACE] dag/walk: vertex "provider.null (close)", waiting for: "module.cassandra1.null_resource.remote-exec"
2017/12/18 13:31:17 [TRACE] dag/walk: vertex "provisioner.remote-exec (close)", waiting for: "module.cassandra1.null_resource.remote-exec"
2017/12/18 13:31:17 [TRACE] dag/walk: vertex "meta.count-boundary (count boundary fixup)", waiting for: "module.cassandra1.null_resource.remote-exec"
2017-12-18T13:31:17.807Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:17 connecting to TCP connection for SSH
2017-12-18T13:31:17.807Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:17 connection error: dial tcp 10.0.0.11:22: getsockopt: connection refused
2017-12-18T13:31:17.807Z [DEBUG] plugin.terraform: remote-exec-provisioner (internal) 2017/12/18 13:31:17 [WARN] retryable error: dial tcp 10.0.0.11:22: getsockopt: connection refused
Crash Output
Expected Behavior
The remote-exec fails, but running the "same command" from the cli works:
ssh -i /tmp/oci_odc_admin 10.0.0.11 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no
Without the options to ignore host key checking, there is a prompt because this is the first connection. I guess the problem is related to this prompt not being handled in the Go ssh code (communicator.go of the ssh package).
Actual Behavior
Steps to Reproduce
terraform init
terraform apply
Important Factoids
References