hashicorp / terraform

Proposal: port forwarding via SSH tunnel #8367

Open nicolai86 opened 8 years ago

nicolai86 commented 8 years ago

proposal: port forwarding via SSH tunnel

I'd like to start adding port forwarding via SSH tunnels to terraform.

This is useful when you want to use terraform with systems which are only accessible via a jump host, i.e. company-internal systems.

Right now terraform already ships with a bunch of providers which might need to talk to internal systems (e.g. postgres/mysql/influxdb/…).

The status quo is to create an SSH tunnel beforehand, or, in cases where the entire infrastructure is created from scratch, to split the terraform scripts into multiple stages with glue code outside. E.g. one might set up a private cluster with a jump host, open an SSH tunnel via bash, and then run a different terraform script using the newly created tunnel to access private systems, all wrapped in a single setup.sh script.
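
A rough sketch of what that glue usually looks like (directory names, hosts and ports below are made up for illustration):

#!/usr/bin/env bash
# setup.sh - the "status quo" described above: two terraform stages with an
# SSH tunnel opened in between them.
set -euo pipefail

(cd 01-network && terraform apply)    # creates the VPC and the jump host

# forward a local port to the internal consul endpoint through the jump host
ssh -f -N -L 8500:consul.internal.example.com:8500 user@jump-host.example.com

(cd 02-services && terraform apply)   # its consul provider points at localhost:8500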

Assuming that the SSH tunnel is required for all resources of a given provider, I suggest adding connection settings to the terraform providers as well, like this:

provider "consul" {
    address = "localhost:80"
    datacenter = "nyc1"

    # run "ssh -L localhost:80:demo.consul.io:80" for any resources of this provider
    connection {
        user = "private-user"
        host = "private.jump-host.io"

        forward {
            remote_host = "demo.consul.io"
            remote_port = 80
            local_port = 80
        }
    }
}

# Access a key in Consul; consul is only available via SSH tunnel
resource "consul_keys" "app" {
    key {
        name = "ami"
        path = "service/app/launch_ami"
        default = "ami-1234"
    }
}

Looking forward to any feedback before I head off and add something like this to terraform… ;)

Related: #4442, #4775

jen20 commented 8 years ago

Hi @nicolai86! I'm certainly not opposed to this, though I'm not sure exactly what it would look like. Going to cc @phinze or @mitchellh here for a second opinion on this.

jbardin commented 8 years ago

This sounds reasonable to me. My only comment so far would be to name it something like local_forward to align it with the actual type of forwarding being done (-L), and leave room in case we find a need for remote_forward (-R) later on.

apparentlymart commented 8 years ago

This is an interesting approach. I have some feedback, but really just exploring the idea:


Given that the connection doesn't really "belong to" the provider, I wonder if we should hoist it out to the top level, and add some interpolation variables for it like this:

provider "consul" {
    # expands to the local listen address and port that the "connection" created
    address = "${connection.consul_tunnel.local_address}"
    datacenter = "nyc1"
}

connection "consul_tunnel" {
    type = "ssh"
    user = "private-user"
    host = "private.jump-host.io"

    forward {
        remote_host = "demo.consul.io"
        remote_port = 80
        local_port = 80
    }
}

Presumably for real use the user would sometimes need to provide some credentials in the connection block (either a private key or a password), so the ability to interpolate from variables would be useful to avoid hard-coding those credentials in the config.


It could also be nice to make the local port optional and have Terraform just allocate any arbitrary open port and expose it via the interpolation variable, so the user doesn't need to think about what port is likely to be open on all machines where Terraform might be run.


Wondering if maybe it would be more intuitive to invert the nesting, so that the forwarder is the primary object and it takes a connection as part of its configuration, similar to how resources and provisioners work:

port_forward "consul_tunnel" {
    remote_host = "demo.consul.io"
    remote_port = 80

    connection {
        # Now the connection block is the same as in other contexts, as long
        # as the selected connection type supports port forwarding.
        type = "ssh"
        user = "private-user"
        host = "private.jump-host.io"
    }
}

provider "consul" {
    address = "${port_forward.consul_tunnel.local_address}"
}

nicolai86 commented 8 years ago

I think exposing the port forwarding as a primitive is a good idea in terms of reuse between multiple resources, and it might also help with code reuse given that the connection attribute already exists on resources. I'm also hoping for a clean integration into the execution graph.

It seems that the general theme is "this is a worthwhile addition" and the questions are mostly minor details. Since I have no idea at all about the terraform core internals I'll take a deep dive and report back in a couple of days…

apparentlymart commented 8 years ago

@nicolai86 I would suggest giving @phinze and/or @mitchellh a chance to respond since they know Terraform (and its roadmap) the best and are likely to give more detailed feedback. Of course, that doesn't mean you can't dig in and start learning about Terraform core. :grinning:

nicolai86 commented 8 years ago

Don't worry, I just want to start learning about terraform core internals. Did I sound like I was going to go off and build it already? 😅

apparentlymart commented 7 years ago

Reflecting on this a while later...

At work we took the approach of running Terraform on a host within the network it's being deployed to, and running it with an automation tool.

This has been working out really well for us.

So with all of that said, while it'd be great to have a feature like what was proposed here in the long run so that Terraform can be flexible to run in a variety of different environments, in the short term I'd wholeheartedly recommend that folks consider this alternative approach which has worked out very well for us.

AFAIK such a setup is not possible with Atlas today, so I would also suggest that it would be a great feature to be able to use the Atlas UI to control "agents" running within a private network over a secure channel, as an alternative to running Terraform on HashiCorp-run infrastructure; that would then enable the above configuration with Atlas as the orchestration tool.

cakeface commented 7 years ago

I think running Terraform on a server within the VPC is a nice work around for this problem but it has a bootstrapping issue. Where does this server come from initially? Terraform. It means admitting that you have to split your infrastructure management and cannot stand the entire thing up with one run of Terraform.

I also have multiple VPCs that are managed from one Terraform source repository. Applying changes now involves connecting to multiple Terraform nodes and running the updates. And splitting the code out.

All of that is possible, and I can even automate with Fabric or Bash but I don't like adding more tools when Terraform is supposed to be the tool. Also I'm layering scripted automation on top of my very nice declarative automation which just makes me feel a little gross.

For me, I added the SSH tunnel step to a plan and apply shell wrapper for now.
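
Roughly along these lines, using an SSH control socket so the wrapper can tear the tunnel down again afterwards (host names and ports are placeholders):

#!/usr/bin/env bash
# tf.sh - open the tunnel, run whichever terraform command was requested,
# and close the tunnel again on exit (even if terraform fails).
set -euo pipefail

ssh -f -N -M -S /tmp/tf-tunnel.sock \
    -L 5432:db.internal.example.com:5432 user@bastion.example.com
trap 'ssh -S /tmp/tf-tunnel.sock -O exit user@bastion.example.com' EXIT

terraform "$@"

Invoked as ./tf.sh plan or ./tf.sh apply.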

apparentlymart commented 7 years ago

Yes, it is the case that we had to bootstrap the environment from outside and that there is one Terraform config that requires custom effort to apply because it affects the deploy workers themselves. A temporary extra machine booted manually from the same AMI as the deploy workers addresses that problem, but I certainly won't claim that this is super convenient. It's just a compromise that we tolerate because we apply this configuration relatively infrequently compared to the others that deal with our applications themselves.

hingstarne commented 7 years ago

Hi guys, just wanted to add my 5 cents and try to revive this topic. From my perspective, moving the tunnel out of the provider looks smart, but it has a severe disadvantage. If you have a remote-exec or a file copy, the SSH connection is closed after that, so there is nothing to clean up; it simply exits, even if terraform crashes. That would not be the case if you implemented a tunnel this way:

port_forward "consul_tunnel" {
    remote_host = "demo.consul.io"
    remote_port = 80

    connection {
        # Now the connection block is the same as in other contexts, as long
        # as the selected connection type supports port forwarding.
        type = "ssh"
        user = "private-user"
        host = "private.jump-host.io"
    }
}

You need a destructor in the code that can also be triggered. Following this logic, extending the existing connection block and adding it to certain providers or resources would be the safer route to go.

PLaRoche commented 7 years ago

Any progress on this? We have to open an SSH tunnel every time we run terraform as it manages our RDS instances that are private only.

matelang commented 7 years ago

This is a major blocker for us as well.

rata commented 7 years ago

What we are thinking of as a workaround, though of course it doesn't help everyone, is to use a Kubernetes job to run terraform plan/apply.

As it runs in the cluster, it has access to the private resources, and it's easy for everyone to run (using a web interface for Kubernetes) without needing manually set-up tunnels, credentials for those, and so on. The idea is to use a remote tfstate on S3 (or something else).

I'll update if we have the time to go further down this path. But, of course, it will only help people who are also running Kubernetes clusters :)

automaticgiant commented 7 years ago

I mostly just (right now) want to be able to provision a VM with Docker and forward docker.sock, so that Terraform can deploy containers onto it without having to set up a TCP listener (because I won't want it later anyway).
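
Something like this covers that case from outside Terraform today (paths and host name are made up; OpenSSH 6.7+ is assumed, since it can forward Unix sockets):

# forward the remote Docker socket to a local socket over SSH, point the
# Docker provider/CLI at it, and tear the tunnel down afterwards
ssh -nNT -L /tmp/remote-docker.sock:/var/run/docker.sock core@docker-vm.example.com &
TUNNEL_PID=$!

export DOCKER_HOST=unix:///tmp/remote-docker.sock
terraform apply

kill "$TUNNEL_PID"
rm -f /tmp/remote-docker.sock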

fquffio commented 7 years ago

Any progress on this? It's almost a year now… The mentioned Terraform gurus were asked for an opinion but didn't reply. Is this issue abandoned?

Bastion hosts are quite common, and relying on external scripts to create an SSH tunnel before Terraform can operate sucks: it makes the whole process way more complicated since there are more steps you must remember, it makes your project far more difficult to maintain if you have multiple resources that require such a feature (Redis, MySQL, ElasticSearch, Consul, …), and it can be very dangerous if you're working with multiple environments (it's kinda easy to launch terraform apply on dev when you still have your tunnel pointing to the production database, and vice versa). I definitely can't see why this issue is considered so low priority.

apparentlymart commented 7 years ago

Hi @fquffio!

Before I respond I should explain that at the time of my last comments I was an outside open source contributor, but in the meantime I've become a HashiCorp employee working on Terraform.

It is not that this issue is considered low priority, but rather that there are many issues that are all considered important. There remains design work to do to figure out exactly how this will work, and then non-trivial implementation work to get it actually done.

Believe me that I really want to see this feature too, and we'll get there. We're working through the feature request backlog as fast as we can while also keeping up with bug fixes, etc. I understand the frustration and I can only ask for continued patience.

At this time, my hope is to move forward with a configuration structure somewhat like the following, taken from my comment above:

port_forward "consul_tunnel" {
    target_host = "demo.consul.io"
    target_port = 80

    connection {
        # Now the connection block is the same as in other contexts, as long
        # as the selected connection type supports port forwarding.
        type = "ssh"
        user = "private-user"
        host = "private.jump-host.io"
    }
}

provider "consul" {
    address = "${port_forward.consul_tunnel.local_address}"
}

It'll take a little more prototyping to figure out the details of this, such as how we can wire the connection creation and shutdown into the graph, whether the existing connection mechanism can be extended to support tunnels in this way, etc. We'll have more to say here when we are able to complete that prototyping.

spanktar commented 7 years ago

I'm also interested in this and suggest something along these lines: using a connection block inside the provider:

provider "consul" {
  address = "${aws_route53_record.elb_consul.fqdn}"
  datacenter = "dc1"

  connection {
    type = "tunnel"
    host = "${aws_instance.bastion_1.public_ip}"
    port = "8500"
    private_key = "${file("${var.local_ssh_key_path}")}"
    user = "${var.ssh_user}"
  } 
}

ekristen commented 7 years ago

While I like this approach and think it is sensible for the long term, I have to wonder if it would not be easier to take bastion support as it exists today with aws_instance and add it to resources like postgres_database, etc., so that people can start using it today.

Either way, I'm a big +1 for supporting bastion hosts on more resources.

madmod commented 7 years ago

+1 I think this same pattern could be good for supporting VPN access to resources. Having the SSH tunnel be a resource which depends on other resources (like the bastion instance, for example) would solve any ordering issues on the first run.

cloudvant commented 7 years ago

@apparentlymart it is time to fix this. You have been dancing around the issue for too long. Either fix it or close it but you have kept us waiting for too long.

nbering commented 7 years ago

@vmendoza That comment seems a little out of line for a free open source project. If you feel so strongly about it... dig in and write some code.

kwerle commented 7 years ago

I would also request that if/when this is implemented there be a remote_command portion. I specifically want to forward a port to a service that I want to launch as I make the connection.

ssh -L my_port:localhost:target_port host some_service_providing_access_on_target_port

stefansundin commented 6 years ago

Hello everyone. I decided to try to tackle this myself by building a custom provider. And I'm happy to say that I'm quite pleased with the result. It works by declaring a data source (basically, what you want is a local port that you want to be forwarded somewhere via SSH).

While I am sure there are many things that can be improved, what is great about my solution is that it is usable right now.

I'd like to invite everyone who is having this issue to try it out. Here's the repository: https://github.com/stefansundin/terraform-provider-ssh

Please be careful and do not use in production quite yet. If it breaks something you can keep both pieces. :)

As always, suggestions for improvements are welcome! Thanks all!

jaymecd commented 6 years ago

@stefansundin nice! But there is an issue: the tunnel is not recreated on apply - stefansundin/terraform-provider-ssh#1

stefansundin commented 6 years ago

Hey @apparentlymart. I've been trying to figure out the issue that @jaymecd reported, but I couldn't find any good solution. Any chance you could take a quick look and say whether or not it is even solvable (or impossible as of right now). There is more info here: https://github.com/stefansundin/terraform-provider-ssh/issues/1

Thanks!

dangregorysony commented 6 years ago

A simple solution to this and similar issues is to provide an option to use a local OpenSSH client binary instead of Go's native ssh implementation. This would allow us to use ProxyCommand to create whatever kind of tunneling we need. See #4523.

I think the developers of Docker Machine got this one right - they use the local 'ssh' binary if present and only fall back on the native Go crypto/ssh implementation when no binary is available (or is explicitly requested - see https://docs.docker.com/machine/reference/ssh/).

OpenSSH is ubiquitous and highly configurable - is there really any benefit in attempting to re-implement some of its features in Terraform?
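
For illustration, with a real OpenSSH client a couple of lines of ~/.ssh/config (or a one-off ProxyCommand) is all it takes to route connections through a bastion transparently; the host names below are placeholders:

# persistent: route anything destined for the private host through the bastion
# (ProxyJump needs OpenSSH 7.3+; ProxyCommand works on older versions)
cat >> ~/.ssh/config <<'EOF'
Host private-db.internal
    ProxyJump jumpuser@bastion.example.com
EOF

# or ad hoc on the command line:
ssh -o ProxyCommand="ssh -W %h:%p jumpuser@bastion.example.com" admin@private-db.internal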

spanktar commented 6 years ago

Checking back in on this. Would love to see this someday!

Also, I can confirm the plugin works great as a stopgap measure until native support is added! Great work @stefansundin!

trinitronx commented 6 years ago

If you want to use standard tools like curl with a SOCKS5h proxy via SSH tunnel, then you are in luck!

I've found a working solution for docker containers to access services via socks5h://. See my comment & diagram in issue: #17754

This works with the local-exec provisioner! An example use case was that I needed to bootstrap Vault Server CA certificates for use later in other Terraform resources. However, this Vault server was only accessible inside our secure VPC behind a bastion host. Additionally, its DNS name was in a private hosted Route53 zone that is only resolvable from within the VPC, so the socks5h variant of the protocol (which resolves DNS on the proxy side) was important!

For example:

# Note: 172.16.222.111 is the alias IP for the host laptop running terraform in docker container
# See diagram in issue comment for #17754 above for clarification!
resource "null_resource" "vault-web-ca" {
  triggers {
    id = "${uuid()}"
  }
  provisioner "local-exec" {
    command = <<EOF
      ALL_PROXY="socks5h://172.16.222.111:${var.socks_proxy_port}";
      HTTP_PROXY="$${ALL_PROXY}";
      HTTPS_PROXY="$${ALL_PROXY}";
      export ALL_PROXY HTTP_PROXY HTTPS_PROXY;
      echo '${data.aws_ssm_parameter.vault-ca-crt.value}' > /tmp/vault-ca.crt && \
      sync && \
      curl -s -k -o - https://vault-${var.env}.${local.private_dns_zone_name}/v1/ca/web/ca_chain > ${path.module}/generated/vault-web-ca-chain.crt && \
      curl -s --cacert /tmp/vault-ca.crt -o - https://vault-${var.env}.${local.private_dns_zone_name}/v1/ca/web/ca/pem > ${path.module}/generated/vault-web-ca.crt && \
      sync
EOF
  }
}

# Now we can read in these generated cert files and use them later in Terraform
data "local_file" "vault-web-ca-chain" {
  depends_on = ["null_resource.vault-web-ca"]
  filename = "${path.module}/generated/vault-web-ca-chain.crt"
}

data "local_file" "vault-web-ca" {
  depends_on = ["null_resource.vault-web-ca"]
  filename = "${path.module}/generated/vault-web-ca.crt"
}

The current problem is that Terraform itself does not support socks5h://. This is possibly due to an upstream bug in Golang regarding socks5h:// support in x/net/proxy (golang/go#13454). If this is ever fixed, perhaps Terraform providers and code that uses standard x/net/proxy library will just work!
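
For reference, the SOCKS proxy that the local-exec above points at can be opened with a plain dynamic forward through the bastion (the bastion host name below is a placeholder, and SOCKS_PROXY_PORT corresponds to var.socks_proxy_port):

# open a SOCKS5 proxy on the host alias IP via the bastion (-D = dynamic forward);
# anything honouring ALL_PROXY=socks5h://... then resolves private DNS names remotely
ssh -f -N -D "172.16.222.111:${SOCKS_PROXY_PORT}" bastion-user@bastion.example.com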

gliptak commented 5 years ago

https://go-review.googlesource.com/c/net/+/156517/ was merged

rgarrigue commented 4 years ago

Here's my workaround for this, in case it helps. I'm using Terragrunt, so here's the terragrunt.hcl; a hook is enough to keep it in the workflow without changing any habits.

Note: I tried various combinations of nohup, (fork), ((double fork)), and &; only screen did the trick.

include {
  path = find_in_parent_folders()
}

terraform {
  # source = "git::git@github.com:terraform-aws-modules/terraform-aws-rds.git//modules/db_instance?ref=v2.5.0"
  source = "."

  before_hook "open_tunnel_through_bastion" {
    commands     = ["plan", "apply", "show", "destroy"]
    execute      = ["screen", "-d", "-m", "ssh", "-L", "12345:${dependency.instance.outputs.this_db_instance_address}:${dependency.instance.outputs.this_db_instance_port}", dependency.bastion.outputs.hostname, "sleep", "60"]
  }
}

dependency "bastion" {
  config_path = "../../../bastion/"
  mock_outputs = {
    hostname = "localhost"
  }
}

dependency "instance" {
  config_path = "../../instance/"
  mock_outputs = {
    this_db_instance_address  = "localhost"
    this_db_instance_port     = 12345
    this_db_instance_username = "mockup_user"
  }
}

inputs = {
  host = "localhost"
  port = "12345"

  postgres_user     = dependency.instance.outputs.this_db_instance_username
  postgres_password = "REDACTED"

  db_name       = "REDACTED"
  db_password   = "REDACTED"
  db_extensions = ["uuid-ossp", "pgcrypto"]
}

Note: at some point (I don't remember exactly why) I had to move the command into a script and call that from execute.

binlab commented 4 years ago

Any progress here with this?

RobRoseKnows commented 4 years ago

I'm wondering what the hold-up on this is; I found this through #4775 and it's nearly 4 years old. Doesn't similar code to accomplish this already exist in provisioners? Or are there other blockers?

JnMik commented 4 years ago

That feature would help a lot with readability when Terraform is interacting with private cloud resources. I currently use a null_resource with local-exec to spawn proxies through my bastion host, but it's definitely not clean code.

@rgarrigue Thanks for sharing your way, this looks clean; I think I'll give it a try. @apparentlymart Do you know if this idea is still in sight?

Also, has anyone thought about whether it would be worthwhile for this feature to also cover forwarding ports to a Kubernetes container directly? Like using kubectl port-forward in the background.

It could be nice to use the Terraform database, Consul, or Vault providers when they are running inside Kubernetes.
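
Roughly like this (namespace, service name, and port are made up):

# forward the in-cluster Vault service to a local port in the background,
# run Terraform against it, then stop the forwarder
kubectl -n vault port-forward svc/vault 8200:8200 >/dev/null 2>&1 &
FORWARD_PID=$!
export VAULT_ADDR=http://127.0.0.1:8200

terraform apply

kill "$FORWARD_PID"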

apparentlymart commented 4 years ago

This is still a possible future feature, but at this time nobody on the Terraform team at HashiCorp is working on this due to priorities being elsewhere.

In the meantime, we hear that users are employing some other strategies to address this problem within Terraform's current capabilities:

  • Run Terraform on a system on the other side of the bastion, which has direct network access to the services Terraform is managing.
  • Use a general IP VPN rather than SSH tunnel to give the system running Terraform access to the services it will manage, so that the indirection is invisible to individual applications like Terraform.

jcrsilva commented 4 years ago

@apparentlymart the problem with that strategy is that it requires a VPN to already be set up before a Terraform run, or for the connection to be set up during the run. That's fine if you already have existing infrastructure, but in my specific case I would like to be able to provision, and destroy, everything from scratch. That includes any VPN servers; and using a managed VPN solution, while possible with some local-exec magic to connect to it, would be costly in the long run, since Terraform doesn't support (by design) creating and destroying an ephemeral resource in the same run.

I'm going to investigate possible workarounds.

hongkongkiwi commented 3 years ago

This is still a possible future feature, but at this time nobody on the Terraform team at HashiCorp is working on this due to priorities being elsewhere.

In the meantime, we hear that users are employing some other strategies to address this problem within Terraform's current capabilities:

  • Run Terraform on a system on the other side of the bastion, which has direct network access to the services Terraform is managing.
  • Use a general IP VPN rather than SSH tunnel to give the system running Terraform access to the services it will manage, so that the indirection is invisible to individual applications like Terraform.

Neither of these solutions is compatible with Terraform Cloud, which is a major downside. A Terraform-native option (e.g. simply the option to provide an SSH private key, SSH user, and host to first initiate the connection) would be really great.

ion1 commented 3 years ago

Now that Boundary is a thing, could Terraform support connecting to services through it?

djakielski commented 3 years ago

This is still a possible future feature, but at this time nobody on the Terraform team at HashiCorp is working on this due to priorities being elsewhere.

In the meantime, we hear that users are employing some other strategies to address this problem within Terraform's current capabilities:

* Run Terraform on a system on the other side of the bastion, which has direct network access to the services Terraform is managing.

* Use a general IP VPN rather than SSH tunnel to give the system running Terraform access to the services it will manage, so that the indirection is invisible to individual applications like Terraform.

This isn't a working solution for the approach of accessing a private Kubernetes service. I need a solution where I can use kubectl port-forward with a random local port during the plan or apply phase, and where the kubectl process is stopped after Terraform finishes.

As a workaround, a null_resource with local-exec could work. But will it terminate the process when it finishes?

jcrsilva commented 3 years ago

@djakielskiadesso either it will terminate the process, or it will time out and error out. If you're interested in going that route you need to find a way to persist the process, for example with a systemd service.
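
Something along these lines keeps the forwarder alive outside the local-exec process tree and lets you stop it explicitly afterwards (unit name and command are illustrative):

# run the port-forward as a transient user-level systemd unit so it outlives
# the local-exec shell, then stop it once terraform is done
systemd-run --user --unit=tf-port-forward kubectl -n vault port-forward svc/vault 8200:8200

terraform apply

systemctl --user stop tf-port-forward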

WhyNotHugo commented 3 years ago

I'm using this as a workaround for now:

resource "random_integer" "ssh_port" {
  min = "10000"
  max = "60000"
}

resource "null_resource" "ssh_port_forward" {
  provisioner "local-exec" {
    command     = file("ssh-port-forward.sh")
    interpreter = ["bash", "-c"]
    environment = {
      INSTANCE_ID = aws_instance.JumpHost.public_ip
      USERNAME    = "myusername"
      RANDOM_PORT = random_integer.ssh_port.result
      TARGET      = "rabbitmq.mydomain:15672"
    }
  }
}

provider "rabbitmq" {
  endpoint = "http://127.0.0.1:${random_integer.ssh_port.result}"
  username = "guest"
  password = "guest"
}

The referenced script is this:

#!/usr/bin/env bash
#
# From https://dev.to/jaysonsantos/using-terraform-s-remote-exec-provider-with-aws-ssm-5po

set -ex
test -n "$INSTANCE_ID" || (echo missing INSTANCE_ID; exit 1)
test -n "$USERNAME"    || (echo missing USERNAME; exit 1)
test -n "$RANDOM_PORT" || (echo missing RANDOM_PORT; exit 1)
test -n "$TARGET"      || (echo missing TARGET; exit 1)

set +e

cleanup() {
    cat log.txt
    rm -rf log.txt
    exit "$1"  # exit with the status passed by the caller
}

for try in {0..25}; do
    echo "Trying to port forward retry #$try"
    # The following command MUST NOT print to the stdio otherwise it will just
    # inherit the pipe from the parent process and will hold terraform's lock
    ssh -f -o StrictHostKeyChecking=no \
        -o ControlMaster=no \
        "$USERNAME@$INSTANCE_ID" \
        -L "127.0.0.1:$RANDOM_PORT:$TARGET" \
        sleep 1h &> log.txt  # This is the special ingredient!
    success="$?"
    if [ "$success" -eq 0 ]; then
        cleanup 0
    fi
    sleep 5s
done

echo "Failed to start a port forwarding session"
cleanup 1

The main issue here is that the SSH connection is not re-created automatically once it closes. I've worked around this by using terraform taint null_resource.ssh_port_forward. I do believe an external data source might work best though (especially since it can handle the reconnect-if-needed part).

I do think that if Terraform implemented this natively, a resource makes the most sense (especially since you can still use things like random_integer.ssh_port for the local port number).

This uses ssh on the host, and expects that you've added the key to the agent beforehand. I do feel more comfortable with this approach, though it would be fine if the in-Terraform implementation had an optional parameter to provide a key file (you can keep the key in a variable with git-crypt or whatever you prefer).

gnom7 commented 3 years ago

@WhyNotHugo

The main issue here is that the SSH connection is not re-created automatically once it closes. I've worked around this by using terraform taint null_resource.ssh_port_forward

Have you tried triggers?

resource "null_resource" "ssh_port_forward" {
  triggers = {
    always = timestamp()
  }
  ...
}

zioalex commented 3 years ago

Looks like there is common agreement on this, yet even after 5 years nothing has happened. This is a pity because it could help in various enterprise use cases.

ztripez commented 3 years ago

Being able to port forward (with ssh and kubectl) would make bootstrapping so much easier for us.

pjanuario commented 3 years ago

I am also looking for a workaround, ideally using kubectl port-forward since CI will run on the k8s cluster, but it would be handy to be able to execute terraform locally.

sohel2020 commented 3 years ago

I'm using this as a workaround for now: […] The main issue here is that the SSH connection is not re-created automatically once it closes.

@WhyNotHugo isn't it failing when you run terraform plan? When you run plan it doesn't create the tunnel, but Terraform still tries to run the plan against RabbitMQ.

WhyNotHugo commented 3 years ago

@sohel2020 Yeah, and there have been other issues too. I've (sadly) resorted back to just running the tunnel manually when I need to update those resources.

If someone can think of a mechanism that makes sense using existing APIs, I can try hacking something together, but so far I don't have a full idea of how to implement something good enough.

jaysonsantos commented 3 years ago

Hey folks, I also had the same issue and created a provider [1] [2] to try and tackle it. Would you mind giving it a try? I wrote a bit about it here as well [3].

[1] https://github.com/jaysonsantos/terraform-provider-jumphost [2] https://registry.terraform.io/providers/jaysonsantos/jumphost [3] https://dev.to/jaysonsantos/connecting-to-services-that-require-jumphost-from-terraform-1aog

WhyNotHugo commented 3 years ago

Nice, the approach is pretty smart. I had a quick look at the code and all looks sane. However, I'm always getting a "connection refused":

╷
│ Error: error detecting capabilities: error PostgreSQL version: dial tcp [::1]:36843: connect: connection refused
│
│   with postgresql_database.django["staging"],
│   on postgres.tf line 17, in resource "postgresql_database" "django":
│   17: resource "postgresql_database" "django" {
│
╵

Authentication seems to be fine though, since providing a bogus username gives another error (that clearly indicates that auth has failed).

Any ideas?

jaysonsantos commented 3 years ago

Hey @WhyNotHugo, that looks like a race condition where the DB tried to connect before the tunnel had the port open. Do you mind opening an issue with a minimal Terraform file to test? Thank you!

seanamos commented 2 years ago

For those still struggling with this, I'd like to point out a module that has a clever solution: https://github.com/flaupretre/terraform-ssh-tunnel. I've been using it with great success; it works with plan/apply/destroy etc. and cleans up after itself.

Since we no longer use SSH and use AWS SSM instead, I was also able to adapt the module fairly easily to use the aws cli to start SSM sessions that forward ports.
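
For anyone curious, the SSM equivalent of an ssh -L tunnel looks roughly like this (instance ID, host and ports are placeholders, and the session-manager-plugin must be installed):

# forward local port 15432 to a database that is reachable from the instance, via SSM
aws ssm start-session \
  --target i-0123456789abcdef0 \
  --document-name AWS-StartPortForwardingSessionToRemoteHost \
  --parameters '{"host":["db.internal.example.com"],"portNumber":["5432"],"localPortNumber":["15432"]}'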