ansible-collections / community.general

Ansible Community General Collection
https://galaxy.ansible.com/ui/repo/published/community/general/
GNU General Public License v3.0

Long Terraform Scripts timeout when called from Ansible, but not when called directly #3198

Closed fdervisi closed 3 years ago

fdervisi commented 3 years ago

Summary

I have a multi-region AWS deployment that takes 8-12 minutes. When I use Ansible to run terraform apply, it always fails for large multi-region deployments:

- name: Execute Terraform apply
  community.general.terraform:
    project_path: "{{ project_path }}"
    state: present
    force_init: true
  register: terraform

When I issue "terraform apply" manually it always works. How can I get rid of this timing issue?

Issue Type

Bug Report

Component Name

terraform

Ansible Version

ansible [core 2.11.3] 
  config file = None
  configured module search path = ['/home/ec2-user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /home/ec2-user/.local/lib/python3.7/site-packages/ansible
  ansible collection location = /home/ec2-user/.ansible/collections:/usr/share/ansible/collections
  executable location = /home/ec2-user/.local/bin/ansible
  python version = 3.7.10 (default, Jun  3 2021, 00:02:01) [GCC 7.3.1 20180712 (Red Hat 7.3.1-13)]
  jinja version = 3.0.1
  libyaml = True

Configuration


resource "aws_instance" "linux_host" {
  count                       = var.tgw ? 0 : 1
  ami                         = data.aws_ami.amazon_linux_2.id
  instance_type               = "t2.micro"
  key_name                    = var.key_pair
  associate_public_ip_address = true
  vpc_security_group_ids      = [var.sg_sdwan_lan0]
  subnet_id                   = aws_subnet.lan.id
  private_ip                  = var.lan_linux_ip
  tags = {
    Name  = var.linux_hostname,
    Owner = "Fatos"
  }

  provisioner "remote-exec" {
    inline = [
      "sudo ip route add 10.0.0.0/8 via ${var.lan_sdwan_ip} dev eth0",
      "sudo ip route add 10.${var.region_index}.0.2/32 via 10.${var.region_index}.${var.site}.1 dev eth0"
    ]
  }

  connection {
    type        = "ssh"
    user        = "ec2-user"
    password    = ""
    private_key = file(var.private_key_path)
    host        = self.public_ip
  }
}

OS / Environment

CentOS 7, AWS Linux

Steps to Reproduce

Expected Results

Terraform should work when called from Ansible

Actual Results

  fatal: [localhost]: FAILED! => changed=false 
  cmd: /usr/bin/terraform apply -no-color -input=false -auto-approve -lock=true /tmp/tmp4ohxifuk.tfplan
  invocation:
    module_args:
      backend_config: null
      backend_config_files: null
      binary_path: null
      check_destroy: false
      force_init: true
      init_reconfigure: false
      lock: true
      lock_timeout: null
      overwrite_init: true
      plan_file: null
      plugin_paths: null
      project_path: terraform_aws/
      purge_workspace: false
      state: present
      state_file: null
      targets: []
      variables: null
      variables_files: null
      workspace: default
  msg: |2-

    Error: remote-exec provisioner error

      with module.sdwan_site3.aws_instance.linux_host[0],
      on module/sdwan_instance/main.tf line 139, in resource "aws_instance" "linux_host":
     139:     provisioner "remote-exec" {

    timeout - last error: dial tcp 18.197.60.80:22: i/o timeout

    Error: remote-exec provisioner error

      with module.sdwan_site6.aws_instance.linux_host[0],
      on module/sdwan_instance/main.tf line 139, in resource "aws_instance" "linux_host":
     139:     provisioner "remote-exec" {

    timeout - last error: dial tcp 18.230.71.9:22: i/o timeout

    Error: remote-exec provisioner error

      with module.sdwan_site2.aws_instance.linux_host,
      on module/sdwan_router_instance/main.tf line 143, in resource "aws_instance" "linux_host":
     143:     provisioner "remote-exec" {

    timeout - last error: dial tcp 18.193.89.25:22: i/o timeout

    Error: remote-exec provisioner error

      with module.sdwan_site5.aws_instance.linux_host,
      on module/sdwan_router_instance/main.tf line 143, in resource "aws_instance" "linux_host":
     143:     provisioner "remote-exec" {

    timeout - last error: dial tcp 18.229.132.179:22: i/o timeout
  rc: 1
  stderr: |2-

    Error: remote-exec provisioner error

      with module.sdwan_site3.aws_instance.linux_host[0],
      on module/sdwan_instance/main.tf line 139, in resource "aws_instance" "linux_host":
     139:     provisioner "remote-exec" {

    timeout - last error: dial tcp 18.197.60.80:22: i/o timeout

    Error: remote-exec provisioner error

      with module.sdwan_site6.aws_instance.linux_host[0],
      on module/sdwan_instance/main.tf line 139, in resource "aws_instance" "linux_host":
     139:     provisioner "remote-exec" {

    timeout - last error: dial tcp 18.230.71.9:22: i/o timeout

    Error: remote-exec provisioner error

      with module.sdwan_site2.aws_instance.linux_host,
      on module/sdwan_router_instance/main.tf line 143, in resource "aws_instance" "linux_host":
     143:     provisioner "remote-exec" {

    timeout - last error: dial tcp 18.193.89.25:22: i/o timeout

    Error: remote-exec provisioner error

      with module.sdwan_site5.aws_instance.linux_host,
      on module/sdwan_router_instance/main.tf line 143, in resource "aws_instance" "linux_host":
     143:     provisioner "remote-exec" {

    timeout - last error: dial tcp 18.229.132.179:22: i/o timeout

ansibullbot commented 3 years ago

cc @m-yosefpor @rainerleber

felixfontein commented 3 years ago

Could you please elaborate on what exactly times out? Ansible? The SSH connection? Some other random thing?

This here:

Actual Results

dfdf

is quite likely not the output you received.

fdervisi commented 3 years ago

I just updated the error output. It seems that on the first run I get a timeout from the provisioner. If I rerun the playbook, everything works; this only happens when I have a long-running Terraform process.

I have network connectivity to the hosts, but somehow it does not work on the first run with ansible-playbook.

rainerleber commented 3 years ago

@fdervisi this seems to be a problem on the provisioner/Terraform or infrastructure side, not a problem with the Ansible terraform module. You can see it, for example, here: 'timeout - last error: dial tcp 18.230.71.9:22: i/o timeout'

fdervisi commented 3 years ago

Indeed, I verified it again and it looks like Terraform is the culprit. So we can close the issue.
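
For readers who hit the same symptom: the `dial tcp ...:22: i/o timeout` lines in the output above come from the `remote-exec` provisioner giving up while it waits for SSH to become reachable. One possible mitigation, offered as an assumption and not as a confirmed fix for this report, is to raise the `timeout` argument of the `connection` block, which Terraform documents as defaulting to 5 minutes. The fragment below reuses the arguments from the configuration in the report; the `10m` value is purely illustrative.

```hcl
# Fragment of the connection block inside aws_instance.linux_host.
connection {
  type        = "ssh"
  user        = "ec2-user"
  private_key = file(var.private_key_path)
  host        = self.public_ip

  # Assumed value: how long Terraform keeps retrying the SSH connection
  # before the provisioner fails. The documented default is "5m".
  timeout = "10m"
}
```

A longer timeout may help if the instances are simply slow to accept SSH during a large apply. It will not help if the host stays unreachable for another reason, such as a security group or route that is only created later in the run.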