hashicorp / terraform

Terraform enables you to safely and predictably create, change, and improve infrastructure. It is a source-available tool that codifies APIs into declarative configuration files that can be shared amongst team members, treated as code, edited, reviewed, and versioned.
https://www.terraform.io/

The self Object - It is possible to reference PARENT RESOURCE by NAME #32523

Open adi-garg opened 1 year ago

adi-garg commented 1 year ago

Terraform Version

Terraform v1.3.6

Affected Pages

1. https://developer.hashicorp.com/terraform/language/resources/provisioners/syntax#the-self-object
2. https://developer.hashicorp.com/terraform/language/resources/provisioners/connection

What is the docs issue?

1. SELF-OBJECT: The lines:

Expressions in provisioner blocks cannot refer to their parent resource by name. Instead, they can use the special self object. The self object represents the provisioner's parent resource, and has all of that resource's attributes. For example, use self.public_ip to reference an aws_instance's public_ip attribute.

AND

Technical note: Resource references are restricted here because references create dependencies. Referring to a resource by name within its own block would create a dependency cycle.

seem to indicate that it is not possible to use something like

command = "echo ${aws_instance.myec2.private_ip} >> private_ips.txt"

inside a provisioner block, that is, with a reference to the provisioner's own resource. The 'self' object is suggested instead.

In fact, the code below works:


resource "aws_instance" "myec2" {
  ami           = "ami-074dc0a6f6c764218"
  instance_type = "t2.micro"
  tags = {
    Name = "my-ec2"
  }

  provisioner "local-exec" {
    command = "echo ${aws_instance.myec2.private_ip} >> private_ips.txt & echo ${aws_instance.myec2.availability_zone} >> avail.txt"
  }
}

The documentation itself also makes such references, for example: "consul join ${aws_instance.web.private_ip}".

2. CONNECTION DOCUMENT:

Again, the following lines may be flawed:

The self Object

Expressions in connection blocks cannot refer to their parent resource by name. References create dependencies, and referring to a resource by name within its own block would create a dependency cycle. Instead, expressions can use the self object, which represents the connection's parent resource and has all of that resource's attributes. For example, use self.public_ip to reference an aws_instance's public_ip attribute.

The code below works well when it refers to its own parent resource by name:

connection {
  type        = "ssh"
  user        = "ec2-user"
  private_key = file("./tf-key.pem")
  #host       = self.public_ip
  host        = aws_instance.myec2.public_ip
}

/*
NOTE:
1. Make sure the PEM key with the right name is present in the directory.
2. Change the permissions of the key: remove inheritance, and remove access for all users except the user logged into the Windows machine from which the terraform commands are run.
*/

resource "aws_instance" "myec2" {
  ami           = "ami-074dc0a6f6c764218"
  instance_type = "t2.micro"
  key_name      = "tf-key"

  ##associate interface to Security Group (reference NAME not ID as a list)
  security_groups = [aws_security_group.my-sg.name]

  connection {
    type        = "ssh"
    user        = "ec2-user"
    private_key = file("./tf-key.pem")
    #host       = self.public_ip
    host        = aws_instance.myec2.public_ip
  }

  provisioner "remote-exec" {
    inline = [
      "sudo amazon-linux-extras install -y nginx1",
      "sudo systemctl start nginx"
    ]
  }
}

##creation of security group
resource "aws_security_group" "my-sg" {
  name        = "my-sg"
  description = "Allow HTTP and SSH Inbound/Allow Internet Outbound"
  #Without vpc_id ->defaults to region default VPC which contains Internet Gateway also
  ingress {
    #For launching browser session into EC2(http)
    description      = "Allow traffic to dst port 80 inbound"
    from_port        = 80
    to_port          = 80
    protocol         = "tcp"
    cidr_blocks      = ["0.0.0.0/0"]

  }
  ingress {
    #For launching SSH session into EC2 (needed by terraform in order to execute cmds on EC2)
    description      = "Allow traffic to dst port 22 inbound"
    from_port        = 22
    to_port          = 22
    protocol         = "tcp"
    cidr_blocks      = ["0.0.0.0/0"]

  }
/*By default, AWS creates an ALLOW ALL egress rule when creating a new Security Group inside of a VPC. When creating a new Security Group inside a VPC, Terraform will remove this default rule, and require you specifically re-create it if you desire that rule.
*/
  egress {
    #For downloading packages from Internet ie initiate a request to Internet outbound
    description      = "Allow traffic to dst port ANY outbound"
    from_port        = 0
    to_port          = 0
    protocol         = "-1" # -1 means all protocols, not only tcp/udp

    #NOTE: there seem to be connection issues for package download if only tcp 443 is allowed outbound, hence allowing any protocol outbound

    cidr_blocks      = ["0.0.0.0/0"]

  }

  tags = {
    Name = "my-sg"
  }
}

Proposal

Both documents need to be modified once the behavior is clarified.

It may be noted that the 'self' object works as an alternative (perhaps this usage can be covered in the provisioners document). Referring to the code snippet above,

command = "echo ${aws_instance.myec2.private_ip} >> private_ips.txt"

can be rewritten using the 'self' object as:

command = "echo ${self.private_ip} >> private_ips.txt"

Similarly, both of the syntaxes below are valid in a 'connection' block (used with 'remote-exec' provisioners):

connection {
  type        = "ssh"
  user        = "ec2-user"
  private_key = file("./tf-key.pem")
  host        = self.public_ip # OR aws_instance.myec2.public_ip
}

It may also help to emphasize that the self object can be used with both local-exec and remote-exec provisioners.
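A minimal sketch of that last point, reusing the aws_instance.myec2 resource from this report. The AMI ID, key name, and key file are the ones used above. The security group is left out for brevity, so this assumes SSH access to the instance is already allowed.

```hcl
resource "aws_instance" "myec2" {
  ami           = "ami-074dc0a6f6c764218"
  instance_type = "t2.micro"
  key_name      = "tf-key"

  # Resource-level connection block, used by the remote-exec provisioner below.
  connection {
    type        = "ssh"
    user        = "ec2-user"
    private_key = file("./tf-key.pem")
    host        = self.public_ip
  }

  # local-exec runs on the machine where terraform is invoked.
  provisioner "local-exec" {
    command = "echo ${self.private_ip} >> private_ips.txt"
  }

  # remote-exec runs on the new instance over the connection above.
  provisioner "remote-exec" {
    inline = [
      "sudo amazon-linux-extras install -y nginx1",
      "sudo systemctl start nginx",
    ]
  }
}
```

Here self appears in the connection block and in the local-exec command, and the remote-exec provisioner relies on it indirectly through the connection.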

References

No response

apparentlymart commented 1 year ago

Thanks for sharing this, @adi-garg.

I think the subtlety that this documentation isn't currently considering is that there's a difference between a resource (the resource block) and a resource instance, which we use to describe the effect of using count and for_each in a resource block.

For a resource that doesn't use count or for_each, self is exactly equivalent to referring to the one and only instance of that resource, because in that case the resource address and the resource instance address are identical.

When a resource uses count or for_each the situation is different, because self in that case refers to the current instance that is being provisioned. For example:

resource "example" "example" {
  count = 2

  connection {
    host = self.public_ip
    # ...
  }
}

In the above, the connection block will be evaluated twice. Each time self will be an alias for a different object: example.example[0] or example.example[1]. In this case it would not be valid to refer to example.example as a whole because that would create a dependency cycle, but it's valid to refer to self because Terraform knows for certain that it's a reference only to the current instance and that connection blocks always get evaluated only as part of applying that instance.

Internally Terraform treats self as literally a shorthand for some specific object in scope, so if it's valid to refer to self then it's also always valid to refer to exactly what self expands to. In the case of a single-instance resource self expands to the resource itself because that is synonymous with the resource's single instance, but for a multi-instance resource self expands to a reference to a particular instance and so referring to the resource as a whole is not valid.
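As a rough illustration of the multi-instance case, here is a sketch using for_each. The resource name aws_instance.web and the keys "a" and "b" are hypothetical, and the AMI ID is borrowed from the report above.

```hcl
resource "aws_instance" "web" {
  # Two instances: aws_instance.web["a"] and aws_instance.web["b"].
  for_each = toset(["a", "b"])

  ami           = "ami-074dc0a6f6c764218"
  instance_type = "t2.micro"

  provisioner "local-exec" {
    # The provisioner is evaluated once per instance. While provisioning
    # aws_instance.web["a"], self stands for that one instance only, so
    # each run appends that instance's own key and private IP.
    command = "echo ${each.key} ${self.private_ip} >> private_ips.txt"
  }
}
```

Unlike the single-instance examples earlier in this issue, there is no single object that aws_instance.web could stand for here, which is why self is the natural way to reach the current instance's attributes.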

There are some other subtleties here with how Terraform treats a constant instance key like example.example[0] differently than a dynamically-chosen one like example.example[count.index], but I think those details are not super important and it's better to focus on describing what self expands to in each case.

Thanks again for reporting this!