vsoch opened this issue 1 year ago
If you need instances not in an autoscaling group, you can use a for_each or count on the module.
If you can put them in an ASG, then use the ASG module.
Okay, let me put this together - so I'd take this part of that spec:
module "ec2_instance" {
source = "../../"
ssh_key_pair = module.aws_key_pair.key_name
vpc_id = module.vpc.vpc_id
subnet = module.subnets.private_subnet_ids[0]
security_groups = [module.vpc.vpc_default_security_group_id]
assign_eip_address = var.assign_eip_address
associate_public_ip_address = var.associate_public_ip_address
instance_type = var.instance_type
security_group_rules = var.security_group_rules
instance_profile = aws_iam_instance_profile.test.name
tenancy = var.tenancy
context = module.this.context
}
And reading about for_each, I would do something like this?
module "ec2_instance" {
source = "../../"
# This would be a 4 node cluster
for_each = toset( ["0", "1", "2", "3"] )
# Where would I specify a name or a hostname?
name = "instance-${each.key}"
ssh_key_pair = module.aws_key_pair.key_name
vpc_id = module.vpc.vpc_id
subnet = module.subnets.private_subnet_ids[0]
security_groups = [module.vpc.vpc_default_security_group_id]
assign_eip_address = var.assign_eip_address
associate_public_ip_address = var.associate_public_ip_address
instance_type = var.instance_type
security_group_rules = var.security_group_rules
instance_profile = aws_iam_instance_profile.test.name
tenancy = var.tenancy
context = module.this.context
}
Or should I just set the instance count to the number that I want? In which case, how would I be able to know their hostnames in advance? Thanks for the help! Sorry, I'm new to this.
The hostname is something you will have to set up yourself, since the instances will pick up the instance ID as the hostname (e.g., i-dsad3424dsfds).
If you are building a cluster that may or may not autoscale, I would just use the ASG module, not this one: https://github.com/cloudposse/terraform-aws-ec2-autoscale-group
If you have a requirement for separate, unique instances that are not that mutable, then you could use this module for that with for_each or count.
The cluster will have our job manager, Flux Framework, and the different node hostnames need to be known at creation time (we don't currently support any concept of scaling), so that's why I was looking at this config! The instances can be separate and don't need to be unique, but I do need to figure out how to, for example, get all the hostnames in the cluster for the main broker instance. As an example, when I do this in Kubernetes I use an Indexed Job, and then I know the hostname is something like:
flux-sample-0.flux-service.flux-operator.svc.cluster.local
And then the broker config just needs to know the shared DNS network (flux-service.flux-operator.svc.cluster.local) and the range of hosts (e.g., flux-sample[0-4]). When we set this up in Terraform with GCP (I didn't create these recipes, so I understand them only superficially), I think we set the hostnames via a variable that goes into metadata for the instance https://github.com/GoogleCloudPlatform/scientific-computing-examples/blob/b6995a84ba084bd55e08d3e09d9a1b8e6715db65/fluxfw-gcp/tf/modules/compute/main.tf#L72-L80 and then during startup we can ping that API to get the exact value. Does that make sense? I'm looking for something similar / simple here - I basically just need a set number of instances (no auto-scaling) that I can somehow retrieve the hostnames for, to write into the cluster configs.
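For what it's worth, AWS has a rough analogue to that GCP metadata pattern. A hedged sketch (using the raw aws_instance resource, since whether this module exposes metadata_options is an assumption; variable names are hypothetical): tag each instance with its name and enable instance-metadata tags, then read the tag back at boot.

resource "aws_instance" "node" {
  count         = 4
  ami           = var.ami_id          # hypothetical variables
  instance_type = var.instance_type
  subnet_id     = module.subnets.private_subnet_ids[0]

  tags = {
    Name = "flux-${count.index}"
  }

  # Expose this instance's tags through the metadata service
  metadata_options {
    instance_metadata_tags = "enabled"
  }
}

A startup script could then recover its own name with:

curl -s http://169.254.169.254/latest/meta-data/tags/instance/Name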
What about this recipe / setup under this same org? I see there is a DNS -> hostname mapping: https://github.com/clouddrove/terraform-aws-ec2/blob/edcac308fa17b8135b7813e643c2f7448306c01e/main.tf#L193
It all depends on what you are doing; I have no idea about your requirements, so I can only assume.
If the software you are using requires a valid domain name record pointing to the instance, then you can use that. If it uses the hostname, then you can add a user data script that sets the hostname in a predictable format:
#!/bin/bash
# Rendered as a Terraform template: $${...} escapes to a literal shell
# variable, while ${namespace} and ${environment} are filled in by Terraform.
instanceid=$(curl -s http://169.254.169.254/latest/meta-data/instance-id | sed 's/i-//g')
hostnamectl set-hostname "cassandra-$${instanceid}.${namespace}-${environment}"
echo "cassandra-$${instanceid}.${namespace}-${environment}" > /etc/hostname
hostname -F /etc/hostname
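On the Terraform side, a script like that is presumably rendered with the built-in templatefile function (a sketch; the template path and the module's user_data input are assumptions):

user_data = templatefile("${path.module}/templates/user_data.sh.tpl", {
  namespace   = var.namespace
  environment = var.environment
})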
Okay, I've been able to use the recipe to deploy two nodes, and I'm looking at the metadata above!
instanceid=$(curl -s http://169.254.169.254/latest/meta-data/instance-id | sed 's/i-//g')
[rocky@ip-172-16-215-227 ~]$ echo $instanceid
0b5058dd4d3be6142
Stupid question - where did you get http://169.254.169.254 from (because it works for me too)? I'm assuming the above would be able to set a hostname based on the instance ID (which I can't control?) from within a single node, but I would not be able to request a set of instance IDs? And if I were to set up the DNS section of the config (that I linked to above), would that be done manually on Route 53 first, and then the instances get a predictable name?
I think the requirements are fairly loose - I just need an IP address / hostname that one instance can see for the other. I am fairly indifferent about how that is accomplished. If we could walk through the logic of one simple way, I'd be really appreciative!
For the above - if we know the instance ID (or some other metadata variable) that is unique to each instance across all instances, this would actually be perfect. I would set up a user data section that runs on each node to generate an updated hostname and add the known hostnames of the other generated nodes. They would be added to /etc/hosts. If we created the equivalent of a headless service, then I would still need to know the hostnames for the broker, but wouldn't need to add them to /etc/hosts because they would be discovered via DNS.
Maybe I could try setting user-data and seeing if it shows up in this list:
And then part of the user data could (somehow) be the count (the for_each part?), although I don't fully understand how that relates to the explicit instance_count (or maybe I could pipe that index in?).
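For reference: 169.254.169.254 is the fixed link-local address of the EC2 Instance Metadata Service (IMDS), reachable from inside any instance, and user data is exposed through it as well. A quick sketch of poking at it from a shell on the instance:

# List the available top-level metadata categories
curl -s http://169.254.169.254/latest/meta-data/

# Fetch this instance's ID, and the user data it was launched with
curl -s http://169.254.169.254/latest/meta-data/instance-id
curl -s http://169.254.169.254/latest/user-data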
You can template the user data to pass whatever string you want instead of the instance ID; I use that in my case.
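For instance, combining this with the for_each approach from earlier (a sketch; the template filename, the node_name variable, and the module's user_data input are assumptions):

module "ec2_instance" {
  source   = "../../"
  for_each = toset(["0", "1", "2", "3"])

  # Render a per-instance script; node_name replaces the instance-id lookup
  user_data = templatefile("${path.module}/templates/user_data.sh.tpl", {
    node_name = "flux-${each.key}"
  })

  # ... remaining inputs as in the earlier example
}

with a template along the lines of:

#!/bin/bash
hostnamectl set-hostname "${node_name}"
echo "${node_name}" > /etc/hostname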
Do you need the IPs to be immutable?
They only need to be consistent for the lifecycle of a single cluster deployment - the general design is that we bring it up, use it (knowing the IPs for a single broker in the cluster allows the nodes to see one another), and then throw it away. We can bring up another one later with completely different ones.
I would say use the ASG module instead, if that is the case.
Can you tell me how to get the index of the instance for the launch script? I tried setting for_each, but it doesn't seem to use it - it tries to create the same one twice (and then tells me it already exists). I found this example https://www.middlewareinventory.com/blog/terraform-aws-ec2-user_data-example/ that has a count.index that starts at 1, but I don't understand how that's working.
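For reference, count.index is just the zero-based position of each instance when count is used, and it can be interpolated straight into user data (a sketch with the raw resource; variable names are hypothetical):

resource "aws_instance" "node" {
  count         = 4
  ami           = var.ami_id
  instance_type = var.instance_type

  # count.index runs 0, 1, 2, 3 - one rendered script per instance
  user_data = <<-EOF
    #!/bin/bash
    hostnamectl set-hostname "flux-${count.index}"
  EOF
}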
E.g., I can see there are indices here: module.ec2["2"].aws_iam_instance_profile.default[0]
Even if I could get an index, though, I don't know how to get the instances to see one another. E.g., I manually set one of two to flux-1, but flux-1 cannot see flux-0:
[rocky@flux-0 ~]$ ping flux-1
PING flux-1(flux-1 (fe80::10f9:24ff:feb0:836b%eth0)) 56 data bytes
64 bytes from flux-1 (fe80::10f9:24ff:feb0:836b%eth0): icmp_seq=1 ttl=64 time=0.040 ms
64 bytes from flux-1 (fe80::10f9:24ff:feb0:836b%eth0): icmp_seq=2 ttl=64 time=0.037 ms
64 bytes from flux-1 (fe80::10f9:24ff:feb0:836b%eth0): icmp_seq=3 ttl=64 time=0.034 ms
^C
--- flux-1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2057ms
rtt min/avg/max/mdev = 0.034/0.037/0.040/0.002 ms
[rocky@flux-0 ~]$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
[rocky@flux-0 ~]$ ping flux-0
ping: flux-0: Name or service not known
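One hedged way to sidestep the discovery problem entirely (a sketch with the raw aws_instance resource; the subnet CIDR and all names are assumptions) is to assign known private IPs up front, so every node's hostname and /etc/hosts entries can be rendered before boot:

locals {
  # name => fixed private IP inside the subnet's CIDR (assumed 172.16.100.0/24)
  nodes = { for i in range(4) : "flux-${i}" => "172.16.100.1${i}" }
  hosts = join("\n", [for name, ip in local.nodes : "${ip} ${name}"])
}

resource "aws_instance" "node" {
  for_each      = local.nodes
  ami           = var.ami_id
  instance_type = var.instance_type
  subnet_id     = module.subnets.private_subnet_ids[0]
  private_ip    = each.value

  user_data = <<-EOF
    #!/bin/bash
    hostnamectl set-hostname "${each.key}"
    echo "${local.hosts}" >> /etc/hosts
  EOF
}

Because the name-to-IP map is static, there is no dependency cycle: every node knows every other node's address at plan time.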
> I would say use the ASG module instead, if that is the case.
Is the autoscaling module using Kubernetes? We already have a Flux operator (for Kubernetes), and we are looking to deploy on bare VMs, hence this approach!
The ASG module is for instances only.
Do you know what this tag is for?
Okay, I figured out how to get for_each working, and with custom variables. If I do:
locals {
  multiple_instances = {
    one = {
      instance_type = "m4.large"
    }
    two = {
      instance_type = "m4.large"
    }
    three = {
      instance_type = "m4.large"
    }
  }
}
then in the block we were talking about earlier:

for_each = local.multiple_instances

and then I have ${each.key} available as a variable (e.g., to set a hostname). But I run into an error that the instance profile was already created, so I turned this off (a possible alternative is sketched just below) - not sure what the implications of that are:

instance_profile_enabled = false
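The collision presumably happens because each for_each instance of the module tries to create an instance profile with the same name. A possible alternative (a sketch, assuming the module's instance_profile input shown in the very first example): create one profile outside the module and share it across all instances, instead of disabling it:

data "aws_iam_policy_document" "ec2_assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "node" {
  name               = "flux-node"
  assume_role_policy = data.aws_iam_policy_document.ec2_assume.json
}

resource "aws_iam_instance_profile" "node" {
  name = "flux-node"
  role = aws_iam_role.node.name
}

# then, inside the module block:
#   instance_profile = aws_iam_instance_profile.node.name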
But that seemed to bring up the machines, and now they at least have unique hostnames! So the last step is figuring out how to get them to see one another... should I try enabling the DNS?
I tried enabling DNS, creating a Route 53 zone to get a zone ID, and changing the hostnames there to match my instances. The records were created correctly, but I can't seem to ping any of them from any instance. What am I missing? Also note I keep seeing this deprecation notice:
╷
│ Warning: Argument is deprecated
│
│ with module.vpc.aws_vpc.default[0],
│ on .terraform/modules/vpc/main.tf line 29, in resource "aws_vpc" "default":
│ 29: enable_classiclink = var.enable_classiclink
│
│ With the retirement of EC2-Classic the enable_classiclink attribute has been deprecated and will be removed in a future
│ version.
│
│ (and one more similar warning elsewhere)
Going to try the autoscale recipe now - I might be epically failing but at least I'm learning little bits along the way! :laughing:
Okay, one issue for the other repos (let me know if you want me to report it there or if it belongs with a submodule):
│ Error: Error in function call
│
│ on .terraform/modules/subnets/outputs.tf line 53, in output "nat_ips":
│ 53: value = coalescelist(aws_eip.default.*.public_ip, aws_eip.nat_instance.*.public_ip, data.aws_eip.nat_ips.*.public_ip, list(""))
│ ├────────────────
│ │ while calling list(vals...)
│
│ Call to function "list" failed: the "list" function was deprecated in Terraform v0.12 and is no longer available; use tolist([ ... ])
│ syntax to write a literal list.
This looks promising, but I'm not able to SSH in. It looks like the recipe there takes "security_groups", but that is actually a list of security group IDs (not a spec for the groups themselves). Is it possible to add the creation / association of a security group with port 22 open to the autoscale spec, so I don't have to do it manually?
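In the meantime, a sketch of the manual version (names are assumptions, and the wide-open CIDR would normally be restricted to a known source range):

resource "aws_security_group" "ssh" {
  name   = "allow-ssh"
  vpc_id = module.vpc.vpc_id

  # Allow inbound SSH
  ingress {
    description = "SSH"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow all outbound traffic
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# then pass its ID to the module: security_groups = [aws_security_group.ssh.id]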
oh wait I think I can figure this out!! Be back - will try after dinner. Sorry having fun :)
Okay! (lol, still working on this!) I was able to put the pieces together, and I think I have an instance group plus security groups that actually allow me to SSH in: https://github.com/converged-computing/flux-terraform-ami/pull/1/files#diff-6e45d26e502f88302f69c4c196babd8939186d9cd298f94caca283c128a2d186.
You can read the description of that PR - the next step (and really the final one I need help with) is to understand how to refer to the different nodes on the network, and how to predict their names so I can put them into the user data start script to ensure the broker is ready. I can see that I have a hostname (I haven't done anything in user data yet to change it):
$ hostname
ip-172-16-100-10.ec2.internal
/etc/hosts isn't populated with anything of interest:
[rocky@ip-172-16-100-10 ~]$ cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
But I think the DNS lookup is set up in /etc/resolv.conf:
[rocky@ip-172-16-100-10 ~]$ cat /etc/resolv.conf
# Generated by NetworkManager
search ec2.internal
nameserver 172.16.0.2
So TL;DR: I am hopeful that if we can figure out setting a unique hostname (some count or index from your autoscale module), I can set that, figure out how to refer to it so DNS resolves it, and then write it predictably into the Flux broker config file (and everything should work if the networking is good!). And hopefully with this config there is something more concrete for you to see and work with! I also found https://github.com/meltwater/terraform-aws-asg-dns-handler in case that gives us hints about the hostnames.
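For the fixed-size (non-ASG) case, a hedged sketch of the DNS variant (the zone name is an assumption, and local.nodes is the name-to-IP map from the earlier static-IP sketch): a private Route 53 zone attached to the VPC, with one A record per node, lets the instances resolve each other without touching /etc/hosts:

resource "aws_route53_zone" "cluster" {
  name = "flux.internal"

  # Private hosted zone, visible only inside this VPC
  vpc {
    vpc_id = module.vpc.vpc_id
  }
}

resource "aws_route53_record" "node" {
  for_each = local.nodes
  zone_id  = aws_route53_zone.cluster.zone_id
  name     = each.key              # e.g. flux-0.flux.internal
  type     = "A"
  ttl      = 300
  records  = [each.value]
}

Note that the VPC needs DNS support and DNS hostnames enabled for private-zone resolution to work.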
I think this could work if we are able to define a lifecycle and use that module I pointed out... still trying.
Is there any way I can reference an index in the user data? I'm really struggling to get anything working.
Describe the Feature
I'd like to know how to modify this example https://github.com/cloudposse/terraform-aws-ec2-instance/blob/master/examples/complete/main.tf for multiple EC2 instances.
Expected Behavior
NA
Use Case
Bringing up a small networked cluster
Describe Ideal Solution
A similar example in the examples folder for multiple networked EC2 instances from a custom AMI.
Alternatives Considered
No response
Additional Context
No response