Closed ianhundere closed 2 years ago
plans dev
Acquiring state lock. This may take a few moments...
data.aws_subnet.subnet_id_a: Reading...
data.aws_subnet.subnet_id_c: Reading...
data.aws_autoscaling_group.selected: Reading...
data.aws_subnet.subnet_id_b: Reading...
data.aws_launch_template.selected: Reading...
data.aws_elb.selected: Reading...
data.aws_subnet.subnet_id_b: Read complete after 0s [id=subnet-8d3dd6e9]
data.aws_subnet.subnet_id_c: Read complete after 0s [id=subnet-2c4e176a]
data.aws_subnet.subnet_id_a: Read complete after 0s [id=subnet-d6f512a0]
data.aws_launch_template.selected: Read complete after 1s [id=lt-0dde36914873ae8b6]
data.aws_autoscaling_group.selected: Read complete after 1s [id=dsva-vagov-dev-deployment-vagov-dev-vets-api-server-20220810-194214-asg]
data.aws_elb.selected: Read complete after 1s [id=dsva-vagov-dev-vets-api-elb]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# aws_autoscaling_group.vets-api-server will be created
+ resource "aws_autoscaling_group" "vets-api-server" {
+ arn = (known after apply)
+ availability_zones = (known after apply)
+ default_cooldown = (known after apply)
+ desired_capacity = (known after apply)
+ force_delete = false
+ force_delete_warm_pool = false
+ health_check_grace_period = 120
+ health_check_type = "ELB"
+ id = (known after apply)
+ max_size = 6
+ metrics_granularity = "1Minute"
+ min_size = 3
+ name = "dsva-vagov-dev-deployment-vagov-dev-vets-api-server-20220810-194214-asg-evss-bgs-split"
+ name_prefix = (known after apply)
+ protect_from_scale_in = false
+ service_linked_role_arn = (known after apply)
+ termination_policies = [
+ "OldestLaunchTemplate",
]
+ vpc_zone_identifier = [
+ "subnet-2c4e176a,subnet-d6f512a0,subnet-8d3dd6e9",
]
+ wait_for_capacity_timeout = "10m"
+ instance_refresh {
+ strategy = "Rolling"
+ preferences {
+ min_healthy_percentage = 50
+ skip_matching = false
}
}
+ launch_template {
+ id = "lt-0dde36914873ae8b6"
+ name = (known after apply)
+ version = "$Default"
}
}
# aws_elb.vets-api-server will be created
+ resource "aws_elb" "vets-api-server" {
+ arn = (known after apply)
+ availability_zones = (known after apply)
+ connection_draining = true
+ connection_draining_timeout = 30
+ cross_zone_load_balancing = false
+ desync_mitigation_mode = "defensive"
+ dns_name = (known after apply)
+ id = (known after apply)
+ idle_timeout = 120
+ instances = (known after apply)
+ internal = true
+ name = "dsva-vagov-dev-vets-api-bgs"
+ security_groups = [
+ "sg-9eaa28f9",
]
+ source_security_group = (known after apply)
+ source_security_group_id = (known after apply)
+ subnets = [
+ "subnet-2c4e176a",
+ "subnet-8d3dd6e9",
+ "subnet-d6f512a0",
]
+ tags_all = {
+ "application" = "vets-api"
+ "environment" = "dev"
+ "managed_by" = "Terraform"
+ "purpose" = "mitigate latency issues as per https://dsva.slack.com/archives/C03STQZ40DQ"
+ "repo" = "https://github.com/department-of-veterans-affairs/vsp-infra-evss-bgs-split"
}
+ zone_id = (known after apply)
+ health_check {
+ healthy_threshold = 3
+ interval = 30
+ target = "HTTP:3004/"
+ timeout = 5
+ unhealthy_threshold = 2
}
+ listener {
+ instance_port = 3004
+ instance_protocol = "HTTP"
+ lb_port = 3004
+ lb_protocol = "HTTP"
}
}
Plan: 2 to add, 0 to change, 0 to destroy.
staging
> tf plan
Acquiring state lock. This may take a few moments...
data.aws_subnet.subnet_id_a: Reading...
data.aws_subnet.subnet_id_c: Reading...
data.aws_elb.selected: Reading...
data.aws_subnet.subnet_id_b: Reading...
data.aws_autoscaling_group.selected: Reading...
data.aws_launch_template.selected: Reading...
data.aws_subnet.subnet_id_a: Read complete after 0s [id=subnet-70f51206]
data.aws_subnet.subnet_id_c: Read complete after 0s [id=subnet-b84e17fe]
data.aws_subnet.subnet_id_b: Read complete after 0s [id=subnet-cc3cd7a8]
data.aws_autoscaling_group.selected: Read complete after 0s [id=dsva-vagov-staging-deployment-vagov-staging-vets-api-server-20220810-194219-asg]
data.aws_launch_template.selected: Read complete after 1s [id=lt-0301a333d650b33b1]
data.aws_elb.selected: Read complete after 1s [id=dsva-vagov-staging-vets-api-elb]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# aws_autoscaling_group.vets-api-server will be created
+ resource "aws_autoscaling_group" "vets-api-server" {
+ arn = (known after apply)
+ availability_zones = (known after apply)
+ default_cooldown = (known after apply)
+ desired_capacity = (known after apply)
+ force_delete = false
+ force_delete_warm_pool = false
+ health_check_grace_period = 120
+ health_check_type = "ELB"
+ id = (known after apply)
+ max_size = 12
+ metrics_granularity = "1Minute"
+ min_size = 6
+ name = "dsva-vagov-staging-deployment-vagov-staging-vets-api-server-20220810-194219-asg-evss-bgs-split"
+ name_prefix = (known after apply)
+ protect_from_scale_in = false
+ service_linked_role_arn = (known after apply)
+ termination_policies = [
+ "OldestLaunchTemplate",
]
+ vpc_zone_identifier = [
+ "subnet-cc3cd7a8,subnet-b84e17fe,subnet-70f51206",
]
+ wait_for_capacity_timeout = "10m"
+ instance_refresh {
+ strategy = "Rolling"
+ preferences {
+ min_healthy_percentage = 50
+ skip_matching = false
}
}
+ launch_template {
+ id = "lt-0301a333d650b33b1"
+ name = (known after apply)
+ version = "$Default"
}
}
# aws_elb.vets-api-server will be created
+ resource "aws_elb" "vets-api-server" {
+ arn = (known after apply)
+ availability_zones = (known after apply)
+ connection_draining = true
+ connection_draining_timeout = 30
+ cross_zone_load_balancing = false
+ desync_mitigation_mode = "defensive"
+ dns_name = (known after apply)
+ id = (known after apply)
+ idle_timeout = 120
+ instances = (known after apply)
+ internal = true
+ name = "dsva-vagov-staging-vets-api-bgs"
+ security_groups = [
+ "sg-854cc6e2",
]
+ source_security_group = (known after apply)
+ source_security_group_id = (known after apply)
+ subnets = [
+ "subnet-70f51206",
+ "subnet-b84e17fe",
+ "subnet-cc3cd7a8",
]
+ tags_all = {
+ "application" = "vets-api"
+ "environment" = "staging"
+ "managed_by" = "Terraform"
+ "purpose" = "mitigate latency issues as per https://dsva.slack.com/archives/C03STQZ40DQ"
+ "repo" = "https://github.com/department-of-veterans-affairs/vsp-infra-evss-bgs-split"
}
+ zone_id = (known after apply)
+ health_check {
+ healthy_threshold = 3
+ interval = 30
+ target = "HTTP:3004/"
+ timeout = 5
+ unhealthy_threshold = 2
}
+ listener {
+ instance_port = 3004
+ instance_protocol = "HTTP"
+ lb_port = 3004
+ lb_protocol = "HTTP"
}
}
Plan: 2 to add, 0 to change, 0 to destroy.
prod
Acquiring state lock. This may take a few moments...
data.aws_subnet.subnet_id_a: Reading...
data.aws_subnet.subnet_id_b: Reading...
data.aws_elb.selected: Reading...
data.aws_launch_template.selected: Reading...
data.aws_subnet.subnet_id_c: Reading...
data.aws_autoscaling_group.selected: Reading...
data.aws_subnet.subnet_id_a: Read complete after 0s [id=subnet-f3f31485]
data.aws_subnet.subnet_id_b: Read complete after 0s [id=subnet-a433d8c0]
data.aws_subnet.subnet_id_c: Read complete after 0s [id=subnet-66411820]
data.aws_launch_template.selected: Read complete after 0s [id=lt-06c6752970f3ee877]
data.aws_autoscaling_group.selected: Read complete after 1s [id=dsva-vagov-prod-deployment-vagov-prod-vets-api-server-20220810-190140-asg]
data.aws_elb.selected: Read complete after 1s [id=dsva-vagov-prod-vets-api-elb]
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
Terraform will perform the following actions:
# aws_autoscaling_group.vets-api-server will be created
+ resource "aws_autoscaling_group" "vets-api-server" {
+ arn = (known after apply)
+ availability_zones = (known after apply)
+ default_cooldown = (known after apply)
+ desired_capacity = (known after apply)
+ force_delete = false
+ force_delete_warm_pool = false
+ health_check_grace_period = 120
+ health_check_type = "ELB"
+ id = (known after apply)
+ max_size = 32
+ metrics_granularity = "1Minute"
+ min_size = 16
+ name = "dsva-vagov-prod-deployment-vagov-prod-vets-api-server-20220810-190140-asg-evss-bgs-split"
+ name_prefix = (known after apply)
+ protect_from_scale_in = false
+ service_linked_role_arn = (known after apply)
+ termination_policies = [
+ "OldestLaunchTemplate",
]
+ vpc_zone_identifier = [
+ "subnet-f3f31485,subnet-66411820,subnet-a433d8c0",
]
+ wait_for_capacity_timeout = "10m"
+ instance_refresh {
+ strategy = "Rolling"
+ preferences {
+ min_healthy_percentage = 50
+ skip_matching = false
}
}
+ launch_template {
+ id = "lt-06c6752970f3ee877"
+ name = (known after apply)
+ version = "$Default"
}
}
# aws_elb.vets-api-server will be created
+ resource "aws_elb" "vets-api-server" {
+ arn = (known after apply)
+ availability_zones = (known after apply)
+ connection_draining = true
+ connection_draining_timeout = 30
+ cross_zone_load_balancing = false
+ desync_mitigation_mode = "defensive"
+ dns_name = (known after apply)
+ id = (known after apply)
+ idle_timeout = 120
+ instances = (known after apply)
+ internal = true
+ name = "dsva-vagov-prod-vets-api-bgs"
+ security_groups = [
+ "sg-2d6cf84a",
]
+ source_security_group = (known after apply)
+ source_security_group_id = (known after apply)
+ subnets = [
+ "subnet-66411820",
+ "subnet-a433d8c0",
+ "subnet-f3f31485",
]
+ tags_all = {
+ "application" = "vets-api"
+ "environment" = "prod"
+ "managed_by" = "Terraform"
+ "purpose" = "mitigate latency issues as per https://dsva.slack.com/archives/C03STQZ40DQ"
+ "repo" = "https://github.com/department-of-veterans-affairs/vsp-infra-evss-bgs-split"
}
+ zone_id = (known after apply)
+ health_check {
+ healthy_threshold = 3
+ interval = 30
+ target = "HTTP:3004/"
+ timeout = 5
+ unhealthy_threshold = 2
}
+ listener {
+ instance_port = 3004
+ instance_protocol = "HTTP"
+ lb_port = 3004
+ lb_protocol = "HTTP"
}
}
Plan: 2 to add, 0 to change, 0 to destroy.
the dns_name will still need to be added to the route53 record, but i'll do that manually since we don't wanna mess with state if we can help it.
ran into an issue where the asg name keeps changing due to deploys, so i'm now filtering on some tags with the aws_autoscaling_groups data source to grab the latest asg name.
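a minimal sketch of the tag-filtered lookup described above, not the actual config from the repo. the `selected_by_tags` and `latest_asg_name` names and the `tag:application` filter are assumptions for illustration; the `tag:<key>` filter form assumes a 4.x AWS provider. picking the last name after sorting only works because the asg names share a prefix and embed a deploy timestamp (e.g. `...-20220810-194214-asg`), and it errors if no asg matches.

```hcl
variable "env" {
  type = string
}

# Hypothetical filters: swap in whichever tags uniquely identify the
# primary vets-api ASG (and do not also match the duplicated "2nd" ASG).
data "aws_autoscaling_groups" "selected_by_tags" {
  filter {
    name   = "tag:application"
    values = ["vets-api"]
  }

  filter {
    name   = "tag:environment"
    values = [var.env]
  }
}

locals {
  # Names embed a deploy timestamp, so the lexicographically greatest
  # name is the most recent deploy. Fails if the list is empty.
  latest_asg_name = reverse(sort(data.aws_autoscaling_groups.selected_by_tags.names))[0]
}

# Same data source address the plans above read from.
data "aws_autoscaling_group" "selected" {
  name = local.latest_asg_name
}
```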
currently we're at the following:
dns_name for the lb to the route53 record
nginx_config_bgs_and_envss_split_api_url: ""
https://github.com/department-of-veterans-affairs/devops/pull/11782/files
^ jenkins job has been added, just needs to be tested.
jenkins job has been tested, just coordinating when to flip all the switches for staging.
the plan:
dns_name (internal-dsva-vagov-staging-vets-api-2nd-445860677.us-gov-west-1.elb.amazonaws.com) to
we’ll only do staging for this piece in order to test jenkins job
then y’all test staging and if all is well i’ll make the necessary changes to the rev proxy for both dev / prod
currently have a test jenkins job, will remove this when the jenkins pr is merged / closed: http://jenkins.vfs.va.gov/job/testing/job/testing_tf/
edit: test job removed / pr merged.
i'll keep this ticket open until the remaining changes for the rev proxy / jenkins job have been completed / merged.
manually ran the revproxy deploy job for staging as well as the seed job.
sh-4.2$ cat /usr/local/openresty/nginx/sites-enabled/api_server.conf | grep bgs
# bgs and evss routes
revproxy is updated
https://github.com/department-of-veterans-affairs/vsp-infra-evss-bgs-split/pull/1
^ tags / asg attachment added
and nginx config fixed: https://github.com/department-of-veterans-affairs/devops/commit/280980b07dcbe9131ceb8b15a557716b7e1b9bf2
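for reference, one way an asg-to-elb attachment like the one mentioned above can be expressed. this is a sketch that assumes the `aws_autoscaling_group.vets-api-server` and `aws_elb.vets-api-server` resources from the plans live in the same config; the linked pr may do it differently (e.g. via `load_balancers` on the asg itself).

```hcl
# Registers instances launched by the duplicated ASG with the duplicated
# classic ELB. Both referenced resources are assumed to be defined
# elsewhere in this configuration, as shown in the plan output.
resource "aws_autoscaling_attachment" "vets-api-server" {
  autoscaling_group_name = aws_autoscaling_group.vets-api-server.id
  elb                    = aws_elb.vets-api-server.id
}
```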
confirmed staging revproxy is correctly updated and that ec2s are being registered with LB:
sh-4.2$ cat api_server.conf | grep -A 4 bgs
# bgs and evss routes
location ~ ^/(v0|v1)(/debts|/debt_letters|/profile/ch33_bank_accounts|/profile/payment_history) {
proxy_pass http://internal-dsva-vagov-staging-vets-api-2nd-445860677.us-gov-west-1.elb.amazonaws.com:3004$request_uri;
jenkins job is fixed: https://github.com/department-of-veterans-affairs/devops/pull/11803/files
reaching out to various parties in regards to moving / renaming the dd forwarder before enabling logs in staging / prod.
jenkins job confirmed to be working:
it looks like traffic isn’t going thru the revproxy. Kyle was just as confused and sanity checked it. we don’t know where the revproxy comes into play because all records point straight to the asg worker lbs. we’re gonna pull Jeremy in to see if he knows.
https://dsva.slack.com/archives/CTYQL39FE/p1660849458919659
edit: endpoints are being used, infra will investigate further.
oops, dev/prod failed because the proper scheme / port wasn't included. monday woes
https://dsva.slack.com/archives/CJYRZK2HH/p1661185087672539
and we're a go!
since our prometheus metrics rely on tags, tags (specifically deployment_name) were added to the 2nd asg instances since the launch template doesn't include them.
https://dsva.slack.com/archives/C03KT515C0H/p1661263063451239
provider "aws" {
  region = "us-gov-west-1"
  default_tags {
    tags = {
      Name            = "${var.name}-2nd"
      deployment_name = "vets-api-server"
      repo            = "https://github.com/department-of-veterans-affairs/vsp-infra-evss-bgs-split"
      managed_by      = "Terraform"
      application     = "vets-api"
      purpose         = "mitigate latency issues as per https://dsva.slack.com/archives/C03STQZ40DQ"
      environment     = var.env
    }
  }
}
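worth noting: the AWS provider's `default_tags` are not applied to `aws_autoscaling_group` (the plans above show `tags_all` on the elb but not on the asg), so the tags have to be passed to the asg explicitly to reach the instances. a hedged sketch of one common pattern, reusing the dev values from the plan; this is not necessarily how the repo does it.

```hcl
# Reads back whatever default_tags the provider block defines.
data "aws_default_tags" "current" {}

resource "aws_autoscaling_group" "vets-api-server" {
  # Values below are copied from the dev plan for illustration only.
  min_size            = 3
  max_size            = 6
  vpc_zone_identifier = ["subnet-2c4e176a", "subnet-d6f512a0", "subnet-8d3dd6e9"]

  launch_template {
    id      = "lt-0dde36914873ae8b6"
    version = "$Default"
  }

  # One tag block per default tag; propagate_at_launch copies each tag
  # onto the EC2 instances, which is what tag-based metrics would see.
  dynamic "tag" {
    for_each = data.aws_default_tags.current.tags
    content {
      key                 = tag.key
      value               = tag.value
      propagate_at_launch = true
    }
  }
}
```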
enables access_logs for each respective elb.
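a sketch of what enabling access logs on the classic elb can look like, using the dev values from the plan. the `access_logs_bucket` variable, the `bucket_prefix` value, and the 60 minute interval are assumptions, not taken from the actual change; the bucket also needs a policy that lets elb write to it.

```hcl
variable "access_logs_bucket" {
  type        = string
  description = "Hypothetical: name of an existing S3 bucket for ELB access logs"
}

resource "aws_elb" "vets-api-server" {
  # Values below are copied from the dev plan for illustration only.
  name            = "dsva-vagov-dev-vets-api-bgs"
  internal        = true
  security_groups = ["sg-9eaa28f9"]
  subnets         = ["subnet-2c4e176a", "subnet-8d3dd6e9", "subnet-d6f512a0"]
  idle_timeout    = 120

  listener {
    instance_port     = 3004
    instance_protocol = "HTTP"
    lb_port           = 3004
    lb_protocol       = "HTTP"
  }

  health_check {
    healthy_threshold   = 3
    unhealthy_threshold = 2
    interval            = 30
    timeout             = 5
    target              = "HTTP:3004/"
  }

  # Ships access logs to S3; interval is in minutes (5 or 60).
  access_logs {
    bucket        = var.access_logs_bucket
    bucket_prefix = "vets-api-bgs" # hypothetical prefix
    interval      = 60
    enabled       = true
  }
}
```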
Description
As part of the #vets-api-latency-issue-aug08 effort, this ticket will duplicate much of the infra that vets-api relies on to then route problematic evss-bgs routes to the duplicated ASGs/LBs. This ticket is concerned with the duplication of infra, custom jenkins job to plan/apply duplicated resources, and revproxy changes.
Acceptance Criteria