GSA-TTS / datagov-brokerpak-eks

Broker AWS EKS instances using the OSBAPI (e.g., from cloud.gov)

Limit external-dns to exactly one zone #82

Closed mogul closed 2 years ago

mogul commented 2 years ago

See https://github.com/kubernetes-sigs/external-dns/pull/422

This also ensures that prerequisite_binaries_present works even if the binaries are available via aliases.

mogul commented 2 years ago

(I don't know why this closed.)

mogul commented 2 years ago

@nickumia-reisys,

I did a bunch of refactoring to separate the provision module into provision-aws and provision-k8s. Currently:

You might try pushing this to the develop branch to see if it deploys cleanly in development-ssb CF space / development TF workspace, and investigate if not.

However, we still have a ton of stuff provisioned in the production workspace under the former module, module.brokerpak-eks-terraform.module.provision, and it won't destroy without the kubernetes provider configured; it still complains about wanting to connect to 127.0.0.1. I don't know what to do about that... I guess we still just have to manually remove it...? It feels like there must be some other way, but I can't think of it. 😞

Still to do:

mogul commented 2 years ago

Note: HashiCorp now has a support page about this situation.

Also I think maybe applying without refreshing may get us out of our current bind... Worth a shot!

nickumia-reisys commented 2 years ago

Relating to the apply without refresh... No dice 😞

mogul commented 2 years ago

However, we still have a ton of stuff provisioned in the production workspace under the former module, module.brokerpak-eks-terraform.module.provision, and it won't destroy without the kubernetes provider configured; it still complains about wanting to connect to 127.0.0.1. I don't know what to do about that... I guess we still just have to manually remove it...? It feels like there must be some other way, but I can't think of it. 😞

After banging my head against this for yet another day, I am almost out of things to try... This is the next thing:

Other than that, I think we're down to doing this in all three of development/staging/default workspaces and corresponding ENV_NAME:

nickumia-reisys commented 2 years ago

@mogul I started fixing this, but got stuck because I wasn't sure how to fix the problem of the provision-aws outputs being inputs to the provision-k8s module. Can you have more than one provision block and then pass from provision-1 to provision-2 and then to bind?

The above question is only about how to declare the service-definition.yml. If both the provision-aws and provision-k8s inputs are declared as inputs, the brokerpak expects that terraform will receive all of the inputs at once, which isn't true with the double-module design. If we leave out the provision-aws outputs, then provision-k8s wouldn't know about them as inputs for itself. 😕

nickumia-reisys commented 2 years ago

Hmm.. so I got the brokerpak to build by using only the outputs from provision-k8s and only the inputs from provision-aws. I'm testing to see if terraform will like it.. but I guess this works 😖

nickumia-reisys commented 2 years ago

This is what I was afraid of.. Terraform is complaining that variables aren't declared as inputs because the brokerpak didn't pass them in as input variables. But the brokerpak doesn't have access to those values yet because provision part 1 has not run yet.

nickumia@DL62-2-2MDD043:~/temp/datagov-brokerpak-eks$ make test
Provisioning aws-eks-service:raw:instance-nickumia
in progress | (0 seconds)...failed!
Error: Reference to undeclared input variable

  on admin-account.tf line 53, in data "template_file" "admin_kubeconfig":
  53:         certificate-authority-data: ${var.certificate_authority_data}

An input variable with the name "certificate_authority_data" has not been
declared. This variable can be declared with a variable
"certificate_authority_data" {} block.

Error: Reference to undeclared input variable

  on admin-account.tf line 54, in data "template_file" "admin_kubeconfig":
  54:         server: ${var.server}

An input variable with the name "server" has not been declared. This variable
can be declared with a variable "server" {} block.

Error: Reference to undeclared input variable

  on external-dns-k8s.tf line 9, in resource "kubernetes_service_account" "external_dns":
   9:       "eks.amazonaws.com/role-arn" = var.zone_role_arn

An input variable with the name "zone_role_arn" has not been declared. This
variable can be declared with a variable "zone_role_arn" {} block.

Error: Reference to undeclared input variable

  on persistent-storage-k8s.tf line 7, in resource "kubernetes_storage_class" "ebs-sc":
   7:     kmsKeyId  = var.persistent_storage_key_id

An input variable with the name "persistent_storage_key_id" has not been
declared. This variable can be declared with a variable
"persistent_storage_key_id" {} block.

exit status 1

mogul commented 2 years ago

Yeah I'm not sure how to deal with this yet.

mogul commented 2 years ago

I did some initial refactoring in the commits above. It's a little messy to symlink when you're working locally, but I think this approach may work since it approximates what we will do in the template_refs paths in the service definition to have everything work as before.

Not yet working, though... I still haven't got the kubernetes provider configured correctly, so it's still trying to talk to 127.0.0.1:80.

mogul commented 2 years ago

OK, this is pretty much back to where it was with respect to https://github.com/gsa/data.gov/issues/3706. All that remains is properly waiting for DNS to be resolvable before proceeding with tests.

mogul commented 2 years ago

Fixed the DNS resolution waiting-loop in test.sh. This would be passing if we hadn't somehow dropped the DS records from the data.gov domain when we moved the data.gov zone to Route53...! I will make a pull-request on 18F/dns in the morning to fix that.
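
For anyone following along, a waiting loop of this kind can be sketched roughly as below. This is not the actual code in test.sh, just a minimal illustration of the retry pattern. The function names and the RESOLVE_CMD, MAX_ATTEMPTS, and SLEEP_SECONDS knobs are hypothetical, and the default resolver assumes dig is installed.

```shell
# Sketch of a DNS wait loop; NOT the actual code from test.sh.
# RESOLVE_CMD, MAX_ATTEMPTS, and SLEEP_SECONDS are hypothetical knobs.

# Default resolver: succeeds only if dig prints at least one record.
resolve_with_dig() {
  [ -n "$(dig +short "$1")" ]
}

# wait_for_dns HOSTNAME: retry until the name resolves or attempts run out.
wait_for_dns() {
  local host="$1"
  local resolve="${RESOLVE_CMD:-resolve_with_dig}"
  local max="${MAX_ATTEMPTS:-30}"
  local delay="${SLEEP_SECONDS:-10}"
  local attempt=1
  while [ "$attempt" -le "$max" ]; do
    if "$resolve" "$host"; then
      echo "resolved ${host} after ${attempt} attempt(s)"
      return 0
    fi
    # Only sleep if another attempt is coming.
    if [ "$attempt" -lt "$max" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  echo "gave up on ${host} after ${max} attempts" >&2
  return 1
}
```

A test script would call something like `wait_for_dns "$hostname" || exit 1` before running checks that depend on the name resolving. Making the resolver swappable keeps the loop easy to exercise without touching real DNS.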

mogul commented 2 years ago

Now blocked waiting on https://github.com/18F/dns/pull/593

mogul commented 2 years ago

That was quick! Now just waiting on the test to pass.

nickumia-reisys commented 2 years ago

Unless DNSSEC also takes a long time to set up because of DNS propagation, the test failed for no apparent reason.. Or maybe the DNS server that was asked to do the DNSSEC validation was just a bad one 😕
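
One way to rule out a single bad resolver would be to ask several validating resolvers and compare results. The sketch below is only an illustration under assumptions: the function names are made up, it relies on dig being installed, and it treats the "ad" (authenticated data) flag in the response header as the sign that the resolver validated the answer.

```shell
# Sketch: check whether resolvers report DNSSEC-validated answers.
# has_ad_flag and check_dnssec are hypothetical helper names.

# has_ad_flag: read dig output on stdin; succeed if the header flags include "ad".
has_ad_flag() {
  grep -Eq '^;; flags:[^;]* ad[ ;]'
}

# check_dnssec NAME RESOLVER...: query each resolver and report validation status.
# Returns non-zero if any resolver did not set the ad flag.
check_dnssec() {
  local name="$1"
  shift
  local resolver
  local status=0
  for resolver in "$@"; do
    if dig +dnssec "$name" "@${resolver}" | has_ad_flag; then
      echo "${resolver}: validated (ad flag set)"
    else
      echo "${resolver}: NOT validated"
      status=1
    fi
  done
  return "$status"
}
```

For example, `check_dnssec data.gov 1.1.1.1 8.8.8.8` would show whether both public resolvers validate the zone. If one does and the other doesn't, the resolver is the likelier culprit. If neither does, the DS records or propagation are more suspect.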