department-of-veterans-affairs / va.gov-cms

Editor-centered management for Veteran-centered content.
https://prod.cms.va.gov
GNU General Public License v2.0

Prod and Test Prod Next.js preview servers should use https #17778

Closed timcosgrove closed 3 months ago

timcosgrove commented 6 months ago

Requirements

We want our Prod and Test Prod Next.js preview servers to use https, so that they can be reached securely and content from other domains can load.

```[tasklist]
### Acceptance criteria
- [ ] Prod and Prod Test preview servers load using https without warning
- [ ] All content loads correctly, including header, footer, images
```

## Background & implementation details

Example: https://preview-prod.cms.va.gov/

results in this:

Image

jschmidt-civicactions commented 6 months ago

Note that https is using a self-signed cert right now (e.g. https://github.com/department-of-veterans-affairs/vsp-infra-application-manifests/blob/main/apps/next-build-test/staging/templates/certificate.yaml).

I'm not sure offhand what steps we need to take to get a signed cert set up in the cluster; that is likely a question for platform devops.

nfpappas-oddball commented 6 months ago

I don't know how it all works yet, but I would say either use ACM in AWS to do it, or change the self-signed cert config to use Let's Encrypt with a DNS challenge.

https://aws.amazon.com/blogs/containers/serve-distinct-domains-with-tls-powered-by-acm-on-amazon-eks/

https://cert-manager.io/docs/tutorials/acme/nginx-ingress/

jschmidt-civicactions commented 6 months ago

Yeah it looks like Chrome just doesn't like self-signed certs:

image

jschmidt-civicactions commented 6 months ago

Also if I visit the test sites with Chrome, I get the same result. I wonder if you previously visited those and accepted the cert?

Either way, if we need to move these off self-signed certs, this ticket could cover that effort.
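
To check this outside the browser, one option is to attempt a verified TLS handshake and look at the failure code. This is only a rough sketch: the function names are hypothetical, it uses just Python's standard `ssl` and `socket` modules, and it assumes OpenSSL verification codes 18 and 19 are the ones that indicate a self-signed certificate.

```python
import socket
import ssl

# OpenSSL verification codes that indicate a self-signed certificate
# (assumption: 18 = leaf is self-signed, 19 = self-signed cert in the chain).
SELF_SIGNED_CODES = {
    18: "self-signed certificate",
    19: "self-signed certificate in certificate chain",
}


def is_self_signed_code(verify_code: int) -> bool:
    """Return True if an OpenSSL verify code means 'self-signed'."""
    return verify_code in SELF_SIGNED_CODES


def check_host(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Attempt a verified handshake and classify the result.

    Hypothetical helper: returns "trusted", "self-signed", or
    "untrusted: <reason>" based on the local trust store.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "trusted"
    except ssl.SSLCertVerificationError as err:
        if is_self_signed_code(err.verify_code):
            return "self-signed"
        return f"untrusted: {err.verify_message}"
```

Calling `check_host("preview-prod.cms.va.gov")` would report how the certificate looks from the machine running it. The answer depends on that machine's trust store and network, so it may differ inside and outside the VA network.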

nfpappas-oddball commented 6 months ago

OK, so I just talked to Kyle about this. If I understand the issue correctly, this is happening because https://preview-prod.cms.va.gov/ is not going through the rev-proxy. If my assumptions are correct, anyone trying to view the CMS needs to go through the rev-proxy. That means we either need to get a cert from the VA for this or get the VA to agree to use Let's Encrypt. If we use the same account, we may be able to use the existing cert and just federate the URL through the site. @timcosgrove @jschmidt-civicactions does that make sense, and does it agree with your understanding of how it's set up now?

nfpappas-oddball commented 5 months ago

Further investigation revealed that staging and prod have valid certs as long as you are within the VA network, as pictured below. While the certs are different, they are both valid as far as a VA browser is concerned. I met with @hgbarreto to help me identify why the certs were different. We discovered that, as it exists in the code found here, they should be the same. However, when we looked, the certs were different. We believe this is due to some configuration difference between the prod and staging Kubernetes ingress (using Traefik). It is therefore my recommendation that we leave the certificates alone, as altering them could result in unexpected behavior.
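
For anyone wanting to repeat the "are the certs the same" comparison without a browser, a rough sketch is to fetch each host's certificate and compare SHA-256 fingerprints. The function names here are hypothetical and it uses only Python's standard library. `ssl.get_server_certificate` does not validate the certificate by default, so it also works against self-signed certs. Whether SNI is sent may depend on the Python version, which matters behind an ingress that picks certificates by hostname.

```python
import hashlib
import ssl


def fingerprint_from_pem(pem: str) -> str:
    """Return the SHA-256 fingerprint (hex) of a PEM-encoded certificate."""
    der = ssl.PEM_cert_to_DER_cert(pem)
    return hashlib.sha256(der).hexdigest()


def fetch_fingerprint(host: str, port: int = 443) -> str:
    """Fetch the certificate a host serves and fingerprint it.

    No trust validation is performed here; this only identifies
    which certificate is being served.
    """
    pem = ssl.get_server_certificate((host, port))
    return fingerprint_from_pem(pem)


def same_certificate(host_a: str, host_b: str) -> bool:
    """Hypothetical helper: True if both hosts serve the identical certificate."""
    return fetch_fingerprint(host_a) == fetch_fingerprint(host_b)
```

Running `fetch_fingerprint` against the prod and staging preview hostnames from inside the VA network would show whether the served certs actually differ, independent of what the manifests say.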

nfpappas-oddball commented 5 months ago

Upon further exploration of this issue, I am now getting an error for preview-prod. I believe this has something to do with Traefik. We will need to work with the platform team to figure this out.

nfpappas-oddball commented 4 months ago

Kyle was able to fix this. The actual issue was the certificates that are applied to the ALB on the EKS cluster. We will need to dig into this more later to understand it.