cds-snc / notification-planning

Project planning for GC Notify Team
4 stars 0 forks source link

Failed certificate validation on API subdomain when using Notify API #624

Closed jimleroyer closed 2 years ago

jimleroyer commented 2 years ago

Describe the bug

A user from the API makes a request to a secured HTTPS connection to the Notify API and the security certificate fails validation, hence making usage of Notify API impossible in a secured way.

From a reported issue:

Weblogic is rejecting the wildcarded notification.canada.ca certificate on the API hostname: <Certificate chain received from proxy.omega.dce-eir.net - --> api.notification.canada.ca failed hostname verification check. Certificate contained notification.canada.ca but check expected api.notification.canada.ca> Is it possible to explicitly add api.notification.canada.ca (non-wildcarded) as a subject alternative name on the certificate?

Bug Severity

SEV2, SEV3 -- need to discuss with team as impact might depend

To Reproduce

Steps to reproduce the behavior:

  1. Make a request to Notify API with certificate validation enabled. Potentially, it might happen only with some library that makes more exhaustive checks such as WebLogic as described by the reporting user in Freshdesk.

Expected behavior

No certificate validation fails when communicating with Notify API.

Impact

I am unsure what's the real impact, as only one user reported it so far.

Impact on Notify users: Not being able to use the Notify API without disabling the security certificate validation.

Impact on Recipients: Potentially notifications that were not sent all the way through with the best security measure in place. Less trust if discovered and exploited.

Impact on Notify team: Potentially less trust when something bad happens.

Additional context

From @mohamed-cds

this being straightforward as updating the acm terraform to add a new alt name.

subject_alternative_names = [
    "*.${var.domain}",
    "*.document.${var.domain}"
  ]
mohdnr commented 2 years ago

just a heads up that subject_alternative_names appears twice in the code, so both locations would have to be updated. An even better solution would be to find a way to use a common local variable so that it doesn't need to be updated in multiple places.

jimleroyer commented 2 years ago

Client responded back on the ticket, saying they might need this sooner than later. Opened a PR for that to test it out.

jimleroyer commented 2 years ago

Hey team! Please add your planning poker estimate with ZenHub @mohamed-cds @sastels

yaelberger-commits commented 2 years ago

If the size of this task is relatively small, we could bring this in for the next sprint to tackle within the next 3 weeks to meet the client's needs.

jimleroyer commented 2 years ago

We tested the change in staging and it didn't work as expected (and estimated -- this will be more points). This requires the SRE team intervention. @mohamed-cds kindly volunteered to help us on that one, bringing a solution on the table that will unblock us on this. There is no hurry as we'd have 1 month but they might have time for us this week to push forward.

mohdnr commented 2 years ago

What i propose is we split this into two tasks:

  1. Getting the current cert update into production
  2. Refactoring terraform so that alt names can be added without requiring manual intervention

What we did for Item #1:

jimleroyer commented 2 years ago

After a discussion with Mohamed, we will drop the #2 refactoring task as we rarely update these. If that comes up again in a near future, we'll consider it.

jimleroyer commented 2 years ago

This is planned to go in production on July 21th, 2022, 16h30 EST.

adriannelee commented 2 years ago

Jimmy:

I am currently working on the WAF rules, the card is in progress, no blocker at the moment! Also, the task to upgrade the certificate might be blocked at the moment, the user came back to say it does not work. I think the software he is using (i.e. WebLogic) might need some tweak so I'll contact him again on Freshdesk.

adriannelee commented 2 years ago

Blocked by client and their configuration. We'll need to negotiate how much we change our configuration vs them changing theirs.

jimleroyer commented 2 years ago

Waiting for users to reply back as per previous Adrianne's comment. We'll see what they say on the suggestion we have for them with a tweak to their configuration and then concert on options.

jimleroyer commented 2 years ago

User replied back and proposed to add a new entry in addition to the wildcard one, so basically wildcard + non-wildcard. Spoke to Max and he said let's try this on. https://github.com/cds-snc/notification-terraform/pull/500

jimleroyer commented 2 years ago

We pushed new certificates in staging and will make a push in production on August 2 in the morning. CC ops lead @sastels

yaelberger-commits commented 2 years ago

Release tomorrow morning

jzbahrai commented 2 years ago

@sastels will Qa

sastels commented 2 years ago

LGTM. Suggest marking this as Done until / unless we hear back from the user that they're still having issues.