GCTC-NTGC / gc-digital-talent

GC Digital Talent is the new recruitment platform for digital and tech jobs in the Government of Canada.
https://talent.canada.ca
GNU Affero General Public License v3.0

✨ Configurable whitelist for government emails stored in env variable #10738

Closed by gobyrne 2 months ago

gobyrne commented 4 months ago

Details

To ensure people are using Government of Canada work emails, we need a whitelist of accepted email domains.

In case we exclude some folks by mistake, we also want a way to update the whitelist without deploying.

Initially, our whitelist should include the following domains:

Proposed Implementation

Given that we want a whitelist which can be updated quickly without a full code deployment, our current system essentially limits us to using an env variable.

Therefore: define a new environment variable which stores a regex. Emails will be considered valid government emails if and only if they match this regex.

As part of this ticket, ensure the env variable is available in frontend code as well, since in the future it will be used for both frontend and backend validation.

I recommend this regex for now: .+@(.+\.)?(canada|gc|elections)\.ca$
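
A minimal sketch of how this could look, assuming a hypothetical variable name `GOV_EMAIL_REGEX` (the ticket does not name one) and hypothetical helpers `loadGovEmailRegex` and `isGovEmail`. The environment is passed in as a plain object so the same function works wherever the value is exposed; a missing or invalid pattern falls back to the regex recommended above.

```typescript
// Fallback pattern: the regex recommended in this ticket.
const DEFAULT_GOV_EMAIL_PATTERN = ".+@(.+\\.)?(canada|gc|elections)\\.ca$";

// Hypothetical variable name; the ticket does not specify one.
const ENV_KEY = "GOV_EMAIL_REGEX";

type Env = Record<string, string | undefined>;

// Build the whitelist regex from the environment, falling back to the
// default when the variable is unset, empty, or not a valid pattern.
function loadGovEmailRegex(env: Env): RegExp {
  const pattern = env[ENV_KEY];
  if (!pattern) {
    return new RegExp(DEFAULT_GOV_EMAIL_PATTERN);
  }
  try {
    return new RegExp(pattern);
  } catch {
    return new RegExp(DEFAULT_GOV_EMAIL_PATTERN);
  }
}

// An email counts as a government email if and only if it matches.
function isGovEmail(email: string, regex: RegExp): boolean {
  return regex.test(email);
}
```

In a Node context this might be called as `isGovEmail(address, loadGovEmailRegex(process.env))`; how the value reaches frontend code depends on the build setup, which this ticket leaves open.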

Acceptance criteria

Deployment

Add env variable with initial regex value to IaC

mnigh commented 4 months ago

The guidance seems to be *.gc.ca and *.canada.ca based on: https://www.canada.ca/en/government/system/digital-government/policies-standards/enterprise-it-service-common-configurations/email.html#cha2:~:text=Security%20Categorization.-,2.%20Naming%20conventions%20for%20domains,An%20email%20address%20using%20the%20%40canada.ca%20domain%20(name%40canada.ca).,-2.2%20The%20email. I have not been able to find any specific examples of individual email addresses that don't meet those criteria.

The regex would need to match addresses with no subdomain as well as addresses with any number of subdomains:

I have found consulates using domains that don't follow these patterns (e.g. the Embassy of Canada in Estonia), though the employees there would likely have international.gc.ca emails.

petertgiles commented 3 months ago

@gobyrne Matt has a suggestion here that you could consider. Beyond that, I'm not sure what else a dev could do. This may be a manager/policy question more than anything else.

brindasasi commented 3 months ago

@gobyrne will turn this spike into an issue and add the use cases mentioned here.

mnigh commented 3 months ago

Here is another domain that would include Government of Canada employees: *@elections.ca.

tristan-orourke commented 3 months ago

Translated our domains into Regex

Basic regex (every domain split by ORs): .+@canada\.ca|.+@.+\.canada\.ca|.+@gc\.ca|.+@.+\.gc\.ca|.+@elections\.ca|.+@.+\.elections\.ca

Condensed/simplified regex: .+@(.+\.)?(canada|gc|elections)\.ca$

Tested at https://regex101.com/
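
As a quick cross-check of the two forms, the sketch below runs both against a handful of made-up addresses (every sample address and the helper `findDisagreements` are illustrative assumptions). One difference worth noting: the basic form has no trailing `$`, so when used as an unanchored search it also accepts an address where a whitelisted domain is followed by extra text, while the condensed form does not.

```typescript
const basic =
  /.+@canada\.ca|.+@.+\.canada\.ca|.+@gc\.ca|.+@.+\.gc\.ca|.+@elections\.ca|.+@.+\.elections\.ca/;
const condensed = /.+@(.+\.)?(canada|gc|elections)\.ca$/;

// Illustrative addresses only.
const samples: string[] = [
  "user@canada.ca",
  "user@gc.ca",
  "user@tbs-sct.gc.ca",
  "user@a.b.gc.ca",
  "user@elections.ca",
  "user@example.com",
  "user@notgc.ca",
  "user@gc.ca.example.com",
];

// Return the samples on which the two regexes disagree.
function findDisagreements(inputs: string[]): string[] {
  return inputs.filter((s) => basic.test(s) !== condensed.test(s));
}
```

For these samples, `findDisagreements(samples)` returns only `user@gc.ca.example.com`, which the basic form accepts and the condensed form rejects.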

mnigh commented 3 months ago

> In case we exclude some folks by mistake, we also want a way to update the whitelist without deploying.

> Given that we want a whitelist which can be updated quickly without a full code deployment...

Allowing changes to the regex without any regression tests seems risky. Shouldn't there be a test for email validation, especially when changing a regular expression that could have wide-ranging consequences? As for the claim that an environment variable makes a deployment unnecessary: wouldn't changing the regex in an environment variable still require a deployment anyway?

brindasasi commented 3 months ago

We are adding the regex to the code instead of an env variable. We will need a test to validate the regex patterns.
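
Such a test could be table-driven. The sketch below assumes the pattern lives in a constant (the names `GOV_EMAIL_PATTERN` and `checkGovEmailRegex` are hypothetical, as are the sample addresses) and returns a list of failures instead of depending on any particular test framework.

```typescript
// Hypothetical constant holding the regex proposed in this thread.
const GOV_EMAIL_PATTERN = /.+@(.+\.)?(canada|gc|elections)\.ca$/;

interface EmailCase {
  email: string;
  expected: boolean; // true if the address should be accepted
}

// Illustrative cases; a real suite would use addresses agreed on by the team.
const cases: EmailCase[] = [
  { email: "user@canada.ca", expected: true },
  { email: "user@international.gc.ca", expected: true },
  { email: "user@elections.ca", expected: true },
  { email: "user@gmail.com", expected: false },
  { email: "user@gc.ca.example.com", expected: false },
  { email: "@gc.ca", expected: false },
];

// Return a description of every case whose result differs from expectation.
function checkGovEmailRegex(regex: RegExp, table: EmailCase[]): string[] {
  return table
    .filter((c) => regex.test(c.email) !== c.expected)
    .map((c) => `${c.email}: expected ${c.expected}`);
}
```

An empty result from `checkGovEmailRegex(GOV_EMAIL_PATTERN, cases)` means every case behaved as expected; any future edit to the pattern would be run against the same table.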

brindasasi commented 3 months ago

This ticket will be combined with the actual work email field ticket, #10363. This one can be closed on its own.