Open jimleroyer opened 4 months ago
Starting work on this today! 🕺
Clickops'd a pinpoint pool in staging with a new long code. Can send through it with:
aws pinpoint-sms-voice-v2 send-text-message --destination-phone-number +<number> --origination-identity pool-090efa2fefab4cb5b10fa12125998266 --message-body "hi2me from pinpoint pool"
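For reference, a rough Python equivalent of the CLI call above, as a minimal sketch rather than the code in the PR. The helper name `send_via_pool` is hypothetical; the client is assumed to be a boto3 `pinpoint-sms-voice-v2` client, whose `send_text_message` call takes the same three values as the CLI flags.

```python
from typing import Any, Dict


def send_via_pool(client: Any, to_number: str, pool_id: str, body: str) -> str:
    """Send one SMS through a Pinpoint pool and return the message id.

    `client` is assumed to be boto3.client("pinpoint-sms-voice-v2");
    it is passed in so the function can be exercised with a stub.
    """
    response: Dict[str, Any] = client.send_text_message(
        DestinationPhoneNumber=to_number,  # --destination-phone-number
        OriginationIdentity=pool_id,       # --origination-identity (pool id)
        MessageBody=body,                  # --message-body
    )
    return response["MessageId"]
```

With real credentials this would be called as `send_via_pool(boto3.client("pinpoint-sms-voice-v2"), "+<number>", "<pool id>", "hi2me from pinpoint pool")`.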
Started roughing in some pinpoint code: https://github.com/cds-snc/notification-api/pull/2152
@sastels to look into creating the pinpoint infrastructure with Terraform
Sadly it appears that terraform does not yet support Pinpoint V2 (including dealing with pools)
AWS_PINPOINT_TEMPLATE_IDS
Pinpoint could possibly be implemented with CloudFormation instead.
pinpoint-sms-voice-v2 added on 2022-03-31
Looks like it's not supported in terraform nor cloudformation yet.
All that IAM stuff in the PR actually works and I can send through a pinpoint pool and see it appear in the logs (tested in dev)! And in the logs we even get the actual phone number used to send.
made a log group for pinpoint failures too and have failures redirected there.
Ok! So we have
Some parts of some of this have been tested locally or on dev. Merging any or all of these in should not affect how the system runs, i.e. the old SNS flow is still there and unchanged. The pinpoint bits will only be used if we set the AWS_PINPOINT_* variables in api and send using one of the designated templates.
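The routing rule described above could look roughly like this. It is a sketch under assumptions, not the logic in the PR: the function name `should_use_pinpoint` and the `pool_id` parameter are hypothetical, and the template list stands in for whatever `AWS_PINPOINT_TEMPLATE_IDS` is parsed into.

```python
from typing import Iterable, Optional


def should_use_pinpoint(
    template_id: str,
    pool_id: Optional[str],
    pinpoint_template_ids: Iterable[str],
) -> bool:
    """Return True only when Pinpoint is configured and the template is designated.

    With the variables unset (empty pool id or empty template list) this
    always returns False, so the existing SNS flow is untouched.
    """
    if not pool_id:
        return False
    return template_id in set(pinpoint_template_ids)
```

The point of the shape is that the default (nothing configured) is the old behaviour, which is why merging with empty variables should change nothing.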
work continues...
reworked terraform PR a bit more, bootstrapped the ecr repo on dev, and got the lambda applying as well. Ready for another look through by Ben.
TF PR is ready for another look for infrastructure review. After that, Steve will play with the API and polish the PR. We will need to test in staging environment afterward with preconfigured pools of codes.
TF merged and looks good in staging. Can use AWS CLI to send an SMS through the new pool and the delivery gets logged, lambda triggered, and task scheduled.
Next step is finishing the api PR that contains the new pinpoint receipt processing task and logic to determine whether to send via pinpoint or not
api PR looks good, just going to give it another look over then it'll be ready for review.
api pr ready for review https://github.com/cds-snc/notification-api/pull/2152
Jimmy and team to review the PR.
Steve has the puck... he turned it over at the blue line, now it's back in progress
Still have a few of Jimmy's comments to address
Steve will get back to this this week to address Jimmy's comments
I think this is ready for Jimmy to take another look! https://github.com/cds-snc/notification-api/pull/2152
Another PR! https://github.com/cds-snc/notification-manifests/pull/2558
Update: PR merged into staging.
API PR approved. Will merge tomorrow after release and test if staging still works (ie with empty variables). If yes, the PR is ready for release.
Then will set the staging variables to the shortcode pool and template, and test with templates in and not in the shortcode template list.
Started PR for using Pinpoint for non-shortcode templates as well https://github.com/cds-snc/notification-api/pull/2173 (WIP)
api PR merged into staging and tested. Nothing's different :tada:
Now will turn on PinPoint in staging and see if all these new tasks and lambdas work together 😬 https://github.com/cds-snc/notification-manifests/pull/2602
alrighty! tested in staging.
[2024-05-08 19:35:09,029: ERROR/ForkPoolWorker-2] SMS notification delivery for id: 81af7a26-f09d-4c66-9355-7d1584ce4d64 failed
Traceback (most recent call last):
  File "/app/app/clients/sms/aws_pinpoint.py", line 37, in send_sms
    response = self._client.send_text_message(
  File "/app/.venv/lib/python3.10/site-packages/botocore/client.py", line 565, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/app/.venv/lib/python3.10/site-packages/botocore/client.py", line 1021, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.AccessDeniedException: An error occurred (AccessDeniedException) when calling the SendTextMessage operation: User: arn:aws:sts::239043911459:assumed-role/eks-worker-role/i-02248c15cedc52402 is not authorized to perform: sms-voice:SendTextMessage on resource: arn:aws:sms-voice:ca-central-1:239043911459:pool/pool-b20333ce1e4e49309ba1db3bf94a3f57 because no identity-based policy allows the sms-voice:SendTextMessage action
Oddly the SMS was retried 5 minutes later and sent with SNS - should look into this
ok! This PR added the action that was needed https://github.com/cds-snc/notification-terraform/pull/1314
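For anyone hitting the same AccessDeniedException: the identity-based policy needs a statement along these lines. This is a sketch built from the error message only, not a copy of what the Terraform PR contains; the helper name `send_text_policy` is hypothetical.

```python
import json
from typing import Any, Dict


def send_text_policy(pool_arn: str) -> Dict[str, Any]:
    """Build an IAM policy document allowing SendTextMessage on one pool."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The action named in the AccessDeniedException above.
                "Action": ["sms-voice:SendTextMessage"],
                "Resource": [pool_arn],
            }
        ],
    }


if __name__ == "__main__":
    arn = "arn:aws:sms-voice:ca-central-1:239043911459:pool/pool-b20333ce1e4e49309ba1db3bf94a3f57"
    print(json.dumps(send_text_policy(arn), indent=2))
```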
And this time the pinpoint message was sent with pinpoint!
The lambda processed the receipt, but I see in the celery logs
[2024-05-08 20:21:37,476: ERROR/MainProcess] Received unregistered task of type 'process-pinpoint-result'.
The message has been ignored and discarded.
Possibly a name mismatch? Or I forgot to register the task... that sounds more likely.
In any case, progress!
ok, locally when I start celery I see the task listed
...
. process-job
. process-pinpoint-result
. process-ses-result
. process-sns-result
. process-virus-scan-error
...
BUT looking at when celery pod celery-sms-send-scalable-77bcd554b8-5jcbf starts up I don't see the pinpoint task listed!
...
. process-job
. process-ses-result
. process-sns-result
. process-virus-scan-error
...
So! what's the difference between starting locally and in staging?
Both start with sh scripts/run_celery_send_sms.sh
Note that in the celery pod the code is there in process_pinpoint_receipts_tasks.py
If I run that script on an api node in k8s then I get the same results (ie no process-pinpoint-result)
AFAIK this is where we tell celery what the tasks are
CELERY_IMPORTS = (
    "app.celery.tasks",
    "app.celery.scheduled_tasks",
    "app.celery.reporting_tasks",
    "app.celery.nightly_tasks",
)
But process-sns-result isn't in there? it's in app.celery.process_sns_receipts_tasks. So how is it registered?
UPDATE: now I can't see the new task locally either.
ok, but at least I can fix it locally then. https://github.com/cds-snc/notification-api/pull/2175
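A toy model of why a task can exist in the code yet be missing from the worker's list. This is plain Python that mimics decorator-based registration, not Celery itself and not the fix in the PR: a task only lands in the registry when the module defining it actually gets loaded. The `load_*` functions are hypothetical stand-ins for importing a module.

```python
from typing import Callable, Dict, Iterable, List

# Name -> handler, filled in as a side effect of "importing" task modules.
TASKS: Dict[str, Callable[[dict], str]] = {}


def task(name: str) -> Callable:
    """Decorator that registers a function under a task name."""
    def register(func: Callable[[dict], str]) -> Callable[[dict], str]:
        TASKS[name] = func
        return func
    return register


def load_sns_module() -> None:
    # Stand-in for importing a module that defines the SNS receipt task.
    @task("process-sns-result")
    def process_sns_results(receipt: dict) -> str:
        return "sns receipt handled"


def load_pinpoint_module() -> None:
    # Stand-in for importing the module that defines the pinpoint receipt task.
    @task("process-pinpoint-result")
    def process_pinpoint_results(receipt: dict) -> str:
        return "pinpoint receipt handled"


def start_worker(module_loaders: Iterable[Callable[[], None]]) -> List[str]:
    """Load the given modules and return the sorted list of registered tasks."""
    TASKS.clear()
    for load in module_loaders:
        load()
    return sorted(TASKS)


def dispatch(name: str, receipt: dict) -> str:
    """Run a task by name, or report it as unregistered."""
    if name not in TASKS:
        return f"Received unregistered task of type {name!r}."
    return TASKS[name](receipt)
```

In this model, `start_worker([load_sns_module])` lists only the SNS task and dispatching `process-pinpoint-result` reports it as unregistered, even though `load_pinpoint_module` exists in the file. So the question to ask of the real worker is which modules it imports at startup.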
It's working!
but successful pinpoint sends generate two receipts, one to say it's gone to the carrier, and one to say it's gone to the phone. We want to ignore the one to the carrier. https://github.com/cds-snc/notification-api/pull/2176
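Roughly what ignoring the carrier receipt could look like. This is a sketch, not the code in the PR. The field name `messageStatus` and the value `"SUCCESSFUL"` for the "accepted by carrier" receipt are assumptions about the Pinpoint event payload and should be checked against real receipts in the logs; `should_process_receipt` is a hypothetical name.

```python
from typing import Any, Dict

# ASSUMPTION: the intermediate "gone to the carrier" receipt carries this status.
CARRIER_ACCEPTED_STATUS = "SUCCESSFUL"


def should_process_receipt(receipt: Dict[str, Any]) -> bool:
    """Return False for the intermediate carrier receipt, True otherwise.

    Anything that is not the carrier receipt (final delivery, failures,
    unknown statuses) is still processed, so no final state is dropped.
    """
    # ASSUMPTION: the status lives under the "messageStatus" key.
    return receipt.get("messageStatus") != CARRIER_ACCEPTED_STATUS
```

Filtering on the one status to skip, rather than on the statuses to keep, means failure receipts still reach the notification.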
the ProviderDetails table has a pinpoint provider still there, it's at the end (priority 50) and inactive. Locally I:
We'd probably want a proper db migration to do this.
ok! this new PR actually works!
Have to add a test or two around this still but can see the light at the end of the tunnel!
Added code to fall back to SNS if
Also added a bunch of tests for all this. Unfortunately tests are not all passing in CI despite passing locally.
ok, got the tests working and tested again locally. I think this is ready for :eyes: https://github.com/cds-snc/notification-api/pull/2173
Ready for review -- someone will look at it today. Probably Jimmy
Reviewed, left a few questions and comments but nothing major to change. Looks good overall.
Steve will review today!
Jimmy to review the review of the review.
Jimmy approved the PR, Steve will merge after the release is done this morning to test it a bit in staging.
Code merged into staging, and released to prod. SNS still used by both (as env vars are not set).
Scope was reduced on this card, so we need to review/QA this one and determine if it's ready to be closed.
Had to turn Pinpoint off in staging, will turn back on today hopefully and we can then QA
Removed the QA steps: we will test this feature in the "QA flags" cards that turn this feature on.
Description
As a system ops, I need to send a notification through a designated pool of codes so that I can isolate codes depending on services and usage, i.e. short code versus dedicated long code or random pool of codes.
As a product manager, I need to send notifications through the short code, so that GCNotify gains more trust, reliability and throughput.
As a policy maker, I need to send notifications through the short code, so that 2fa notifications are only sent through it as agreed upon with the telecoms (assuming our short code stays exclusive for the 2FA usage).
WHY are we building?
WHAT are we building?
VALUE created by our solution
Out of scope
Acceptance Criteria
Red flags
🚩 The SMS delivery receipts might come from a different source than the one for SNS, according to past investigation. If this turns out to be true, feel free to put this card to blocked and create a new task to tackle the new source of notifications with proper plumbing.
Additional information