bcgov / performance


Utilize email queueing for bulk mail sending and create new "Send From" address #1073

Closed telusdcinco closed 11 months ago

telusdcinco commented 1 year ago

Current Situation

Bulk email sends from our app are failing to reach all recipients. Our app’s audit logs show the email processes running and sending successfully, but not all emails are being received by the recipient accounts.

Example Email originates from the Performance Development Platform (PDP). Email is sent to recipients via BCC.

Sending address: no-reply@gov.bc.ca (we discussed the need to update our send address to something more unique)

Time sent: 2023-07-14 17:00:28

Subject: PDP - A New Goal Has Been Added to Your Goal Bank

Intended audience: 766 recipients at the Public Service Agency

The PDP shows that all 766 emails were generated. We could find no major issues with the email addresses, which are pulled from PeopleSoft and so should not be causing the failures. However, anecdotally it sounds like about half of the recipients didn’t receive the notification in their inbox. This means that the email is going out but is not reaching a seemingly random portion of recipients.

Additional Info on Current Setup in PDP Private Zenhub Image

Suggestions from CITZ

  1. Use a unique sender address. This will help isolate your traffic so that you are not impacted by nor impacting other senders.
  2. The system you use to send must have a queuing system in place. Our apps.smtp.gov.bc.ca SMTP gateway is made up of 4 Microsoft Edge servers. Their resources are finite, and the load on them at any given time fluctuates because they are available to all internal senders. The Edge servers will push back and return "wait, try shortly" SMTP responses to new email submissions, so your system needs to follow a requeue-and-resend process. If it does not, it’s a one-and-done situation where emails that are not initially accepted fail.
  3. It would be good if the address you choose for sending email were a real address with a mailbox. This would allow you to review NDR (non-delivery report) emails and to weed out any non-existent mailboxes or wrong addresses.
  4. Try and avoid exceeding 10k emails per day or the sending email address could potentially be blocked by the Microsoft cloud filtering system. Our current system was not intended for large scale bulk sending uses.
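To make the requeue-and-resend idea in suggestion 2 concrete, here is a minimal shell sketch. It is not how PDP or the CITZ gateway actually works. It assumes a hypothetical send command that exits with status 75 (the `EX_TEMPFAIL` convention from `sysexits.h`, used by sendmail-style tools) when the gateway answers with a temporary "try again later" response, and with any other non-zero status on a permanent failure. The function name `send_with_requeue` and its parameters are made up for illustration.

```shell
#!/bin/sh
# Sketch only: retry a send command while it reports a temporary failure.
# Assumption: the wrapped command exits 75 (EX_TEMPFAIL) on a transient
# SMTP push-back and with another non-zero status on a permanent error.

# Usage: send_with_requeue MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
send_with_requeue() {
    max_attempts=$1
    delay=$2
    shift 2
    attempt=1
    while :; do
        if "$@"; then
            return 0                      # accepted by the gateway
        else
            status=$?
        fi
        if [ "$status" -ne 75 ]; then
            return "$status"              # permanent failure: do not retry
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            return 75                     # still deferred after all attempts
        fi
        sleep "$delay"                    # wait before resubmitting
        delay=$((delay * 2))              # back off so a busy gateway can recover
        attempt=$((attempt + 1))
    done
}
```

For example, `send_with_requeue 3 30 ./send_one_email recipient@example.com` (where `./send_one_email` is a placeholder) would try up to three times, waiting 30 and then 60 seconds between attempts. Doubling the delay is a design choice, not something CITZ specified.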

Solutions

After we do these updates, we can test again and get back to CITZ for additional assistance. The unique send from address will help them track and audit our activity.

Update send from address to: PDP.No-Reply@gov.bc.ca.

Travis-A-Clark commented 1 year ago

Waiting to hear back from Chris Blackhall at CITZ re: missing PDP emails. He is due back from vacation on Aug 21.

Abuchana commented 1 year ago

@Travis-A-Clark when you get a moment, can you send me or post the email from Chris so that I might get Erik's assistance with this?

Travis-A-Clark commented 1 year ago

Created the new send from address with Outlook account. Travis and Jessica are owners and have access.

PDP.NoReply@gov.bc.ca

jizhaogit commented 1 year ago

We checked the one older pod with the terminal command "ps -eF" and found that the queue had not been started.

PECSF has a similar situation, where the queue may not start successfully after a pod rebuild. James' solution is to do the following steps every time we do the product migration:

  1. After migration, check the Created time of every pod. If it is earlier than the migration, manually delete the pod so that the system regenerates a new one.
  2. Log into each pod's terminal and run "ps -eF" to confirm that the queue process "php /var/www/html/artisan queue:work" is up and running.
  3. If the queue process is not running, start it manually with the following commands:
     > cd /var/www/html
     > nohup php /var/www/html/artisan queue:work --tries=3 --timeout=0 --memory=512 > ./storage/logs/queue-work.log &
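The manual check described above could be wrapped in a small script, so that the same commands are run the same way in every pod. This is only a sketch: the function names and the `APP_DIR` variable are made up for illustration, while the path and the `queue:work` options are copied from the steps above.

```shell
#!/bin/sh
# Sketch only: confirm the queue worker is running in this pod, start it if not.
# APP_DIR and both function names are assumptions, not part of the PDP code base.
APP_DIR="${APP_DIR:-/var/www/html}"

# Reads "ps -eF"-style output on stdin; succeeds if a queue:work process is listed.
# The [a] bracket keeps the pattern from matching grep's own command line.
queue_worker_listed() {
    grep -q '[a]rtisan queue:work'
}

ensure_queue_worker() {
    listing=$(ps -eF)
    if printf '%s\n' "$listing" | queue_worker_listed; then
        echo "queue worker is already running"
        return 0
    fi
    echo "queue worker not found, starting it"
    # Same start command as in the manual steps, run from the app directory.
    (
        cd "$APP_DIR" || exit 1
        nohup php "$APP_DIR/artisan" queue:work --tries=3 --timeout=0 --memory=512 \
            > ./storage/logs/queue-work.log &
    )
}
```

Calling `ensure_queue_worker` from a pod terminal after each deployment would replace the manual "ps -eF" inspection. It does not replace deleting pods that were created before the migration.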

jizhaogit commented 1 year ago

  1. The sending address has been changed to "PDP.NoReply@gov.bc.ca".
  2. For now, we will run the "ps -eF" check manually every time we deploy or restart pods.

jizhaogit commented 12 months ago

  1. A page has been created for checking the mail queue process from the website instead of going to the console.
    URL: https://performance-332842-test.apps.silver.devops.gov.bc.ca/sysadmin/queue/processes

jizhaogit commented 12 months ago

The prod file change will be done at 2023-11-17 5:00 PM PST, as discussed in the meeting with Travis.

Travis-A-Clark commented 12 months ago

Will monitor email traffic and bounce backs after push to Prod. May still have to explore additional queueing options, as per Chris Blackhall's initial suggestions. We can do this on a separate ticket after we have more data.

  1. The system you use to send must have a queuing system in place. Our apps.smtp.gov.bc.ca SMTP gateway is made up of 4 Microsoft Edge servers. Their resources are finite, and the load on them at any given time fluctuates because they are available to all internal senders. The Edge servers will push back and return "wait, try shortly" SMTP responses to new email submissions, so your system needs to follow a requeue-and-resend process. If it does not, it’s a one-and-done situation where emails that are not initially accepted fail.

jessicahjwu commented 8 months ago

We provided five email addresses for Chris to investigate further, and it turned out that those user accounts were on hold. I searched in PeopleSoft: the majority of these five employees' records have not been updated with any kind of leave status, while one had a future-dated long-term leave status in place. The accounts might have been temporarily disabled or put on hold before the PS status kicked in, so PDP was still sending notifications to those users.

For future reference, if the bounce-back message gives a reason such as "The maximum message size that's allowed is 0 KB. This message is 10 KB." then we can confirm that it is due to the user account being on hold.

From Chris: "The four gov.bc.ca accounts are all On Hold. That usually also results in the mailboxes having a special config where the max receive size gets set to zero."
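If bounce-backs end up being reviewed in bulk from the new mailbox, the on-hold signature described above could be checked with a one-line filter. This is a sketch under the assumption that the NDR text is available as plain text on standard input. The function name `is_on_hold_bounce` is made up, and the pattern is based only on the single example message quoted in this thread.

```shell
#!/bin/sh
# Sketch only: succeed if the NDR text on stdin reports a 0 KB receive limit,
# which per this thread indicates the recipient account is on hold.
# The pattern skips the apostrophe in "that's" so straight and curly quotes both match.
is_on_hold_bounce() {
    grep -qi 'maximum message size.*allowed is 0 KB'
}
```

For example, `is_on_hold_bounce < bounce.txt && echo "account on hold"` would flag a saved bounce message (`bounce.txt` is a placeholder file name).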