hypothesis / lms

LTI app for integrating with learning management systems
BSD 2-Clause "Simplified" License

Spike: how will we actually send instructor digest emails? #4909

Closed (seanh closed this 1 year ago)

seanh commented 1 year ago

For example, using Mandrill?

seanh commented 1 year ago

Mandrill has been renamed to Mailchimp Transactional after it got acquired by Mailchimp.

seanh commented 1 year ago

Here's my rough initial plan:

  1. We'll use the mailchimp-transactional PyPI package (their official Python client)

  2. We'll use their templates feature. The templates for the email subjects and bodies live in Mailchimp (where non-developers can easily edit them), and when our app wants to send an email it just posts the template input data/variables to the Mailchimp API. There's a Slack thread about whether we should use Mailchimp templates.

  3. We'll call their send-template API (mailchimp_client.messages.send_template() in the Python client) to actually send the emails

  4. This means making one network request to the Mailchimp API per email

  5. We'll set async to true in the request body that we pass to send_template(). This enables a background sending mode that is optimized for bulk sending: the API responds immediately with 200 OK (without having actually sent the email) and Mailchimp then sends the emails in the background on their end

  6. We'll need to set up a webhook endpoint to receive rejections from Mailchimp. Their docs say: "To handle rejections when sending in async mode, set up a webhook for the 'reject' event." Question: what should we do if we receive rejections from Mailchimp? Trigger an alarm?

  7. I'm imagining that we'll use one Celery task per email to actually send the emails. Then we can parallelise it as much as we want, across as many Celery workers and instances as we want. This does mean that we need to be able to get thousands of Celery tasks onto the queue each night, which might be a bottleneck. Generating the data for all these emails is another potential bottleneck (especially when hitting Elasticsearch and Postgres)

  8. We're going to have to monitor our Mailchimp reputation score which determines how many emails Mailchimp will let us send per hour. The reputation score depends on things like how many of our emails bounce (e.g. because the user's email address doesn't work), how often users click the spam button or Mailchimp complaints link on our emails, etc.

    If using a separate Mailchimp subaccount for email digests (see below) I think we may have to roll the feature out slowly in order to build up a reputation score and hourly quota.

  9. I think we might want to use a separate Mailchimp subaccount for the instructor digest emails, separate from the existing subaccount that we use for account activation emails, password resets, and reply notifications. This is because I think separate subaccounts have their own separate reputation scores and rate limits, although I'm not 100% sure of this.

    Instructor digest emails are quite different from our other types of emails: they're a much higher volume and I think the risk of them having reputation problems is much higher. If we do run into reputation and throttling problems with instructor digest emails we don't want that to affect account activation, password reset and reply notification emails.

  10. I think we probably want to add priorities to the existing emails that we already send (account activations etc) to make sure that they don't get swamped by digest emails
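A minimal sketch of points 2, 3, 5, 6 and 9 above, not a definitive implementation. It assumes the client is created with `mailchimp_transactional.Client(api_key)` and that `messages.send_template()` takes a single request-body dict. It also assumes that reject webhooks arrive as a form-encoded POST with a `mandrill_events` field holding a JSON array. The function names, the template name and the subaccount ID are all hypothetical. In the real app `send_digest()` would presumably be the body of the per-email Celery task.

```python
import json
from urllib.parse import parse_qs


def build_send_template_body(template_name, to_email, variables, subaccount=None):
    """Build the request body for one templated email.

    `variables` is the template input data; the template itself lives in Mailchimp.
    """
    message = {
        "to": [{"email": to_email, "type": "to"}],
        # Template variables are sent as a list of name/content pairs.
        "global_merge_vars": [
            {"name": name, "content": value} for name, value in variables.items()
        ],
    }
    if subaccount is not None:
        # Assumption: a separate subaccount keeps digest reputation separate.
        message["subaccount"] = subaccount
    return {
        "template_name": template_name,
        # Assumption: no editable regions are overridden, so this stays empty.
        "template_content": [],
        "message": message,
        # Background sending mode: the API replies before the email is sent.
        "async": True,
    }


def send_digest(client, template_name, to_email, variables, subaccount=None):
    """Send one email: one network request to the Mailchimp API per call."""
    body = build_send_template_body(template_name, to_email, variables, subaccount)
    return client.messages.send_template(body)


def rejected_emails(form_body):
    """Return the addresses in the 'reject' events of a webhook POST body."""
    events = json.loads(parse_qs(form_body)["mandrill_events"][0])
    return [e["msg"]["email"] for e in events if e.get("event") == "reject"]
```

For example, `send_digest(client, "instructor-digest", "teacher@example.com", {"NUM_ANNOTATIONS": 3}, subaccount="digests")` would queue one email. What to do with the list returned by `rejected_emails()` (for example, triggering an alarm) is still the open question from point 6.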

seanh commented 1 year ago

I'm probably going to have more thoughts on this later as I actually figure it out and implement it. But for now I'll close the spike as done.