pentacent / keila

Open Source Newsletter Tool.
https://keila.io
GNU Affero General Public License v3.0
1.36k stars 67 forks source link

Add rate-limiting to Senders #107

Closed wmnnd closed 2 years ago

wmnnd commented 2 years ago

Oban’s FLOSS version doesn’t allow to configure rate-limiting for queues, but it’s possible to use the snoozing feature to implement rate-limiting with the help of another module like ex_rated.

Rate-limits should be configurable per sender (or per shared sender) and be configured as n messages per unit of time (e.g. second, minute, hour, day).

It’s worth considering whether it should be possible to configure more than one rate limit (e.g. 100 messages per minute and 1000 messages per hour)

panoramix360 commented 2 years ago

Me and @gbottari are thinking on work on this issue, is it okay for you?

Thank you!

wmnnd commented 2 years ago

@panoramix360 That would be great! Should I write the specs here in this issue so you can work with that or do you want to start by writing the full specs yourself?

panoramix360 commented 2 years ago

If you can share with us it'll be nice :D I think that will help

wmnnd commented 2 years ago

Do you already have an idea how you want to implement the configuration of the rate-limits? Because a single sender might be subject to multiple limits (e.g. max 10 messages/second and max 100 messages/hour - or something like that).

gbottari commented 2 years ago

I looked into ExRated. It seems like you can simply create a bucket for each sender using different message rates. However, controlling Oban's rate seems more involved because we need to do it in a way that won't trigger retries and timeouts. Do you have any thoughts on that?

wmnnd commented 2 years ago

Yes, I think using ex_rated is a good idea. We should probably add something like this to the Oban worker (just nicer, obviously :laughing:):

with {:ok, _} <- ExRated.check_rate("#{sender.id}_second", 1000, sender.rate_limits["second"]),
     {:ok, _} <- ExRated.check_rate("#{sender.id}_minute", 60_000, sender.rate_limits["minute"])
       # …
      do
         # send email
else
  _ -> {:snooze, seconds}
end

I suppose we could ideally also calculate for how many seconds the job needs to be snoozed.

panoramix360 commented 2 years ago

@wmnnd this week I'll try to contribute to this issue!

Sorry for my absence in updating here :(

I'll try to tackle this one.

panoramix360 commented 2 years ago

Hi @wmnn :)

Me and @gbottari added a simple implementation using ExRated.

For now, we are only considering minutes and created a new property in the sender called field :rate_limit_minutes, :integer.

I'm currently thinking of creating another one to represent seconds and let the user configure this on creating/updating a sender

We created a check_rate method on Sender as well:

def check_rate(struct) do
  ExRated.check_rate("sender-bucket-per-minute-#{struct.id}", 60_000, struct.rate_limit_minutes)
end

This function is called by Oban Worker when it will deliver the emails, and then limits the requests based on the result and return {:snooze, number}. Currently, the number used is proportional to the number of recipients to send the email like this:

{:snooze, 1 * n_recipients}

If we try to create a configuration for each unit of time, maybe this PR will be too long for this.

What do you think?

wmnnd commented 2 years ago

Hey @panoramix360, sorry for the late reply, that sounds great! Mabye for added clarity, it might be nice to call the field rate_limit_per_minute.

Adding additional fields for the rate limit per second and per hour will probably cover almost all cases :-) What do you think?

panoramix360 commented 2 years ago

Yeah, that would be good :D

I'll develop this and make a PR \o

panoramix360 commented 2 years ago

Hi @wmnnd!

I created these new fields on Sender Form, can you see if that's what you imagined it? Screenshot from 2022-06-07 13-03-55

I created the migrations and fields on Ecto. I put the default values like these: Screenshot from 2022-06-07 13-05-14

Now I only need to update check_rate method to comply with these new fields.

Hope to hear from you so I can create the PR :D

panoramix360 commented 2 years ago

After doing some tests, I think it's obvious that 1 request per second is not something real.

Which default values do you think that we could put?

Maybe something like that: rate_limit_per_hour = 10_000 rate_limit_per_minute = 1_000 rate_limit_per_second = 500

wmnnd commented 2 years ago

@panoramix360 Do you think we should have a default rate limit at the database level? I think my personal tendency would be to not set it at the DB level and add it at the UI or application level. But I don’t think it’s necessary to have a default at all for the first version.

panoramix360 commented 2 years ago

@wmnnd Yeah, I agree with you.

I'll remove the default limit from the database and from the UI :)

But if the user doesn't fill the rate limit field then it will have no limit, correct?

I'll update the code and overall I think it's ready for a PR.

wmnnd commented 2 years ago

@panoramix360 Awesome! Yes, I think we’ll be good without a default limit :-)

panoramix360 commented 2 years ago

@wmnnd Created PR #137 for this issue, waiting for your consideration :)

wmnnd commented 2 years ago

Implemented in #137.