ejfinneran / ratelimit

A Redis-backed rate limiter written in Ruby
MIT License

Implement threadsafe version of exec_within_threshold #35

Open jamespeerless opened 5 years ago

jamespeerless commented 5 years ago

When exec_within_threshold is used from multiple threads or processes, several of them can read the count for a given subject at the same time. If that count is just below the threshold, every one of them is allowed to execute before any of them performs the add that would push the count over the threshold.

For example, suppose you have three threads and a ratelimit with a threshold of 20 in 600 seconds, and the current count is 19. All three threads read the count of 19, the exceeded? check used in exec_within_threshold evaluates to false for each of them, and all three execute the block.
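A minimal in-memory sketch of that interleaving, with no Redis involved. `NaiveCounter` and `simulate_race` are hypothetical names used only for illustration, and the two queues exist purely to force every thread to finish its read before any thread writes, which is the unlucky ordering described above:

```ruby
# In-memory stand-in for the Redis-backed count (hypothetical, not the gem's code).
class NaiveCounter
  def initialize(count)
    @count = count
    @mutex = Mutex.new
  end

  def count
    @mutex.synchronize { @count }
  end

  def add
    @mutex.synchronize { @count += 1 }
  end
end

# Runs thread_count threads that each check the count, then act on the stale result.
def simulate_race(start_count:, threshold:, thread_count:)
  counter = NaiveCounter.new(start_count)
  checked = Queue.new     # a thread pushes here once it has read the count
  proceed = Queue.new     # the main thread releases everyone after all reads
  executions = Queue.new  # one entry per thread that "ran the block"

  threads = Array.new(thread_count) do
    Thread.new do
      exceeded = counter.count >= threshold  # check
      checked << true
      proceed.pop                            # wait until every thread has checked
      next if exceeded
      executions << true                     # stands in for executing the block
      counter.add                            # act, based on a stale read
    end
  end

  thread_count.times { checked.pop }
  thread_count.times { proceed << true }
  threads.each(&:join)

  { executions: executions.size, final_count: counter.count }
end

result = simulate_race(start_count: 19, threshold: 20, thread_count: 3)
puts result.inspect
```

Each read and each add is individually safe here; the problem is only that the check and the add are separate steps.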

The fix requires reading and incrementing the count atomically. There is no way to do this natively with Redis out of the box, so I implemented a short Lua script that performs the count, check, and increment in one step. With this script we can implement a thread-safe version of exec_within_threshold that also automatically increments the subject.
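As a rough illustration of the idea (not the script from this pull request), here is a simplified sketch that keeps a single counter per subject with a fixed expiry, instead of the gem's own storage layout. The script text, the `sketch:` key prefix, and the `try_acquire` method name are all assumptions made for the example. It relies on Redis running a Lua script as one atomic unit, so no other client can slip in between the read and the increment:

```ruby
# Simplified fixed-window check-and-increment; NOT the gem's implementation.
# KEYS[1] = counter key, ARGV[1] = threshold, ARGV[2] = interval in seconds.
CHECK_AND_INCREMENT = <<~LUA
  local current = tonumber(redis.call('GET', KEYS[1]) or '0')
  if current >= tonumber(ARGV[1]) then
    return 0
  end
  redis.call('INCR', KEYS[1])
  if current == 0 then
    redis.call('EXPIRE', KEYS[1], tonumber(ARGV[2]))
  end
  return 1
LUA

# `redis` is any client that responds to eval(script, keys:, argv:),
# for example a redis-rb connection. Returns true when the caller may proceed.
def try_acquire(redis, subject, threshold:, interval:)
  reply = redis.eval(
    CHECK_AND_INCREMENT,
    keys: ["sketch:#{subject}"],  # hypothetical key naming
    argv: [threshold, interval]
  )
  reply == 1
end

# Usage, assuming a reachable Redis server and the redis gem:
#   require 'redis'
#   allowed = try_acquire(Redis.new, 'user-42', threshold: 20, interval: 600)
```

Because the increment happens inside the script, the caller no longer needs a separate add after the check.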

I wrote some tests, in a project where we use the ratelimit gem, to verify the issue and to confirm that this change fixes it.

The test that exposes the issue, where exec_within_threshold allows more than the threshold number of executions in an interval, is here:

rl_key = SecureRandom.uuid
rl_subject = SecureRandom.uuid
ratelimit = Ratelimit.new(rl_key, { redis: $redis })

status_map = { stop: 0, sleep: 0, run: 0, finished: 0 }

# Seven threads race on the same subject with a threshold of 1.
threads = Array.new(7) do
  Thread.new do
    ratelimit.exec_within_threshold(rl_subject, { interval: 10, threshold: 1 }) do
      sleep 0.5
      ratelimit.add(rl_subject)
    end
  end
end

sleep 1

threads.each do |t|
  # Thread#status is false once a thread has terminated normally.
  status_symbol = t.status ? t.status.to_sym : :finished
  status_map[status_symbol] += 1
end

# More than one thread finished, so the threshold of 1 was exceeded.
expect(status_map[:finished]).to be > 1

The test with the fixed implementation looks like this:

rl_key = SecureRandom.uuid
rl_subject = SecureRandom.uuid
ratelimit = Ratelimit.new(rl_key, { redis: $redis })

status_map = { stop: 0, sleep: 0, run: 0, finished: 0 }

# Seven threads race on the same subject with a threshold of 1.
threads = Array.new(7) do
  Thread.new { ratelimit.exec_and_increment_within_threshold(rl_subject, { interval: 10, threshold: 1 }) }
end

sleep 1

threads.each do |t|
  # Thread#status is false once a thread has terminated normally.
  status_symbol = t.status ? t.status.to_sym : :finished
  status_map[status_symbol] += 1
end

# Exactly one thread finished; the rest have not.
expect(status_map[:finished]).to eq(1)
coveralls commented 5 years ago

Coverage decreased (-7.6%) to 90.845% when pulling 509fabc5d3a94f24bd67ce95c40526ffc8f195bf on jamespeerless:threadsafe-exec-within-threshold-no-lock into 0e60d3484b4b1f984f806b1c89c601adf92fd544 on ejfinneran:master.

coveralls commented 5 years ago

Coverage decreased (-8.1%) to 90.278% when pulling dc46e5fe60c09dda18e6d531adf2a77e3f23678d on jamespeerless:threadsafe-exec-within-threshold-no-lock into 0e60d3484b4b1f984f806b1c89c601adf92fd544 on ejfinneran:master.

feliperaul commented 3 years ago

@jamespeerless James, have you ever gotten around to implementing the idea of not evaluating that Lua script every time, and instead checking whether it was already evaluated using its SHA?
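One way that idea could look, as a sketch only. `CachedScript` is a hypothetical helper, not part of the gem or of this pull request. It assumes a client that responds to `evalsha` and `eval` with `keys:` and `argv:` (as redis-rb does), and that a missing script surfaces as an error whose message starts with `NOSCRIPT`. Redis identifies a cached script by the SHA1 hex digest of its source, so the digest can be computed locally:

```ruby
require 'digest'

# Calls a Lua script by SHA and falls back to sending the full source
# only when the server reports that it has not cached the script yet.
class CachedScript
  attr_reader :sha

  def initialize(redis, source)
    @redis = redis
    @source = source
    @sha = Digest::SHA1.hexdigest(source)
  end

  def call(keys: [], argv: [])
    @redis.evalsha(@sha, keys: keys, argv: argv)
  rescue StandardError => e
    # Assumption: the client reports an uncached script with a NOSCRIPT message.
    raise unless e.message.start_with?('NOSCRIPT')

    # EVAL runs the script and also caches it, so later EVALSHA calls succeed.
    @redis.eval(@source, keys: keys, argv: argv)
  end
end

# Usage, assuming a reachable Redis server and the redis gem:
#   require 'redis'
#   script = CachedScript.new(Redis.new, "return redis.call('INCR', KEYS[1])")
#   script.call(keys: ['sketch:counter'])
```

The fallback also covers a server restart or SCRIPT FLUSH, after which the cached SHA would no longer be known to the server.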