safwank / ElixirRetry

Simple Elixir macros for linear retry, exponential backoff and wait with composable delays
Other
441 stars 32 forks source link

Unexepected behavior in expire/2 & crash when combining with randomize/2 #33

Closed x4lldux closed 5 years ago

x4lldux commented 5 years ago

There's a bug in randomize/2 and an unexpected behavior in expire/2.

import Retry.DelayStreams

linear_backoff(200, 200)
|> expiry(1_000)
|> randomize()
# simulate a job between retries
|> Stream.map(fn d -> Process.sleep 500; d end)
|> Enum.take(50)

#[

Will crash with: (FunctionClauseError) no function clause matching in :rand.uniform_s/2 because :rand.uniform_s/2 was expecting an integer of at least 1 but got 0, because expire/2 will return a delay of 0 in some cases. Without randomize/2 this will return [200, 400, 0].

Modifying randomize/2 function to forward 0 is a solution, but I think the more important thing here is the unexpected behavior of expire/2 which can return 0 that basically means "no delay - retry immediately!".

Maybe there should be a minimal delay here (instead of 0) https://github.com/safwank/ElixirRetry/blob/1e00f1244555bc2bd6ab6553f877200acc43debe/lib/retry/delay_streams.ex#L196 Or dropping the "one last try" and ending the stream before the time budget expires https://github.com/safwank/ElixirRetry/blob/1e00f1244555bc2bd6ab6553f877200acc43debe/lib/retry/delay_streams.ex#L203-L205

x4lldux commented 5 years ago

@safwank I'm happy to create a pull request, but droping the "one last try" might be a breaking change so I'd rather know first in which way are you leaning.

safwank commented 5 years ago

@X4lldux I'll have a think and get back to you.

safwank commented 5 years ago

@X4lldux Sorry it's taken awhile, I've been really busy with work. Since it's clearly a bug, I'm leaning towards setting a minimum delay of 1ms in expiry/2. I'll push a fix in a jiffy. Thanks for pointing it out!

x4lldux commented 5 years ago

Do you mean that if the remaining time is 0 you will add 1ms? Or if there's less than 1ms of remaining time, the stream will halt ? I would opt into the second one - strong guarantees not to go beyond the expiry limit - and making that configurable for a user:

linear_backoff(200, 200)
|> expiry(1_000, min_delay: 100)
# If there's less than 100ms of remaining time, halt the stream
safwank commented 5 years ago

Actually the issue affects any type of randomization, and that includes randomize/2 and jitter/1. What I've done is create a randomization function that forces a minimum of 1 and use it in both functions. You can see my commits here and here. Let me know what you think.

Making the minimum delay configurable is a good idea, but I'll consider it as a separate enhancement issue/PR.

x4lldux commented 5 years ago

In my counterexample I didn't even use randomize/2, because the main problem described here is not the bug in ranodmize/2 or jitter/2 but the unexpected behavior of expiry/2. I believe that minimum delay should be set in expiry where the problem occurs. Not in randomize & jitter, where the problem is just manifested because of mishandling the inputed delay (randomize & jitter should just return 0 when inputed with 0).

safwank commented 5 years ago

@X4lldux Check out this commit and let me know what you think.

x4lldux commented 5 years ago

@safwank yeah, looks good. it's explicit now, that this kind of situation can happen. I'm closing it 😃