phatboyg / GreenPipes

An asynchronous pipe implementation for the Task Parallel Library
Apache License 2.0
151 stars 55 forks source link

ExponentialRetryPolicy materializes GetIntervals() #31

Closed Timovzl closed 3 years ago

Timovzl commented 3 years ago

The constructor of ExponentialRetryPolicy materializes the IEnumerable<TimeSpan> returned from GetIntervals(). This seems directly opposed to its nature:

When using retryLimit: Int32.MaxValue, which GetIntervals() explicitly special-cases, the loop is infinite (until ToArray() tries to allocate an array that is too large).

Even when avoiding Int32.MaxValue, when trying to accommodate a very large number of attempts, the materialized array can take a lot of memory, when the intervals could just be calculated on-the-fly. Materialization also incurs a lot of calls to Random.Next() that would otherwise have been amortized over the lifetime of the policy instance.

A good alternative property type to TimeSpan[] might be IEnumerator<TimeSpan>, since it enforces forward-only behavior. It is a public property, unfortunately, so I'm not sure how far the impact reaches.

I ran into this issue when I was trying to use exponential message retries in MassTransit 7.0.7.

Timovzl commented 3 years ago

As a side-note, there also seems to be the potential for a bug in the _retryLimit == int.MaxValue special case in GetIntervals():

Random random = new Random();
for (int i = 0; _retryLimit == int.MaxValue || i < _retryLimit; i++)
{
    int num = (int)Math.Min((double)_minInterval + Math.Pow(2.0, i) * (double)random.Next(_lowInterval, _highInterval), _maxInterval);
    yield return TimeSpan.FromMilliseconds(num);
}

When i wraps around, i will be negative, resulting in negative time spans, if I'm not mistaken. Next, i will be small again (0, 1, 2, ...), resulting in shorter time spans than were returned before, when i was large.

Edit: I just realized unchecked is not the default behavior, so instead of the wrap-around behavior I describe, an exception would be thrown. Different result, same point though.

_retryLimit == int.MaxValue seems to indicate the intent of supporting infinite retries (for which I salute you). The use of a ulong as the loop variable might help to avoid the wrap-around troubles.

phatboyg commented 3 years ago

Yeah, it's a sharp tool. Specifying a retry limit that high is, well, not recommended.

phatboyg commented 3 years ago

FYI, I finally got around to cleaning this up a bit.

Timovzl commented 3 years ago

Is it intentional that exceeding _maxInterval short-circuits the loop on line 68?

I'm pondering what my expectations would be. It's fairly easy to specify a maxInterval that is reached more quickly than a given retryLimit, due the the former's exponential nature.

If I specify maxInterval=TimeSpan.FromSeconds(60) and retryLimit=100, then I would honestly expect 100 retries if necessary. After all, to me, the retryLimit is something I am taking clear control over, whereas the maxInterval is just a maximum, where I expect things to scale in between in some exponential fashion.

Currently, to achieve the exact retryLimit I am specifying, I have to (A) know the scaling formula and (B) predict the result of the random generator.

I would expect one of two scenarios:

Since the second option gets rather convoluted with being exponential, I would actually expect the first.

Might that make more sense than specifying a maxInterval and retryLimit where only one of them is going to end up being the determining factor?

phatboyg commented 3 years ago

Did you miss something? It still retries up to the RetryLimit. Once the delta reaches the maximum interval, it stops calculating and just uses that max interval for each subsequent retry.

Timovzl commented 3 years ago

I did miss something! I'm guessing this is what causes the behavior you describe:

delta = (int)Math.Min(_minInterval + Math.Pow(2, i) * random.Next(_lowInterval, _highInterval), _maxInterval);

So the delta never exceeds _maxInterval.

My confusion stemmed from the loop containing that line:

for (var i = 0; i < RetryLimit && delta < _maxInterval; i++)

It has a termination condition once delta reaches _maxInterval. I'm wondering if that condition can be hit at all.

phatboyg commented 3 years ago

It does, and the unit tests verify it.

Timovzl commented 3 years ago

I stand corrected.

phatboyg commented 3 years ago

No worries, appreciate the concern to recheck the logic!