console-rs / indicatif

A command line progress reporting library for Rust
MIT License

Switch Estimator to use an exponential weighting #539

Closed · afontenot closed this 1 year ago

afontenot commented 1 year ago

This is an implementation of an exponentially weighted running average with a tuning parameter (currently a constant in this implementation).

This implementation does double smoothing by applying the running average to itself. The result avoids undesirable instantaneous movements in the estimate when large updates occur.

The exponential estimator works by keeping a running tally, where an existing tally that has aged `age` seconds is reweighted, for a weighting parameter `ewa` given in seconds, such that

weight ^ (ewa / age) = 0.1, i.e. weight = 0.1 ^ (age / ewa)

For instance, data aged 5 seconds with a 15 second weight parameter would receive weight = 0.1 ^ (5/15) = 0.464. If it then ages another 10 seconds, it would receive weight = 0.1 ^ (10/15) = 0.215. After being multiplied by both weights, it would have a weight of 0.464 * 0.215 = 0.1, as expected.
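To make the mechanism concrete, here is a minimal sketch of the idea in Rust (illustrative only, not the actual code in this PR; the `Tally` type, the hard-coded 15 second parameter, and treating updates as pre-computed rate samples are all simplifications):

```rust
/// Tuning parameter: a contribution aged `EWA_SECONDS` seconds is
/// down-weighted to 0.1 of its original value.
const EWA_SECONDS: f64 = 15.0;

struct Tally {
    value: f64,  // exponentially weighted sum of rate samples
    weight: f64, // total weight currently in the tally (for debiasing)
}

impl Tally {
    fn new() -> Self {
        Tally { value: 0.0, weight: 0.0 }
    }

    /// Age the existing tally by `age` seconds, then mix in a new sample.
    fn update(&mut self, sample: f64, age: f64) {
        // decay = 0.1 ^ (age / ewa), so decay ^ (ewa / age) = 0.1
        let decay = 0.1_f64.powf(age / EWA_SECONDS);
        self.value = self.value * decay + sample * (1.0 - decay);
        self.weight = self.weight * decay + (1.0 - decay);
    }

    /// Debiased estimate: divide out the weight actually accumulated,
    /// since the weights only sum to 1 over an infinite history.
    fn estimate(&self) -> f64 {
        self.value / self.weight
    }
}

fn main() {
    let mut tally = Tally::new();
    tally.update(10.0, 5.0); // 10 steps/s observed, 5 s since last update
    tally.update(2.0, 5.0);
    println!("debiased rate estimate: {:.3}", tally.estimate());
}
```

Double smoothing then amounts to feeding this tally's output through a second identical tally.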

A couple of basic features are also implemented for higher quality estimates.

afontenot commented 1 year ago

Edit: all the issues in this comment have now been fixed. Leaving for posterity.

For the sake of having a draft PR to discuss and test, I put the now arguments back in the Estimator functions so that it would be trivial for me to fix / add tests without having to worry about time.

A few notes on things that might need changing before this is merged:

  1. I added a few functions to the Estimator to allow configuring it. You can change the weighting and also disable / enable the double smoothing (using an exponentially weighted average of an exponentially weighted average). These aren't exposed in any way currently, and I wasn't sure what the right way to do that might be.

  2. The tests need a look over to make sure everything important is covered. I kept a sanity check, added a correctness check for both single and double EWA, added a test to make sure resetting the ETA behaves as expected, and reworked one test that makes sure the estimator behaves correctly when the position is rewound.

  3. It's not clear to me exactly what issue https://github.com/console-rs/indicatif/issues/480 is talking about. I think the behavior I have in this area preserves or improves on the status quo, but I'm not totally sure. Basically, I think it makes sense to force an estimator reset when the position is rewound - but you don't want to set the previous position to zero, because that would result in a huge event rate if the true position isn't zero. Instead, I only reset the estimate and trust that the update function has been called with the correct value. (See the sketch after this list.)

  4. I didn't come up with an appropriate debug fmt for the Estimator after making my changes.
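For point 3, here is a hypothetical sketch of the rewind behavior described above (the field and method names are placeholders, not indicatif's actual ones):

```rust
use std::time::Instant;

struct Estimator {
    prev_pos: u64,
    smoothed: f64, // exponentially weighted rate estimate
}

impl Estimator {
    fn reset(&mut self, _now: Instant) {
        self.smoothed = 0.0;
    }

    fn record(&mut self, new_pos: u64, now: Instant) {
        if new_pos < self.prev_pos {
            // Rewind detected: drop the stale estimate, but take the
            // caller's position as the new baseline. Zeroing the stored
            // position instead would make the next delta (and thus the
            // event rate) spuriously huge whenever the true position
            // isn't zero.
            self.reset(now);
            self.prev_pos = new_pos;
            return;
        }
        // ... normal exponentially weighted update elided ...
        self.prev_pos = new_pos;
    }
}
```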

djc commented 1 year ago

This is looking great, thanks for working on it!

> For the sake of having a draft PR to discuss and test, I put the now arguments back in the Estimator functions so that it would be trivial for me to fix / add tests without having to worry about time.

Nice. The reason these are here is (a) for performance, calling Instant::now() means doing a syscall which can be somewhat expensive relative to the other stuff we do, and (b) for consistency, we usually want to use a consistent timestamp for all of a given invocation of the public API. If we want to deviate from these principles for some reason, there should be a clear comment explaining why the deviation is necessary/useful.
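As a sketch of that pattern (hypothetical names, not the crate's actual API):

```rust
use std::time::Instant;

struct Estimator { /* fields elided */ }

impl Estimator {
    // Internal code takes `now` as a parameter rather than calling
    // Instant::now() itself, so every update within one public call
    // sees the same timestamp and no extra syscalls are made.
    fn record(&mut self, _new_pos: u64, _now: Instant) {
        // ... update the running average against `_now` ...
    }
}

struct Progress {
    estimator: Estimator,
    pos: u64,
}

impl Progress {
    fn set_position(&mut self, pos: u64) {
        let now = Instant::now(); // one syscall per public API call
        self.pos = pos;
        self.estimator.record(pos, now);
    }
}
```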

> A few notes on things that might need changing before this is merged:

> 1. I added a few functions to the Estimator to allow configuring it. You can change the weighting and also disable / enable the double smoothing (using an exponentially weighted average of an exponentially weighted average). These aren't exposed in any way currently, and I wasn't sure what the right way to do that might be.

I've addressed this in my review feedback.

> 2. The tests need a look over to make sure everything important is covered. I kept a sanity check, added a correctness check for both single and double EWA, added a test to make sure resetting the ETA behaves as expected, and reworked one test that makes sure the estimator behaves correctly when the position is rewound.

> 3. It's not clear to me exactly what issue #480 (ProgressBarIter and finding stream length via seeking breaks rate estimation) is talking about. I think the behavior I have in this area preserves or improves on the status quo, but I'm not totally sure. Basically, I think it makes sense to force an estimator reset when the position is rewound - but you don't want to set the previous position to zero, because that would result in a huge event rate if the true position isn't zero. Instead, I only reset the estimate and trust that the update function has been called with the correct value.

Right, I think this makes sense.

> 4. I didn't come up with an appropriate debug fmt for the Estimator after making my changes.

Maybe we can go back to just deriving Debug? I don't think we deeply care about the Debug values, though likely we'd want something that makes your life (as a contributor) or my life (as a maintainer) easier when debugging the Estimator.

afontenot commented 1 year ago

Most outstanding issues resolved. Heads up that I will be force pushing in a moment to split out the seconds_per_step() to steps_per_second() change as requested.

djc commented 1 year ago

@chris-laplante have opinions on this stuff? I've felt like #394 has been the biggest open issue and this feels like a good step forward.

afontenot commented 1 year ago

@djc I've fixed all the issues I had with the code, so I've marked it ready for review. Happy to make additional changes you want to see. Wasn't totally sure how you wanted the comments on the weight function formatted - I made them doc comments but code comments would maybe make more sense, as these are implementation details.

The remaining nit I'd like to poke at is trying to extract the debiasing from the update function so it doesn't have to be done every update. I think it should be possible to do it only once, when generating an estimate (even for the double EWA), but it's mathematically more complicated. This could always wait for a future PR.
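One way to see why deferring the division should work for the single EWA (sketching my reasoning here, not something already in the PR): if the $i$-th update applies a decay factor $d_i$ to the existing tally and mixes in the new sample with weight $1 - d_i$, the total weight in the tally telescopes to

$$
w_n = 1 - \prod_{i=1}^{n} d_i
$$

so the update loop only has to carry the running product of the decay factors, and the division by $w_n$ can happen once, when an estimate is requested. The double EWA is where it gets mathematically messier.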

chris-laplante commented 1 year ago

> @chris-laplante have opinions on this stuff? I've felt like #394 has been the biggest open issue and this feels like a good step forward.

I've been slammed at work, but I will try to take some time later this week to go over this in detail.

djc commented 1 year ago

@chris-laplante what's the chance that you'll get to it this week? Otherwise let's just go ahead and I'll take care of it and you can follow up later?

afontenot commented 1 year ago

@djc I'd like to replace the self.prev tuple with independent self.prev_steps and self.prev_time for better readability. That okay with you?

djc commented 1 year ago

Yup, go ahead. Please make a separate commit that only does that, though.

afontenot commented 1 year ago

> Yup, go ahead. Please make a separate commit that only does that, though.

I've now done so, and force pushed so that it is grouped with the other refactoring change before my exponential weighting work.

I also added some more substantial commentary on debiasing to the ewa commit, so let me know if you think that reads clearly enough now.

chris-laplante commented 1 year ago

> @chris-laplante what's the chance that you'll get to it this week? Otherwise let's just go ahead and I'll take care of it and you can follow up later?

Sorry, was on vacation last week. I will follow up with it later. Thanks for understanding!

chris-laplante commented 1 year ago

I'm no statistician, but I looked this over and I think it makes sense. Your explanation of "debiasing" also makes sense. I do also wonder if there is a better term than "debias" we could use - I tried searching for it in the context of exponential smoothing and it doesn't seem to come up in the literature much. But I also don't think it's a big deal now that you've added plenty of comments to explain the approach. This definitely seems like a great step forward, so thanks again for your work!

afontenot commented 1 year ago

> I do also wonder if there is a better term than "debias" we could use - I tried searching for it in the context of exponential smoothing and it doesn't seem to come up in the literature much.

I agree that "debias" is not an especially accurate description of what's happening. Strictly speaking, the problem with the naive estimate is that it assumes a non-zero weight for the initial value of the estimator, not that it is "biased" towards that initial value.

Maybe a better way of talking about this would be to call debiased estimates "normalized estimates" instead. After all, if $t_0$ is the current time, with positive time going back into the past, the estimate $E$ is a weighted average of samples for some weighting function $W(t)$:


$$
E = (W(t_0) - W(t_1)) \cdot s_1 + (W(t_1) - W(t_2)) \cdot s_2 + \ldots + (W(t_{f-1}) - W(t_f)) \cdot s_f
$$

where $s_f$ is the first collected sample.

This makes the problem fairly obvious. The sum of the weights in our weighted average isn't 1! As you can see from my plot of the weighting function, the weights in the weighted average will only sum to 1 if the samples continue back into the infinite past.

Since we set the initial value of the estimator to 0, rather than getting a bogus estimate, all the weights for times prior to $t_f$ are canceled out instead. So we have an "underweight" weighted average, and need to normalize it so that the total weight is 1. By simply dividing by the total weight in the estimator, we retain the desired weight ratio between samples of various ages!


$$
E_{norm} = \frac{E}{W(t_0) - W(t_f)}
$$
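To make the normalization concrete (numbers chosen purely for illustration): take a single sample $s_1 = 2$ steps per second, recorded 5 seconds ago, under the exponential weighting $W(t) = 0.1^{t/15}$ from my earlier example. With only one sample, $t_f = t_1$, so

$$
E = (W(t_0) - W(t_1)) \cdot s_1 = (1 - 0.1^{5/15}) \cdot 2 \approx 1.07
$$

$$
E_{norm} = \frac{E}{W(t_0) - W(t_f)} = \frac{(1 - 0.1^{5/15}) \cdot 2}{1 - 0.1^{5/15}} = 2
$$

and the normalized estimate recovers the true rate exactly, where the naive one understated it.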

@chris-laplante @djc Do you like this way of describing the concepts better? I can rewrite my explanatory comment to use the language of normalization rather than calling this debiasing. Or we could just call it a "correction", though that's a little vague.

Also, please let me know if there's anything you'd still like to see, as I think this is otherwise ready to be merged.

Side note on the statistical literature and why I initially chose the "debiasing" language:
I developed the approach here independently; a quick skim of the literature suggests that this is usually discussed as an "initialization" problem. The assumption is that the estimator *has* to take some initial value, and so you figure out a method for doing so. Either you just use the first sample, or the sample mean, or an MSE-minimizing value, and so on. Trying to set a reasonable initial value like this probably makes sense when doing a post-hoc analysis on data that *has* an (unknown) value before `t=0`, but makes less sense when producing a live estimate where there *is* no value before `t=0` (because that point marks the start of `step` production).

After developing this method, I went looking for anyone who had taken a similar approach to the problem, and found this one StackOverflow [answer](https://stackoverflow.com/a/23617678) which handles the initialization issue in the same way. I borrowed the "debiasing" language from that answer, as I hadn't come up with a better way to describe it.

afontenot commented 1 year ago

I've pushed a new commit that does as I suggested in the comment above, and hopefully includes enough commentary to clearly justify the behavior. I kept it as a second commit for now so you can see the change, but I'll squash it into the previous commit and force push if you want to clean up the history before merging.

afontenot commented 1 year ago

> Looks great! If you squash these last two commits, I'll be happy to merge this.
>
> Would you mind adding a commit that bumps the version to 0.17.5?

Done and done. Needed to rebase because the PR was based on 0.17.3.

djc commented 1 year ago

Thanks for all your work on this! 0.17.5 has made it out to crates.io.