Closed rlee287 closed 1 year ago
I think this is a duplicate of the discussion in #394. It would be great if someone can investigate this in more detail and propose a fix (potentially to revert some Estimator changes to the 0.16 state).
After some investigation, I've managed to track down the root cause of my issue. While #394 is a discussion of computing ETA, I do not believe it is closely related as that issue does not discuss problems in rate estimation that arise with seeking.
I've linked to the relevant lines of code in the Estimator struct below for reference:
A typical approach to finding the length of a Seek object is through a function like the one below (copied from my MCVE):
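    use std::io::{Seek, SeekFrom};

    pub fn seek_len(seekable: &mut dyn Seek) -> u64 {
        // Remember the current position so it can be restored afterwards.
        let old_pos = seekable.stream_position().unwrap();
        // Seeking to the end returns the total length.
        let len = seekable.seek(SeekFrom::End(0)).unwrap();
        if old_pos != len {
            seekable.seek(SeekFrom::Start(old_pos)).unwrap();
        }
        len
    }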
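This involves 2-3 seek operations:
1. A seek to the current position to read old_pos (this is how stream_position() is actually implemented)
2. A seek to the end to obtain the length
3. A seek back to old_pos, if it differs from the length

Unfortunately, this seek pattern interacts badly with the rate estimator: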
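1. The seek to the end produces a large delta, and this large seek-to-the-end step is recorded in the estimator, generating an absurdly large rate. prev now contains (seek_len, time_of_seek_to_end).
2. The seek back to the old position is not recorded, because of the saturating_sub operation.
3. Subsequent steps are not recorded either, as long as prev is at the very end of the object whose progress is being tracked.

I can think of two possible solutions: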
1. Do not record a step in the Estimator if the value received is equal to the length of the underlying object. This does not resolve a similar bug that would arise if users did a massive seek forwards and then back that did not line up with the end of the underlying object. This is also not possible for iterators that do not have a predetermined length. Having different behavior depending on whether the underlying object is Seek or not might also require some form of specialization.
2. Reset the Estimator whenever a backwards seek/step is detected.

I am leaning towards the second solution, but I would like to hear other people's opinions before submitting a PR for this.
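To make the failure mode and the second proposal concrete, here is a minimal sketch using a toy estimator. ToyEstimator, its fields, the reset_on_backwards flag, and simulate are all hypothetical names made up for illustration; this is not the crate's actual Estimator, and timestamps are plain f64 seconds so the run is deterministic. It assumes a 1000-byte object read at roughly 100 steps per second.

```rust
/// Toy stand-in for a rate estimator (hypothetical, NOT the real `Estimator`).
/// It keeps the previous (position, time) pair and the latest rate estimate.
struct ToyEstimator {
    prev: (u64, f64),         // (position, time in seconds)
    rate: f64,                // latest steps-per-second estimate
    reset_on_backwards: bool, // enables the second proposal
}

impl ToyEstimator {
    fn new(reset_on_backwards: bool) -> Self {
        Self { prev: (0, 0.0), rate: 0.0, reset_on_backwards }
    }

    fn record(&mut self, pos: u64, now: f64) {
        if self.reset_on_backwards && pos < self.prev.0 {
            // Second proposal: a backwards step discards the history.
            self.prev = (pos, now);
            self.rate = 0.0;
            return;
        }
        // A backwards or repeated position yields a delta of zero here.
        let delta = pos.saturating_sub(self.prev.0);
        if delta == 0 {
            return; // nothing is recorded and `prev` stays where it was
        }
        let elapsed = now - self.prev.1;
        if elapsed > 0.0 {
            self.rate = delta as f64 / elapsed;
        }
        self.prev = (pos, now);
    }
}

/// Replays the seek pattern of a length query, then 500 steps at 100 steps/s.
fn simulate(reset_on_backwards: bool) -> f64 {
    let mut est = ToyEstimator::new(reset_on_backwards);
    est.record(1000, 0.001); // seek to the end to learn the length
    est.record(0, 0.002);    // seek back to the start
    for i in 1..=500u64 {
        est.record(i, 0.002 + 0.01 * i as f64);
    }
    est.rate
}
```

In this sketch, simulate(false) stays at the inflated rate produced by the seek to the end, because every later position is below prev and is dropped. simulate(true) settles near 100 steps per second, since the backwards seek clears the stale prev.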
Yeah, (2) definitely seems cleaner than (1). Of course it's still kind of a crappy heuristic, but I guess it's unlikely to do much harm.
MCVE:
Expected behavior: rate hovers slightly below 100/s
Actual behavior: rate is stuck at an absurdly high constant while the progress bar counts upwards, and drops to slightly below 100/s only when the progress bar completes
This behavior could occur when a ProgressBarIter is passed to a generic function taking a Read + Seek object.

This is a regression from v0.16.
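As a rough illustration of what a wrapper sees in that situation, the sketch below logs every position returned by seek while a generic Read + Seek consumer queries the stream length. SeekLog, stream_len, and observed_positions are hypothetical names introduced for this example, and the wrapper only loosely imitates a progress-tracking adapter; it uses the standard library only.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

/// Hypothetical wrapper that records each position reported by `seek`.
struct SeekLog<T> {
    inner: T,
    positions: Vec<u64>,
}

impl<T: Read> Read for SeekLog<T> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.inner.read(buf)
    }
}

impl<T: Seek> Seek for SeekLog<T> {
    fn seek(&mut self, pos: SeekFrom) -> io::Result<u64> {
        let new_pos = self.inner.seek(pos)?;
        self.positions.push(new_pos); // a progress wrapper would update here
        Ok(new_pos)
    }
}

/// Generic consumer that learns the stream length via seeks.
fn stream_len<S: Read + Seek>(src: &mut S) -> io::Result<u64> {
    // The default `stream_position` is itself a seek by zero from the
    // current position, so the wrapper observes it as well.
    let old_pos = src.stream_position()?;
    let len = src.seek(SeekFrom::End(0))?;
    if old_pos != len {
        src.seek(SeekFrom::Start(old_pos))?;
    }
    Ok(len)
}

/// Runs the length query on a 1000-byte in-memory stream and returns
/// the positions the wrapper observed.
fn observed_positions() -> io::Result<Vec<u64>> {
    let mut src = SeekLog {
        inner: Cursor::new(vec![0u8; 1000]),
        positions: Vec::new(),
    };
    stream_len(&mut src)?;
    Ok(src.positions)
}
```

For a fresh 1000-byte stream, the wrapper sees the positions 0, 1000, 0: the jump to the end followed by the jump back is exactly the pattern that confuses a rate estimator fed from these positions.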