verigak / progress

Easy to use progress bars for Python
ISC License

More accurate progress bar #24

Closed: praiskup closed this 8 years ago

praiskup commented 8 years ago

Follow-up to verigak/progress/pull/23

praiskup commented 8 years ago

The algorithm was not changed by this PR. The problem is that the "old" implementation works with incredibly small floating-point numbers, summing them and dividing, which makes it very inaccurate. [screenshot attached]

The code that uses it: https://git.fedorahosted.org/cgit/copr.git/tree/cli/copr_cli/main.py?id=15116133574defe50#n165
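
Roughly the effect I mean, as an illustrative sketch (not the library's actual code; the 10-item window and the burst pattern are assumptions, based on what I describe below):

from collections import deque
from time import monotonic, sleep

sma = deque(maxlen=10)      # sma_window = 10, counted in calls, not seconds
last = monotonic()

def next_(n=8192):          # hypothetical stand-in for Progress.next()
    global last
    now = monotonic()
    sma.append((now - last) / n)   # seconds per byte for this call
    last = now

for _ in range(3):
    sleep(0.5)              # one big buffered read fills the cache...
    for _ in range(12):     # ...then the callbacks fire back to back
        next_()
    # All 10 samples now come from inside the burst, so they are
    # near-zero floats and the computed rate is absurdly high.
    avg = sum(sma) / len(sma)
    rate = 1 / avg if avg else float('inf')
    print('reported: %.0f B/s (true rate is ~196608 B/s)' % rate)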

praiskup commented 8 years ago

@verigak wrote:

As an aside, doesn't _dt always contain at maximum just one item that way?

The _dt contains at most self.sma_window items.

verigak commented 8 years ago

Can you add a simple test that shows the problem?

verigak commented 8 years ago

The _dt contains at most self.sma_window items.

Can you please verify this? Reading your code, it doesn't look like it.

praiskup commented 8 years ago

Can you please verify this? Reading your code, it doesn't look like it.

Commented in code.

Can you add a simple test that shows the problem?

I'm not sure how to write the test. My theory: the problem causing this behavior is some caching underneath (at least I believe so). It sounds like there is a fairly large chunk of cache (> 80 kB) which gets filled pretty quickly, but the next() callback is called per 8 kB (more than 10 times per chunk). So I believe all the next() callbacks are then called in a batch once the whole 80 kB chunk of cache is filled.

praiskup commented 8 years ago

I mean, since we operate on time(), a test case that would prove this PR fixes the issue is not easy to write (only non-deterministic approaches come to mind).

verigak commented 8 years ago

I'm sorry, but I cannot accept this code. The algorithm is hacky (e.g. what is sma_delta and why is it 0.3?), it diverges from an established algorithm without making a case for why it is better (only anecdotal evidence is given), and there are no test cases to at least demonstrate the problem and show how your solution fixes it.

What's worse, your implementation is not working as expected anyway, which makes me wonder why you are so confident in it. Try adding print(len(self._dt)) at the end of next() and you will see that the length never gets larger than 1.

praiskup commented 8 years ago

That sounds like a bug. Let me check, please.

praiskup commented 8 years ago

There was a bug, so, per the current documentation, the moving average was calculated only over the last 0.3 seconds (one item in _dt). Fixed.

I agree that this deserves better documentation -- done. Please have a look.

praiskup commented 8 years ago

@verigak is it better? If something is unclear, let me know -- I can fix it.

praiskup commented 8 years ago

@verigak ping. I've fixed the bugs you spotted (thanks for that), so is there anything else I can fix or make clearer?

verigak commented 8 years ago

I see what you want to do, and I think there is value in having a time-based window, but your implementation is overcomplicated. You can achieve the same thing with a lot less complexity, e.g.:

ts = time()
self._dt.append(ts)
while self._dt and ts - self._dt[0] > self.time_window:
    self._dt.popleft()

Assume that self.time_window is 3 (seconds). Then to calculate the average:

return (self._dt[-1] - self._dt[0]) / len(self._dt)

praiskup commented 8 years ago

I believe the code you propose is not right. It assumes that next() is always called with the same n; otherwise the average wouldn't be correct.

verigak commented 8 years ago

Right, though it can be fixed using your approach and still be simpler and clearer about what is going on, e.g.:

ts = time()
self._dt.append((ts, n))

# Clear expired items
while self._dt and ts - self._dt[0][0] > self.time_window:
    self._dt.popleft()

and

t0 = self._dt[0][0]
t1 = self._dt[-1][0]
nitems = sum(n for (_, n) in self._dt)
return (t1 - t0) / nitems
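
Put together as a self-contained sketch (illustrative only; the class name and time_window are placeholders from this thread, not the library's API):

from collections import deque
from time import monotonic

class TimeWindowAvg:
    def __init__(self, time_window=3.0):   # window length in seconds
        self.time_window = time_window
        self._dt = deque()                 # (timestamp, n) pairs

    def next(self, n=1):
        ts = monotonic()
        self._dt.append((ts, n))
        # Clear expired items
        while self._dt and ts - self._dt[0][0] > self.time_window:
            self._dt.popleft()

    @property
    def avg(self):                         # average seconds per item
        if len(self._dt) < 2:
            return 0
        t0 = self._dt[0][0]
        t1 = self._dt[-1][0]
        nitems = sum(n for (_, n) in self._dt)
        return (t1 - t0) / nitems
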
praiskup commented 8 years ago

Right, I iterated toward something similar. But this would be way too expensive in both space and time: a very large _dt queue (space: about 50k items for 1 Gbit with next() called for every 8 kB) and quite long iterations through _dt (time).

praiskup commented 8 years ago

The sum(n for (_, n) in self._dt) would be especially expensive, I believe.

praiskup commented 8 years ago

Ping?

verigak commented 8 years ago

Please have a look at the referenced commit; it should solve this issue in a more straightforward way. Let me know what you think.

praiskup commented 8 years ago

The progress bar is now broken, and I can't understand why. Your patch still works with very small numbers (and only ten of them)? I don't know. Please consider commenting your code, and for future PRs, being more inclusive instead of NIH -- I wasted nontrivial time sending you a correct fix. See how the progress bar behaves for the first 5 to 10 seconds: http://praiskup.fedorapeople.org/progress-fix-broken-again.gif

praiskup commented 8 years ago

The difference can be seen here. The correct code is in https://github.com/praiskup/progress (master): http://praiskup.fedorapeople.org/praiskup-progress.gif

verigak commented 8 years ago

I'm sorry you think you wasted your time, but it is not a matter of NIH; I had issues with your PR that were never addressed. Your changes are too intrusive and don't take other users of the library into consideration, just your use case. I have asked from the beginning for a minimal test case that demonstrates the problem, but have only received gifs. And you were so insistent on your solution that it took me three messages just to get you to acknowledge that you had a serious bug.

I would recommend trying to adjust sma_window and time_threshold to see if they improve your case. Otherwise I will have to insist on some code that reproduces the issue.

saxtouri commented 8 years ago

Very entertaining thread, keep it up, guys!

praiskup commented 8 years ago

Ah, the reason I think I wasted my time is that there is now probably even more broken code than before. And I'm the stubborn one here :( Everything from my side was posted with good intentions; please read it again. If you had asked, I would have done the review too.

To give it one more try, let me be constructive: playing with the constants does not help. @verigak, could you please try it on Fedora and see that it is really broken (if you don't believe the gifs are real)?

praiskup commented 8 years ago

OK, I tried to do a review.

At least until the first time_threshold elapses, what is the expected reported avg() value? Shouldn't avg always take _pending into account? I'm afraid that is the reason the implementation now always shows a lower speed than it is actually running at.

@verigak, are you still considering the patches from this PR, so that a rebase makes sense? If something is unclear, or you feel I am going to break something, let's discuss it. If you insist on a test case, I'm fine with giving up ;) it is still just a trivial progress bar; I can maintain my own fork or possibly switch to python-progressbar: https://github.com/niltonvolpato/python-progressbar

verigak commented 8 years ago

It is not that I don't believe you; I just don't have any way to debug it myself. What do you mean by "try Fedora"? Do you have a way for me to reproduce it?

praiskup commented 8 years ago

Yeah, you can try the copr tool yourself; it is reproducible.

verigak commented 8 years ago

OK, I'll give it a try, do you have a copr command I can use?

praiskup commented 8 years ago

Other than that, I think I see the issue now. You compute the average as (sum of averages) / (number of averages). Say the first average is 1 (the seconds it takes one byte to go through the network), the second is 0.5 (2 bytes per second), and the third is 0.01 (100 bytes per second). Each average describes a 0.1 s window! So after 0.3 seconds you get a result of ~0.5 s/B (roughly 2 B/s) ;) but you should have reported something like 34 bytes per second. Keep in mind that a connection is usually slower at the beginning of an upload, until everything converges.
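
To double-check that arithmetic (a throwaway snippet, nothing from the library):

# three 0.1 s windows, each storing average seconds-per-byte
samples = [1.0, 0.5, 0.01]

# arithmetic mean of the averages -- what (sum averages)/(len averages) reports
arithmetic = sum(samples) / len(samples)
print('%.3f s/B -> ~%.1f B/s' % (arithmetic, 1 / arithmetic))   # ~2 B/s

# total time over total bytes (harmonic mean) -- the real throughput
bytes_moved = sum(0.1 / s for s in samples)   # 0.1 + 0.2 + 10 = 10.3 bytes
harmonic = 0.3 / bytes_moved
print('%.4f s/B -> ~%.1f B/s' % (harmonic, 1 / harmonic))       # ~34.3 B/s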

praiskup commented 8 years ago

Your algorithm only starts to work once the queue fills to its limit and the slower windows drop out of it.

The copr command can be installed with dnf install copr-cli, or you can use the https://github.com/fedora-copr/copr/blob/master/cli/copr wrapper directly from git. But this should not be necessary; it is obvious why the average is not accurate -- it used to be like that before, but the 0.1 s time window probably underlined the issue.

praiskup commented 8 years ago

There is an obvious workaround (it can be shortened with some functional equivalent):

-        return sum(self._dt) / len(self._dt) if self._dt else 0
+        # Each item in _dt is an average over a 0.1 s window, so 0.1/i
+        # is the amount of data moved in that window (harmonic mean).
+        total = 0
+        for i in self._dt:
+            total += 0.1 / i
+
+        return (len(self._dt) * 0.1) / total if self._dt else 0

You should also consider the _pending chunk, which is not counted in avg(); the patches I proposed dealt with that. ... But again, this is linear complexity in the length of _dt. I don't think this is better, or more obvious, than what I proposed. Though I agree that, suboptimal as it is, it helps.

verigak commented 8 years ago

You are right, there is a bug in my code; it should be:

`self._dt.append((n + self._pending) / dt)`

And then for the eta:

`return int(ceil(self.avg / self.remaining))`

If we fix this, what do you find suboptimal about this approach?

To walk through your example, let's assume that our unit is 1 byte.

In the first 0.1 sec we get 10 B/s, which means we advance by 1. In the second 0.1 sec we get 20 B/s and we advance by 2. In the third 0.1 sec we get 1000 B/s and we advance by 100.

That means our window will hold [1/0.1, 2/0.1, 100/0.1] = [10, 20, 1000], for an average of ~343 B/s.

Also:

Your algorithm starts to work once the queue begins to be limited and the slower windows are out from the queue.

but again, this is linear complexity with length of _dt

Yes, this is how the Simple Moving Average works: https://en.wikipedia.org/wiki/Moving_average#Simple_moving_average
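
For reference, the SMA of the last k samples p is simply:

\mathrm{SMA}_t = \frac{1}{k} \sum_{i=0}^{k-1} p_{t-i}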

praiskup commented 8 years ago

I'm not sure I understand. Do you have a concrete patch to review? I don't think eta needs to be touched at all.

praiskup commented 8 years ago

The inefficiency is at least in the sum() operation, which is linear (the variable being the overall time window, i.e. the queue length). I proposed a constant-time avg() algorithm and a constant-time next().
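
What I mean by constant time, as a sketch (same shape as the snippets above, not the exact patch from this PR): keep a running total so neither next() nor avg() ever iterates over the queue.

from collections import deque
from time import monotonic

class ConstantTimeAvg:
    def __init__(self, time_window=3.0):
        self.time_window = time_window
        self._dt = deque()        # (timestamp, n) pairs
        self._nitems = 0          # running sum of n over the window

    def next(self, n=1):          # amortized O(1): each item enters and leaves once
        ts = monotonic()
        self._dt.append((ts, n))
        self._nitems += n
        while self._dt and ts - self._dt[0][0] > self.time_window:
            _, expired = self._dt.popleft()
            self._nitems -= expired

    @property
    def avg(self):                # O(1): no sum() over the queue
        if len(self._dt) < 2:
            return 0
        return (self._dt[-1][0] - self._dt[0][0]) / self._nitems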

praiskup commented 8 years ago

Isn't it true that waiting for self.time_threshold before pushing into the queue effectively decreases the resolution of the algorithm to 0.1 s by default? And if the user changes the window, it can be even worse. You should always take _pending into account when computing avg().

praiskup commented 8 years ago

Because the last 0.1 s is the most important time frame for the user.

verigak commented 8 years ago

I'm not sure I understand. Do you have a concrete patch to review?

Please see the latest commit and let me know what you think.

The inefficiency is in the sum() operation at least, which is linear operation (where the variable is overall time window, aka queue length).

So the inefficiency is in adding 10 numbers? N in this case will never be too large, because then you would just be calculating a plain average. Please do not make performance assumptions without concrete numbers.

Isn't it truth that waiting for self.time_threshold before pushing this into queue effectively decreases the resolution of the algorithm to 0.1s by default?

Yes, this came out of your suggestion, but of course you can use a small enough value there if you prefer the older behavior.

You should always take the _pending into account while counting avg().

Why is that?

Because the last 0.1s is the most important time frame for the user.

The moving average algorithm does not differentiate between any frames inside the window. If you believe you have a better algorithm than the SMA, then the burden of proving it is on you. But you cannot just claim you improved a well-established algorithm based on anecdotal evidence alone.

praiskup commented 8 years ago

Sending again, as GitHub totally breaks the formatting of emailed messages. This was already reported half a year ago.

On Monday 07 of March 2016 23:12:29 Georgios Verigakis wrote:

Please see the latest commit and let me know what you think.

Not at this time; I'll look once time permits. But it seems this ends up saying 0 B/s all the time. Yes, this is entertaining -- now it is broken even more. I'm repeating myself: I was here to do the review; there was no need to break the code in git again.

So the inefficiency is in adding 10 numbers? N in this case will never be too large because then you are calculating just average. Please do not make performance assumptions without having concrete numbers.

Consider how many times it is called, and 10 is more than 1. Moreover, this is a configurable parameter (people might choose to use a larger window). Look at my patches: there is a constant-time algorithm (regardless of the window size, it takes the same time) which does not iterate at all (the queue/list is there just to be able to push_back and pop_front).

effectively decreases the resolution of the algorithm to 0.1s by default?

Yes, because your algorithm always waits 0.1 seconds before it pushes the counter into _dt, and avg() is calculated from _dt. Is that clear?

Yes, this came out of your suggestion, but of course you can use a small enough value there if you prefer the older behavior.

I do not want to break users' scenarios, do you? The progress bar should work well by default, too.

You should always take the _pending into account while counting avg().

Why is that?

Again: because at the time avg() is called, a lot more data has already been received, but it is still in _pending, not in _dt.

The moving average algorithm does not differentiate between any frame inside the window. If you believe you have a better algorithm than the SMA, then the burden of proving it is on you. But you can not just say that you improved a well established algorithm just based on anecdotal evidence.

NIH -- the algorithm is trivial. Did you understand the patches I wrote before? Can you tell me what is wrong with them, and how the algorithm fails to match the SMA definition? :)

verigak commented 8 years ago

I think the best way forward is to use your own fork. Thanks for the feedback.

praiskup commented 8 years ago

Sounds fair; done. For the record, I have not changed the code yet, so anybody is welcome to contribute and discuss the issue: https://github.com/python-progress/python-progress/pull/1