piotrmurach / tty-progressbar

Display a single or multiple progress bars in the terminal.
https://ttytoolkit.org
MIT License
421 stars 24 forks source link

ProgressBar#iterate vs LazyEnumerator #23

Closed zverok closed 6 years ago

zverok commented 6 years ago

I am not sure whether it is a frequent use case, but I've thought that iterate is exact match for it... But it isn't.

Imagine this:

sleepy = Enumerator.new { |y| 
  loop do
    do_something
    sleep(10)
  end
}
# Now, each call of sleepy.next does something, then sleeps. And it is forever!
# The reasonable usage is like:
partial_result = sleepy.take(10)

# Now, I expected that this is "natural":
partial_result = TTY::ProgressBar.new('[:bar:]').iterate(sleepy).to_a

...but, as iterate tries to get collection's count, everything goes sideways here. Probably, having an optional total_count argument could be useful?

piotrmurach commented 6 years ago

Hey Victor,

Thanks for using this gem!

In all honesty, this is the first time anyone needed this kind of functionality. The iterator accepts a collection and internally creates an Enumerator based on collection size. I'm kind of struggling to think of a need for infinite enumeration but I'm not against the idea. I guess it could take into account the total provided at the initialization stage or as a keyword argument:

bar = TTY::ProgressBar.new('[:bar]', total: 30)
bar.iterate(sleepy)

Do you want to submit PR to discuss the idea further? Ideally it would work with the current iterator interface rather than add new api method call.

zverok commented 6 years ago

I'm kind of struggling to think of a need for infinite enumeration but I'm not against the idea.

The essence of my use-case (I invented it recently and still not sure it is a good thing... but I like it at the moment) is not infinity, but rather "each step takes long" processes. Here is probably more reasonable example:

processing = TEN_THOUSAND_FILENAMES.lazy
  .map { |very_large_file| CSV.read(very_large_file) }
  .select { ... }
  .map { |data| some_really_complicated_processing(data) }
  .map { |row| database.store(row) }

TTY::ProgressBar
  .new('[:bar]', total: TEN_THOUSAND_FILENAMES.count)
  .iterate(processing).to_a

How is this better than bar.iterate(filenames) { |name| do all the processing }? I believe because of concern separation: some logical processor knows how to process (but doesn't give a damn about progress-bar or any other UI, but it is progress-bar friendly by returning "processing enumerator", but the solution is generic and can be used in a lot of other ways), and UI/CLI doesn't need to know a thing about how the processing is implemented, just knows that "it is enumerable, and each iteration = step of a progress-bar". When I thought about it, it all clicked well in my head, and I even thought that #iterate is exactly to support this case :)

I'll provide a PR on the next week (leaving to a short vacation now).

piotrmurach commented 6 years ago

I'm convinced by your example and reasoning.

Enjoy your break and looking forward to your PR!

piotrmurach commented 6 years ago

I have found couple issues with iterator, one was the fact that infinite lazy enumerator would never finish even if the progression was done. That's to me a weakness, since most users should be saved from such scenarios. I've also fixed the fact that total calculation didn't take into account the progress value. I've added example to demonstrate the changes. Looking for your feedback!

zverok commented 6 years ago

Looking at it.

zverok commented 6 years ago

TBH, I feel very uneasy about this change: https://github.com/piotrmurach/tty-progressbar/commit/54c2498bf0387cdb8b061559eec50a0161306a36#diff-54ab81a7c5c641c09a339c3765cd1855R179

For me, it mixes "data" and "representation" layers of the logic. If progress bar provides support for arbitrary iterators (that was my initial goal), it should definitely NOT decide where to stop the iteration. One kinda-real-world example (in pseudocode):

headers = HTTPClient.head(url)
chunks = headers['Content-Size'].to_f / CHUNK_SIZE
current_chunk = 0
downloader = Enumerator.new do |y|
  loop do
    res = HTTPClient.get(url, offset: chunk*CHUNK_SIZE, limit: CHUNK_SIZE)
    yield(res)
    raise StopIteration if res.eof?
    chunk += 1
  end
end
data = ProgressBar.new(total: chunks).iterate(downloader).to_a

The idea here is: Content-Size header (or other means of knowing "how much should we download") can be wrong, and we may need MORE iterations than expected. If at this point progress-bar just ceases updating (or shows 110%) it is OK-ish, but if it stops the algorithm by its own will because it has decided "the progress is complete", it is a big no-no for me. It is just a representation level, I don't want it to do anything with my data!

If I passed the "infinite" enumerator and tried to just map it to something, it is MY problem, and it is exactly the same as without ProgressBar, so if I am using infinite enumerators I probably have thought about it already.

In fact, thinking a bit more about it, the "true Ruby Enumerators-spirit" progress bar probably should not "know" about data at all (even in "pass any Enumerable to #iterate" way). I imagine this:

downloader.zip(ProgressBar.new(total: chunks)).map { |data, bar| data }

E.g. ProgressBar#each is just implemented and that is it. It looks now pretty consistent to me, and we even can do something like...

downloader.zip(ProgressBar.new(total: chunks)).map { |data, bar| 
  bar.update(title: 'Still downloading, progress unpredicatable...') if bar.complete?
  data
}
piotrmurach commented 6 years ago

Thanks for your feedback, appreciate it! My preference is always to discuss code and implementation and hence I went ahead and done it. However, your argumentation convinced me to revert back to allow for infinite enumerators.

I'm happy with the way #iterator method works. Would you have time to add to the readme how this new implementation can be used with lazy enumerators since this was your idea to expand it in the first place?

zverok commented 6 years ago

Would you have time to add to the readme how this new implementation can be used with lazy enumerators since this was your idea to expand it in the first place?

Yeah thanks will do.