Open ghost opened 5 years ago
Unfortunately, as far as I am aware, the progressbar can't detect these exceptions since they don't happen in code managed by the progressbar.
An alternative that does work properly is the `with` statement, which was designed specifically for cases like these:
```python
import progressbar
import time

with progressbar.ProgressBar(redirect_stdout=True) as bar:
    for i in bar(range(100)):
        print('Some text', i)
        time.sleep(0.02)
        if i == 50:
            raise KeyboardInterrupt
```
I ran into the same problem with stderr and using `progressbar.streams.wrap_stderr`. I did not know that using the progress bar as a context manager would prevent this problem (so I created a context manager wrapper myself). I could not find this feature in the docs; I think it would be nice to mention it somewhere, don't you think?
@WoLpH Thanks for pointing out the context manager solution. In my case, I'm passing around a lot of iterators and I'd like to selectively wrap some in progress bars, so the context manager would be inconvenient to use.

I think that the iterator version could also clean up after itself, by ensuring that `finish()` is called when a general exception stops the iterator.
```python
import progressbar
import time
import types


def __next__(self):
    try:
        value = next(self._iterable)
        if self.start_time is None:
            self.start()
        else:
            self.update(self.value + 1)
        return value
    except StopIteration:
        self.finish()
        raise
    except Exception:
        self.finish(dirty=True)
        raise


def __exit__(self, exc_type, exc_value, traceback):
    # dirty only when the block was left via an exception
    self.finish(dirty=(exc_type is not None))


bar = progressbar.bar.ProgressBar(redirect_stdout=True)
bar.__next__ = types.MethodType(__next__, bar)
bar.__exit__ = types.MethodType(__exit__, bar)

for i in bar(range(100)):
    print('Some text', i)
    time.sleep(0.1)
    if i == 50:
        raise KeyboardInterrupt
```
Is there a disadvantage to this kind of solution I'm not seeing?
I also overrode the `__exit__` method here, which means that when the context manager is being used and the iteration is interrupted, the progress bar doesn't incorrectly show a completed state.
> I ran into the same problem with stderr and using `progressbar.streams.wrap_stderr`. I did not know that using the progress bar as a context manager would prevent this problem (so I created a context manager wrapper myself). I could not find this feature in the docs; I think it would be nice to mention it somewhere, don't you think?
It's mentioned in the readme part of the docs: https://progressbar-2.readthedocs.io/en/latest/#context-wrapper And there are a load of examples that use it: https://progressbar-2.readthedocs.io/en/latest/examples.html
I agree that the docs can be improved a lot though. The big problem is that my time is quite limited and this project is not the only project that I maintain: https://pypi.org/user/WoLpH/ If you have suggestions where the docs can be improved, all help is appreciated :)
> @WoLpH Thanks for pointing out the context manager solution. In my case, I'm passing around a lot of iterators and I'd like to selectively wrap some in progress bars, so the context manager would be inconvenient to use.
I agree, context managers are often inconvenient to work with, and your `bar.finish(dirty=True)` is an easier solution in most cases.
> I think that the iterator version could also clean up after itself, by ensuring that `finish()` is called when a general exception stops the iterator.
I fail to see how, but I might be missing something here :)
> Is there a disadvantage to this kind of solution I'm not seeing?
Your code still uses the `bar.finish(...)` in the `except:` part. Without that it still won't work.
> I also overrode the `__exit__` method here, which means that when the context manager is being used and the iteration is interrupted, the progress bar doesn't incorrectly show a completed state.
That's indeed a bug that needs to be fixed. Good catch!
Nevermind all that... thanks to your suggestion I found the answer. I can catch `GeneratorExit`. Since it doesn't inherit from `Exception`, I never knew I could catch that from the code.

Thanks for the suggestion, I'll fix it soon :)
> I fail to see how, but I might be missing something here :)
For clarity, all I meant by the iterator cleaning up after itself was exactly that: it could check for exceptions and ensure `finish()` is called before quitting. Calling `finish()` seems to be the best way to ensure that stdout is unwrapped.
> Your code still uses the `bar.finish(...)` in the `except:` part. Without that it still won't work.
This is true. I think that the right place to put the except block ensuring `finish()` is always called is inside the `__next__` method, rather than around the whole iteration. It means that transparently using the progress bar as an iterator, like in your examples, now works perfectly, even when a general exception stops the iteration.

I'm happy to knock together a PR for these changes, or perhaps do them another way if you have a different preference?
> I can catch `GeneratorExit`

Ah, that's perfect. Again, happy to make the PR; otherwise, thanks very much for your help!
I've added the two changes to the development branch, can you test if that does everything you were looking for? :)
I'm always happy with pull requests if you have more suggestions of course :)
@mmcewen-g not sure if you noticed the update so pinging you again :)
If it works ok I'll create a new release
This doesn't work, but it took me a while to work out why. Apologies for going into this pedagogically; I wrote most of this while I was working out for myself what was going on, and hopefully this will be a good explanation for people who stumble across this in the future.
We're dealing with Generators, Iterators and Iterables here.
- An Iterable is an object with an `__iter__()` function, which returns an Iterator.
- An Iterator is an object with a `__next__()` function, which returns values and eventually raises `StopIteration`.
A Generator is weird:

- In code, they're any function written with `yield` expressions in them. Calling such a function actually returns a 'generator object' rather than whatever the function looks like it might return.
- A 'generator object' is a special kind of Iterator; when `__next__()` is called, the generator runs the function from where it last left off until it hits the next `yield` and returns the yielded value, or until it returns, at which point it raises `StopIteration`.
- They also have additional `send()`, `throw()` and `close()` functions.
Unlike Iterators, Generators can clean up after themselves using the aforementioned `close()`. This works by raising a `GeneratorExit` at the last `yield` that fired in the generator, which can be caught to do cleanup. `close()` is called when a generator object is garbage collected, meaning cleanup is guaranteed.
Iterators in general should not 'hold things' (like locks, open files or IO wrappers), because Python gives them no way to clean up. If an iterator does need to hold something, then it should really be a generator, which has the additional 'infrastructure' of `close()` to guarantee things get cleaned up.
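The distinction can be seen in a few lines of plain Python. Nothing below is progressbar code; `CountingIterator` and `counting_generator` are made-up illustrations of an Iterator and a Generator doing the same job:

```python
class CountingIterator:
    """A plain Iterator: __next__() plus __iter__() returning self."""

    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        self.i += 1
        return self.i
        # Python has no way to tell this object the loop was abandoned.


def counting_generator(n, log):
    """The same iteration as a generator, which does get a cleanup hook."""
    try:
        for i in range(1, n + 1):
            yield i
    except GeneratorExit:  # raised by close(), e.g. on garbage collection
        log.append('cleaned up')
        raise


assert list(CountingIterator(3)) == [1, 2, 3]

log = []
gen = counting_generator(100, log)
next(gen)    # start the generator, pausing it at the first yield
gen.close()  # raises GeneratorExit at that yield; the except block runs
print(log)   # -> ['cleaned up']
```

The iterator version simply has no equivalent of the `close()` call: abandoning it mid-loop leaves whatever it holds in place.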
Currently, `ProgressBar` is iterable, meaning it can be used in `for...in` loops. It is also an Iterator; it has a `__next__()` method and `__iter__()` just returns `self`. It is not a Generator; it has no `close()`, which is what Python needs to raise a `GeneratorExit` inside it. Our attempt to catch `GeneratorExit` was doomed from the start.
If we want the progress bar to be able to catch `GeneratorExit` and clean up, then instead of `__iter__` returning `self` it could return a generator. This generator can catch `GeneratorExit` and ensure that `finish()` is called. For a toy example:
```python
import progressbar
import time


class ProgressBarNew(progressbar.bar.ProgressBar):
    def __next__(self):
        # remove this from ProgressBar entirely
        raise Exception

    def __iter__(self):
        try:
            for value in self._iterable:
                if self.start_time is None:
                    self.start()
                else:
                    self.update(self.value + 1)
                yield value
        except GeneratorExit:
            print("Caught GeneratorExit: Cleaning up")
            self.finish(dirty=True)
            raise


bar = ProgressBarNew(redirect_stdout=True)
try:
    for i in bar(range(100)):
        print('Received', i)
        time.sleep(0.1)
        if i == 10:
            raise KeyboardInterrupt
except KeyboardInterrupt:
    print("Caught KeyboardInterrupt")
```
This does work as expected!
```
Received 0
...
Received 10
Caught GeneratorExit: Cleaning up
10% (10 of 100) |## | Elapsed Time: 0:00:01 ETA: 0:00:09
Caught KeyboardInterrupt
```
`bar(range(100))` returns the ProgressBar with a defined `self._iterable`. The `for...in` naturally calls `iter()` on it, and gets back a generator object, which it iterates over. When the iteration is interrupted, this generator object is garbage collected, `GeneratorExit` is caught inside it, and `finish()` is called. Meanwhile, the ProgressBar object is left alone, and may be inspected afterwards by user code that catches the exception.
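That garbage-collection path can be demonstrated without progressbar at all; the `wrapped` function below is a made-up stand-in for the generator that `__iter__` returns:

```python
import gc


def wrapped(iterable, log):
    # Stand-in for the generator returned by __iter__ in the toy example.
    try:
        for value in iterable:
            yield value
    except GeneratorExit:
        log.append('cleanup ran')
        raise


log = []
try:
    for i in wrapped(range(100), log):
        if i == 10:
            raise KeyboardInterrupt  # abandon the loop mid-iteration
except KeyboardInterrupt:
    pass

# The for-loop's reference to the generator object is gone, so it is
# collected; collection calls close(), raising GeneratorExit inside it.
gc.collect()  # make collection deterministic across interpreters
print(log)    # -> ['cleanup ran']
```

No one ever calls `close()` explicitly here; abandoning the loop is enough.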
There is one downside to this, which is that the iteration in the `ProgressBar` is now split over two objects; the `ProgressBar` holds the state of the current iteration, and the generator references the `ProgressBar` to update that state during iteration (including cleaning up). This has the advantage that the state of the progress bar is easy to reference inside the iteration loop, but I expect that's not the typical use case when the progress bar is simply used to wrap an iterable in a `for...in` statement.
It might be worth discussing moving the state for the progress bar into the generator object, and leaving `ProgressBar` as a traditional iterable that returns logically independent generators. This kind of refactoring would also have the advantage that the `ProgressBar` could be made reusable without needing any calls to `init()`, as it would simply return a new generator when it was reused.
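As a rough, hypothetical sketch of that refactoring (none of these names exist in progressbar; `ReusableBar` only mimics the shape of the idea, with the real rendering and `finish()` calls elided):

```python
class ReusableBar:
    """Illustrative only: an iterable that hands out independent generators."""

    def __init__(self, iterable):
        self._iterable = iterable

    def __iter__(self):
        # All per-run state (here just a counter) lives in the generator
        # frame, so the bar object can be iterated repeatedly without init().
        count = 0
        try:
            for value in self._iterable:
                count += 1
                yield value
        except GeneratorExit:
            # Interrupted mid-run: this is where finish(dirty=True)
            # would unwrap stdout.
            raise
        # Exhausted normally: a clean finish() would go here.


bar = ReusableBar(range(3))
print(list(bar))  # -> [0, 1, 2]
print(list(bar))  # -> [0, 1, 2]  (reusable: each iter() is independent)
```

Each `iter()` call produces a fresh generator, so two loops over the same bar never share or corrupt each other's state.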
Thank you for writing this pedagogically!
I realized that we could do a less invasive job by promoting the ProgressBar itself to being a Generator. This is a much smaller change, as a Generator is just a special type of Iterator, which the ProgressBar is already. It also means the current examples and code usage can stay identical, and that we aren't introducing any new objects.
It took me a little while to figure out how to arrange for Python to run `close` on the ProgressBar. Python uses `__del__` when the object is garbage collected, and generator objects have a call to `close` in their `__del__`. The abstract base class for Generators doesn't include a `__del__`, so it has to be added manually.
With that, whenever the progress bar is garbage collected, it makes a direct call to `close`, which runs `finish`, guaranteeing that stdout is unwrapped no matter what stops the ProgressBar. Now the only way stdout can remain wrapped is if the user is holding a reference to the ProgressBar that owns the wrapper, in which case they're able to call `finish` themselves if they want to.
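A minimal sketch of that arrangement, assuming nothing about the actual PR code (`GeneratorLike` and `_finish` are stand-ins, with `_finish` playing the role of `ProgressBar.finish()`):

```python
import collections.abc


class GeneratorLike(collections.abc.Generator):
    """Sketch: an Iterator promoted to a Generator, with a manual __del__."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.finished = False

    def send(self, value):
        # The Generator ABC derives __next__() from send().
        try:
            return next(self._it)
        except StopIteration:
            self._finish(dirty=False)
            raise

    def throw(self, typ, val=None, tb=None):
        # close() calls throw(GeneratorExit): clean up, then re-raise.
        self._finish(dirty=True)
        raise typ if val is None else val

    def __del__(self):
        # The ABC provides close() but no __del__, so add it manually:
        # garbage collection now guarantees close() and therefore cleanup.
        self.close()

    def _finish(self, dirty):
        # Stand-in for ProgressBar.finish(dirty=...), which unwraps stdout.
        self.finished = True


g = GeneratorLike(range(5))
print(next(g))     # -> 0
g.close()          # or just drop the last reference; __del__ calls close()
print(g.finished)  # -> True
```

The key detail matches the description above: `collections.abc.Generator` supplies `__next__()` and `close()` once `send()` and `throw()` exist, but the `__del__` that routes garbage collection into `close()` has to be written by hand.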
> I realized that we could do a less invasive job by promoting the ProgressBar itself to being a Generator. This is a much smaller change, as a Generator is just a special type of Iterator, which the ProgressBar is already. It also means the current examples and code usage can stay identical, and that we aren't introducing any new objects.
This is what I eventually did in my project using ProgressBar:
```python
def progress_bar(iterator, *args, **kwargs):
    with progressbar.ProgressBar(*args, **kwargs) as progress:
        for item in progress(iterator):
            yield item
```
Which is very close to what `progressbar.shortcuts.progressbar` does.
The above (and the shortcut) both do something similar but not identical to the solution in the PR.
When Python runs the generator function, it returns a new generator object (which is a Generator with a capital G), meaning that there are now two objects in memory: the `ProgressBar` and the generator object Python made when the generator function was called.
```python
def progress_bar(iterator, *args, **kwargs):
    with progressbar.ProgressBar(*args, **kwargs) as progress:
        for item in progress(iterator):
            yield item


bar = progress_bar(range(100))
type(bar)  # generator object, not ProgressBar
```
This is different to the solution in the linked PR, which makes the `ProgressBar` itself into a generator object, so there is only one object in memory. This doesn't make a huge difference, but it is a little more elegant and makes the stack trace cleaner if the iteration is interrupted.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
When `redirect_stdout` is being used and the iterator is interrupted, for example by a KeyboardInterrupt, stdout remains wrapped.
Example code
You'll end up with the 50% completed progress bar sitting on your terminal and refusing to leave.
When an exception stops the progress of the iterator, the progress bar doesn't take notice and unwrap stdout, as it does when `finish()` is called.

Manually wrapping the iteration in a `try`/`except` works, but obviously isn't very nice.
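For concreteness, the manual workaround looks roughly like this; `StreamWrapper` below is a dependency-free stand-in for progressbar's stdout redirection, not its real API:

```python
class StreamWrapper:
    """Stand-in for the stdout redirection that redirect_stdout installs."""

    def __init__(self):
        self.wrapped = False


wrapper = StreamWrapper()
wrapper.wrapped = True  # the bar wraps stdout when iteration starts
try:
    for i in range(100):
        if i == 50:
            raise KeyboardInterrupt  # simulate Ctrl-C mid-iteration
except KeyboardInterrupt:
    # The manual cleanup, playing the role of bar.finish(dirty=True);
    # without this line, stdout would stay wrapped.
    wrapper.wrapped = False

print(wrapper.wrapped)  # -> False, only because we cleaned up by hand
```

Every caller has to remember this boilerplate, which is exactly what the bar could do for itself.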
Versions
Python 3.7.3, IPython 7.8.0, Linux, progressbar version 3.47.0