rsalmei / alive-progress

A new kind of Progress Bar, with real-time throughput, ETA, and very cool animations!
MIT License

Progress bar for a multi-threaded function #75

Closed miladsade96 closed 3 years ago

miladsade96 commented 3 years ago

Hi there, I have an encoder function that takes an image as a parameter and returns its encoding. I have a lot of images to encode, so I decided to use ThreadPoolExecutor and its map function to speed up the encoding process. My question is: how can I use these alive progress bars to track my function's progress?

rsalmei commented 3 years ago

Hello!

Nice, that is a cool use case! Maybe I should even include this example in the README!

It's actually pretty easy! Just create the alive_bar handle and the executor in the same with block, then you can iterate on the results of the map! You could even zip the original input with the map, and use them together like this:

from alive_progress import alive_bar
from concurrent.futures import ThreadPoolExecutor

images = []  # detect and retrieve the work to be done.
with alive_bar(len(images)) as bar, ThreadPoolExecutor() as executor:
    for image, result in zip(images, executor.map(encode_func, images)):
        bar()
        print(f'image done: {image} -> result: {result}')

The bar will pop up and nicely track your parallel progress! Tip: if this encoding is CPU intensive, you could try the same with ProcessPoolExecutor! It's not exactly the same, as both the inputs and outputs have to be pickle serializable, but you would likely see a serious performance improvement, since CPU-bound Python code in standard CPython can generally only use more than one core when it runs in separate processes. 👍

miladsade96 commented 3 years ago

Hi @rsalmei, I have tested the snippet you posted above, but the progress bar only shows up after the encoding process has finished.

rsalmei commented 3 years ago

Hey, maybe you missed some detail, because it does work. I've completed the example with some test code, and here is a full working example:

https://user-images.githubusercontent.com/6652853/104954714-8888a880-59a7-11eb-85be-5e0538217c60.mov

miladsade96 commented 3 years ago

This is my encoder function:

def encoder(image):
    """
    Find encodings for each image located in the given list.
    :param image: loaded image
    :return: encode value
    """
    # convert BGR to RGB
    img = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    encode = fr.face_encodings(img)[0]
    return encode

and this is the with statement:

with ProcessPoolExecutor() as executor:
    known_faces_encodes = executor.map(encoder, images_list)
known_faces_encodes = list(known_faces_encodes)
print("Encoding process completed!")

rsalmei commented 3 years ago

Ok, like I said, enter both with contexts at the same time. Then you need to iterate over this known_faces_encodes of yours, calling bar() inside the loop.

Something like this:

with alive_bar(len(images_list)) as bar, ProcessPoolExecutor() as executor:
    known_faces_encodes = []
    for enc in executor.map(encoder, images_list):
        bar()
        known_faces_encodes.append(enc)
print("Encoding process completed!")

rsalmei commented 3 years ago

Just another cool tip!! The map method has to return results in order, so when the next one is taking longer, it will hold all the others in the queue, even if they are already done. That way you won't be able to see the actual processing speed!
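
To make that ordering effect concrete, here is a small sketch that uses only the standard library. The names slow_task and arrival_times are hypothetical helpers made up for this illustration, not part of alive-progress. It records when each result comes out of map, and the quick tasks only show up once the slow first task has finished:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_task(delay):
    """Hypothetical task: sleep for `delay` seconds, then return it."""
    time.sleep(delay)
    return delay


def arrival_times(delays):
    """Return (result, seconds since start) for each result as map() yields it."""
    start = time.perf_counter()
    arrivals = []
    # One worker per task, so every task starts right away.
    with ThreadPoolExecutor(max_workers=len(delays)) as executor:
        # map() yields results in input order, not in completion order.
        for result in executor.map(slow_task, delays):
            arrivals.append((result, time.perf_counter() - start))
    return arrivals


if __name__ == "__main__":
    # The first task is the slowest, so the two quick results wait behind it.
    for result, elapsed in arrival_times([0.3, 0.05, 0.05]):
        print(f"result {result} yielded after {elapsed:.2f}s")
```

With a progress bar ticking inside that loop, the bar would sit still during the slow task and then jump, which is the bumpiness described here.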

You can make a simple change to get results in any order! Just use submit instead of map, and iterate results with as_completed!
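
A minimal sketch of that change could look like the following. It is an illustration under assumptions, not the exact code from the video: fake_encode and encode_all are hypothetical names, the sleep stands in for real encoding work, and the no-op fallback for alive_bar is only there so the sketch still runs when alive-progress is not installed.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from contextlib import contextmanager

try:
    from alive_progress import alive_bar
except ImportError:
    # Fallback (assumption for this sketch only): a stand-in context manager
    # whose bar() call does nothing, so the code runs without alive-progress.
    @contextmanager
    def alive_bar(total):
        yield lambda: None


def fake_encode(item):
    """Hypothetical stand-in for a real encoder: sleep briefly, then return a value."""
    time.sleep(random.uniform(0.01, 0.05))
    return item * 2


def encode_all(items, max_workers=4):
    """Encode every item, ticking the bar as soon as each task finishes."""
    results = [None] * len(items)
    with alive_bar(len(items)) as bar, ThreadPoolExecutor(max_workers=max_workers) as executor:
        # submit() returns one Future per item; remember each input's position.
        futures = {executor.submit(fake_encode, item): index
                   for index, item in enumerate(items)}
        # as_completed() yields futures as they finish, in any order.
        for future in as_completed(futures):
            results[futures[future]] = future.result()
            bar()
    return results


if __name__ == "__main__":
    encoded = encode_all(list(range(20)))
    print(f"encoded {len(encoded)} items")
```

Keeping the index alongside each future means the final list still matches the input order, even though the bar advances in completion order.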

Look at the difference: you can see that my first example was bumpy because of this holding. This one is much more fluid:

https://user-images.githubusercontent.com/6652853/104964547-5da84f80-59bb-11eb-936e-97291f797774.mov

miladsade96 commented 3 years ago

Thank you @rsalmei