vbauerster / mpb

multi progress bar for Go cli applications
The Unlicense
2.29k stars, 123 forks

WithWaitGroup hangs on error. Help explain how to use WithWaitGroup. #130

Open kalensk opened 1 year ago

kalensk commented 1 year ago

Can you help explain how to properly use WithWaitGroup() especially when an error occurs? I seem to be having some misunderstanding.

The general pattern without using a progress bar which works is:

To use the mpb library, I used mpb.New(mpb.WithWaitGroup(&wg)) and called progressBar.Wait() instead of wg.Wait(); see the code below. However, if DoWork() returns an error, for example before bar.Increment() is ever called, progressBar.Wait() never returns. Adding a bar.Abort() when an error occurs seems to work, but is a pattern that seems incorrect and one I'd like to avoid.

Note: I have also tried the below code without a WaitGroup and just progressBar := mpb.New() and it still hangs on progressBar.Wait() unless I add a bar.Abort() if an error is returned from DoWork, which does not seem correct. I am not sure I understand the point of WithWaitGroup if the below code works the same without needing it. Can you explain?

Thank you for any clarification and help in understanding!


var wg sync.WaitGroup
progressBar := mpb.New(mpb.WithWaitGroup(&wg))
errChan := make(chan error, len(databases))

for _, database := range databases {
    wg.Add(1)

    bar := progressBar.New(int64(database.NumScriptsToDoWork),
        mpb.NopStyle(),
        mpb.PrependDecorators(decor.Name(database.Datname, decor.WCSyncSpaceR)),
        mpb.AppendDecorators(decor.NewPercentage()),
    )

    go func(database DatabaseInfo, bar *mpb.Bar) {
        defer wg.Done()

        err := DoWork(database, bar)  // calls bar.Increment(); err is local to this goroutine
        if err != nil {
            errChan <- err
            return
        }

    }(database, bar)
}

progressBar.Wait()
close(errChan)

// ...

vbauerster commented 1 year ago

I am not sure I understand the point of WithWaitGroup if the below code works the same without needing it. Can you explain?

When you apply mpb.WithWaitGroup(&wg), it means: wait for the supplied wait group first, and then wait for all bars to complete or abort. If you don't use this option, which is totally OK, then you'll end up with:

wg.Wait() // wait for range databases loop
progress.Wait() // wait for bars to complete or abort

the point of WithWaitGroup is to have single point of Wait call and wait/sync between different goroutines (range databases loop and bars rendering loop run in different goroutines). In other words, if the range databases loop has completed, it doesn't mean that the bars rendering loop has completed as well, and vice versa. There may be surprising side effects if you forget to wait for either one. For example, if you forget to wait for progress and your program ends right after wg.Wait(), then bars may end up in an incomplete state, such as showing 98%.

Adding a bar.Abort() when an error occurs seems to work, but is a pattern that seems incorrect and one I'd like to avoid. I have also tried the below code without a WaitGroup and just progressBar := mpb.New() and it still hangs on progressBar.Wait() unless I add a bar.Abort() if an error is returned from DoWork, which does not seem correct.

If a bar doesn't complete or abort, then progress.Wait() will never return. You have already found the fix: just call bar.Abort() in the error case. Why it seems incorrect to you?
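One way to keep the abort call out of the work function itself is a small wrapper that aborts the bar whenever the work fails. The following is only a sketch: `aborter`, `runWithBar`, `fakeBar` and `errScript` are hypothetical names, and the only assumption about mpb is that `*mpb.Bar` has an `Abort(drop bool)` method and would therefore satisfy the interface. A stand-in bar is used so the example runs without the library.

```go
package main

import (
	"errors"
	"fmt"
)

// errScript is a hypothetical error used only for this illustration.
var errScript = errors.New("script failed")

// aborter is a hypothetical interface covering the one method this sketch needs.
// *mpb.Bar has an Abort(drop bool) method, so it would satisfy it.
type aborter interface {
	Abort(drop bool)
}

// runWithBar runs work and aborts the bar if work returns an error,
// so that a waiting progress container is not left with an unfinished bar.
func runWithBar(bar aborter, work func() error) error {
	err := work()
	if err != nil {
		bar.Abort(false) // false: abort without asking for the bar to be removed
	}
	return err
}

// fakeBar stands in for a real bar so the sketch runs without mpb.
type fakeBar struct{ aborted bool }

// Abort records that the bar was aborted; the drop flag is ignored here.
func (b *fakeBar) Abort(drop bool) { b.aborted = true }

func main() {
	okBar, failedBar := &fakeBar{}, &fakeBar{}

	_ = runWithBar(okBar, func() error { return nil })
	err := runWithBar(failedBar, func() error { return errScript })

	fmt.Println("ok bar aborted:", okBar.aborted)
	fmt.Println("failed bar aborted:", failedBar.aborted, "error:", err)
}
```

With a wrapper like this, the goroutine body would call `runWithBar(bar, func() error { return DoWork(database, bar) })` and send any returned error on the channel as before, so the abort happens in one place.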

vbauerster commented 1 year ago

Following is not related to your question just some little error handling review. Do you really need to handle all possible errors? Usually it's enough to handle first error and ignore the rest:

var wg sync.WaitGroup
progressBar := mpb.New(mpb.WithWaitGroup(&wg))
errChan := make(chan error, 1) // we will handle first error only

for _, database := range databases {
    wg.Add(1)

    bar := progressBar.New(int64(database.NumScriptsToDoWork),
        mpb.NopStyle(),
        mpb.PrependDecorators(decor.Name(database.Datname, decor.WCSyncSpaceR)),
        mpb.AppendDecorators(decor.NewPercentage()),
    )

    go func(database DatabaseInfo, bar *mpb.Bar) {
        defer wg.Done()

        err := DoWork(database, bar)  // calls bar.Increment()
        if err != nil {
            select {
            case errChan <- err:
                // we're the first goroutine to fail here
            default:
                // don't care as error already happened/sent
            }
            bar.Abort(...)
        }

    }(database, bar)
}

progressBar.Wait()
close(errChan)
// do something with error if any
if err := <-errChan; err != nil {
    ...
}

kalensk commented 1 year ago

Thank you so much for the response! I really appreciate it as it helps clarify some concepts for me.

the point of WithWaitGroup is to have single point of Wait call and wait/sync between different goroutines (range databases loop and bars rendering loop run in different goroutines).

Interesting. So, WithWaitGroup basically avoids the following:

wg.Wait() // wait for range databases loop
progress.Wait() // wait for bars to complete or abort

I tend to prefer code that is very explicit and clear. Thus, I would rather have a wait group (or whatever is needed) for both the database work and the progress bars than potentially hide logic behind an option. To me, waiting on both explicitly helps make things clear and mitigates bugs and logic errors.

Why it seems incorrect to you?

Because in my "normal" workflow, as shown in my original example, there is no need for "aborting". That is, it uses the general concept of returning errors from DoWork() via channels, since we are using goroutines. There is no concept of "aborting" or of needing to call an "abort" function. That is what I mean when I say it seems incorrect or might indicate a "code smell".

Following is not related to your question just some little error handling review. Do you really need to handle all possible errors? Usually it's enough to handle first error and ignore the rest:

If I am following correctly, I think it mainly comes down to whether one wants to return early at the first error or collect and return all errors. That is, by returning on the first error, one ignores any additional errors.
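The two choices can be put side by side in a short sketch that uses only the standard library. The names `runFirst`, `runAll` and `errBoom` are hypothetical, the workers are plain functions rather than the thread's DoWork, and `errors.Join` assumes Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errBoom is a hypothetical error used only for this illustration.
var errBoom = errors.New("boom")

// runFirst keeps only the first error reported by any worker.
func runFirst(workers []func() error) error {
	var wg sync.WaitGroup
	errCh := make(chan error, 1) // room for exactly one error
	for _, w := range workers {
		wg.Add(1)
		go func(w func() error) {
			defer wg.Done()
			if err := w(); err != nil {
				select {
				case errCh <- err: // first failure wins
				default: // an error was already recorded; drop this one
				}
			}
		}(w)
	}
	wg.Wait()
	close(errCh)
	return <-errCh // nil if no worker failed
}

// runAll collects every error and combines them with errors.Join.
func runAll(workers []func() error) error {
	var wg sync.WaitGroup
	errCh := make(chan error, len(workers)) // room for one error per worker
	for _, w := range workers {
		wg.Add(1)
		go func(w func() error) {
			defer wg.Done()
			if err := w(); err != nil {
				errCh <- err
			}
		}(w)
	}
	wg.Wait()
	close(errCh)
	var errs []error
	for err := range errCh {
		errs = append(errs, err)
	}
	return errors.Join(errs...) // nil when errs is empty
}

func main() {
	ok := func() error { return nil }
	boom := func() error { return errBoom }
	workers := []func() error{ok, boom, boom}

	fmt.Println("first only:", runFirst(workers))
	fmt.Println("all joined:", runAll(workers))
}
```

In both variants the channel buffer is what keeps a failing goroutine from blocking on the send: one slot plus a non-blocking send for the first-error case, one slot per worker for the collect-all case.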