alitto / pond

🔘 Minimalistic and High-performance goroutine worker pool written in Go
MIT License
1.5k stars 65 forks source link

Why do we need to verify runningWorkerCount when submitting a task? #6

Closed chengyayu closed 3 years ago

chengyayu commented 3 years ago

hi @alitto pond is great. Thank you for your effort. I am studying the code of pond trying to figure out its design structure, the verify of runningWorkerCount > 0 && p.Idle() > 0 in the following piece of code causes my confusion.

func (p *WorkerPool) submit(task func(), waitForIdle bool) bool {
    if task == nil {
        return false
    }

    runningWorkerCount := p.Running()

    // Attempt to dispatch to an idle worker without blocking
    if runningWorkerCount > 0 && p.Idle() > 0 {
        select {
        case p.tasks <- task:
            return true
        default:
            // No idle worker available, continue
        }
    }

    maxWorkersReached := runningWorkerCount >= p.maxWorkers

    // Exit if we have reached the max. number of workers and can't wait for an idle worker
    if maxWorkersReached && !waitForIdle {
        return false
    }

    // Start a worker as long as we haven't reached the limit
    if ok := p.maybeStartWorker(task); ok {
        return true
    }

    // Submit the task to the tasks channel and wait for it to be picked up by a worker
    p.tasks <- task
    return true
}

What's the problem if we replace runningWorkerCount > 0 && p.Idle() > 0 with p.Idle() > 0?

alitto commented 3 years ago

Hey @chengyayu! thank you 🙂 We need to check the worker count using the workerCount counter because idleWorkerCount is not as reliable. If we only check whether p.Idle() > 0, then it could happen that we submit a task to the tasks channel right before this line https://github.com/alitto/pond/blob/master/pond.go#L331, where idleWorkerCount is decremented. This can cause the task to wait in the tasks channel forever (if no other tasks are submitted afterwards). workerCount, on the other hand, is updated in a synchronized fashion using a mutex, and that, along with checking the idle counter > 0, guarantees there is at least 1 running goroutine. This goroutine may or may not be really idle, but that's fine because we just need to make sure the task will eventually be picked up.

chengyayu commented 3 years ago

Hey @alitto thank you for your reply

We need to check the worker count using the workerCount counter because idleWorkerCount is not as reliable. If we only check whether p.Idle() > 0, then it could happen that we submit a task to the tasks channel right before this line https://github.com/alitto/pond/blob/master/pond.go#L331, where idleWorkerCount is decremented.

If we check runningWorkerCount>0 && p.Idle()>0, it could also happen that we submit a task to the tasks channel right before this line https://github.com/alitto/pond/blob/master/pond.go#L331, where idleWorkerCount is decremented. and at that moment, the runningWorkerCount may be 1, am i right ?

This can cause the task to wait in the tasks channel forever (if no other tasks are submitted afterwards). But i do not know how to prove it.