snivilised / extendio

πŸ‹ extentions to Go standard io library
MIT License

Create a worker pool that supports job streaming #281

Closed plastikfan closed 1 year ago

plastikfan commented 1 year ago

So many of the examples, documentation and other Go packages do not support the model where jobs can be issued to a worker pool in bursts, at any time, up to and including some known end point. A lot of the examples discovered assume the client knows the full job stream up front; i.e. we see examples where the client creates a slice containing all the jobs, dispatches them to the pool and then immediately closes the channel. These are really noddy examples that don't reflect the complexity of the real world, so we need to roll our own.

The worker pool must have the following features/properties:

Channels in play:

πŸ”† jobs (input)
πŸ”† results (output)
πŸ”† errors (output)
πŸ”† cancel (signal)
πŸ”† done (signals no more new work)
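
As a rough sketch, the channel set above could be bundled into a single struct. The `Job`, `JobResult` and `Channels` types here are hypothetical placeholders for illustration only, not part of extendio:

```go
// Hypothetical types for illustration only (not extendio API): a Job and
// JobResult payload pair, plus a struct bundling the five channels above.
package pool

type Job struct {
	ID      string
	Payload any
}

type JobResult struct {
	JobID string
	Data  any
}

type Channels struct {
	Jobs    chan Job       // input: producer writes, workers read
	Results chan JobResult // output: workers write, consumer reads
	Errors  chan error     // output: workers write, consumer reads
	Cancel  chan struct{}  // signal: client requests early termination
	Done    chan struct{}  // signal: no more new work will be produced
}

func NewChannels(capacity int) *Channels {
	return &Channels{
		Jobs:    make(chan Job, capacity),
		Results: make(chan JobResult, capacity),
		Errors:  make(chan error, capacity),
		Cancel:  make(chan struct{}),
		Done:    make(chan struct{}),
	}
}
```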

▢️ ProducerGR(observable):

▢️ PoolGR(workers):

▢️ ConsumerGR(observer):

Both the Producer and the Consumer should be started up immediately as separate GRs, distinct from the main GR.
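
As a rough illustration of that start-up model, the consumer side might be launched like this, reusing the hypothetical `Channels` type from the sketch above (the producer is sketched further below). `StartConsumer` and its callbacks are illustrative names, not an existing API:

```go
// StartConsumer (illustrative name) launches the observer on its own
// goroutine immediately; it drains both output channels until the pool
// closes them.
func StartConsumer(ch *Channels, onResult func(JobResult), onError func(error)) {
	go func() {
		results, errs := ch.Results, ch.Errors

		for results != nil || errs != nil {
			select {
			case r, ok := <-results:
				if !ok {
					results = nil // closed: stop selecting on it
					continue
				}
				onResult(r)

			case e, ok := <-errs:
				if !ok {
					errs = nil // closed: stop selecting on it
					continue
				}
				onError(e)
			}
		}
	}()
}
```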

* ProducerGR(observable) --> owns the jobs channel and should be free to close it when no more work is available (see the producer sketch after this list).

* PoolGR(workers) --> the pool owns the output channels
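
Following those ownership rules, the producer might look something like this: it is the sole writer to Jobs, so it alone closes that channel once the job stream dries up. `StartProducer` and the `next()` callback are illustrative names only:

```go
// StartProducer (illustrative name) runs the producer on its own goroutine.
// It owns the Jobs channel, so it closes it when next() reports that the
// job stream is exhausted, or when cancellation is signalled.
func StartProducer(ch *Channels, next func() (Job, bool)) {
	go func() {
		defer close(ch.Jobs) // closing Jobs tells the pool: no more work

		for {
			job, ok := next()
			if !ok {
				return // job stream exhausted
			}
			select {
			case ch.Jobs <- job:
			case <-ch.Cancel:
				return // abandon remaining work on cancellation
			}
		}
	}()
}
```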

So, the next question is: how does the pool know when to close the output channels? In theory, this should be when the jobs channel is exhausted and the current pool of workers is empty. This realisation tells us what a worker actually is: effectively a handle to its goroutine, stored in a scoped collection. This collection should probably be a map whose key is a uniquely generated ID (see "github.com/google/uuid"). When the map is empty, we know there are no workers left to send to the outputs, therefore we can close them (see the sketch below).
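
A sketch of that close-detection idea, continuing the hypothetical types above: each worker is registered in a map keyed by a generated uuid, reports its own exit, and the output channels are only closed once the map is empty. (A sync.WaitGroup would achieve the same effect; the map is used here to mirror the description above.)

```go
import "github.com/google/uuid"

// Pool (illustrative) tracks live workers in a map keyed by uuid; when the
// map empties, nothing can write to the outputs any more, so it closes them.
type Pool struct {
	ch      *Channels
	workers map[uuid.UUID]struct{} // handles of live workers
	exits   chan uuid.UUID         // workers report their own completion
}

func NewPool(ch *Channels) *Pool {
	return &Pool{
		ch:      ch,
		workers: make(map[uuid.UUID]struct{}),
		exits:   make(chan uuid.UUID),
	}
}

func (p *Pool) Run(size int, exec func(Job) (JobResult, error)) {
	for i := 0; i < size; i++ {
		p.spawn(exec)
	}

	// reap workers as they finish; the map is only touched on this goroutine
	for len(p.workers) > 0 {
		delete(p.workers, <-p.exits)
	}

	// no workers remain, so the outputs can be closed safely
	close(p.ch.Results)
	close(p.ch.Errors)
}

func (p *Pool) spawn(exec func(Job) (JobResult, error)) {
	id := uuid.New()
	p.workers[id] = struct{}{}

	go func() {
		defer func() { p.exits <- id }()

		// ranging ends when the producer closes Jobs
		for job := range p.ch.Jobs {
			if result, err := exec(job); err != nil {
				p.ch.Errors <- err
			} else {
				p.ch.Results <- result
			}
		}
	}()
}
```

With this shape, the main GR would wire things up roughly as: NewChannels, StartConsumer, StartProducer, then pool.Run, and finally wait for the consumer to drain the closed outputs.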

TODO:

plastikfan commented 1 year ago

This is being spun out into a new package, lorax