Closed DrJosh9000 closed 2 months ago
@DrJosh9000 would it be better to abstract this a little, rather than interacting with the scheduling channel ect directly in this code?
@wolfeidau sounds good. I tweaked the tests a bit and discovered an edge case I had forgotten, that this would have helped with. So we now have tryTakeToken
and tryReturnToken
.
What
Basically rewrite the whole
MaxInFlightLimiter
.Relevant to #302 and might fix it
Why
Despite
Create
taking a context parameter, under the hood,add
could block while waiting forcompletions
(async.Cond
) to wake the goroutine. This is a consequence ofsync.Cond
itself not being context-aware. At minimum, after detecting context cancellation,completions.Broadcast()
would need to be added (somewhere), and a check for context added to thecompletions.Wait()
loop. But...The mutex locking was probably overly coarse-grained. In
add
it somewhat makes sense to guard most of the method, since it needs to wrapcompletions.Wait()
. Everywhere else the mutex was really only guarding the map tracking which jobs were in-flight. Reads on the map happened at the start of methods and writes towards the end of functions. But looking at the purpose of these reads and writes, the informer callbacks (OnAdd, OnDelete
) exist to make "the model" more closely match reality when out of sync with it (i.e. track jobs that were started by a previous incarnation of the controller), and onceadd
has decided that the job hasn't already been started, then it effectively has ownership of starting that job.Finally, despite defining
mu
to be async.RWMutex
, nothing (not even thesync.Cond
or theInFlight
method) usedRLock
/RUnlock
, so it could have been a plainsync.Mutex
.So I decided some rethinking of the limiter mechanisms was in order.
How
sync.Mutex
and guards map accesses in a fine-grained manner, through a compare-and-swap method.completions sync.Cond
is gone, replaced with atokenBucket
buffered channel.Create
now blocks on a channel read until a job completion puts a token back in the bucket.Create
can now select between the channel and the context, thus returning early if the context is cancelled.