Closed style95 closed 1 year ago
Merging #5391 (c1c8ca1) into master (60ca660) will decrease coverage by
0.27%
. The diff coverage is100.00%
.:exclamation: Current head c1c8ca1 differs from pull request most recent head 3e99442. Consider uploading reports for the commit 3e99442 to get more accurate results
@@ Coverage Diff @@
## master #5391 +/- ##
==========================================
- Coverage 76.91% 76.64% -0.27%
==========================================
Files 240 240
Lines 14588 14595 +7
Branches 629 636 +7
==========================================
- Hits 11221 11187 -34
- Misses 3367 3408 +41
Impacted Files | Coverage Δ | |
---|---|---|
...ntainerpool/v2/FunctionPullingContainerProxy.scala | 78.65% <100.00%> (+0.17%) |
:arrow_up: |
...e/openwhisk/core/scheduler/queue/MemoryQueue.scala | 80.99% <100.00%> (-1.42%) |
:arrow_down: |
... and 11 files with indirect coverage changes
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
This LGTM we need this as well but I think it's important to have the additional safeguard from the state timeout never occurring in the removing state creating an activation cycle such that the queue is never shut down so will still go ahead with merging that as well if you approve it.
The edge case also occurs from this event
case Event(StateTimeout, _: NoActors) =>
logging.info(this, s"[$invocationNamespace:$action:$stateName] The queue is timed out, stop the queue.")
cleanUpDataAndGotoRemoved()
which this change wouldn't cover.
I think unit tests will need to be updated to make sure these changes don't have any side effects where sending queue removed to the parent occurs too early in the removal process
@bdoyle0182 My bad, I added another missing part.
So both cleanUpDataAndGotoRemoved()
and cleanUpActorsAndGotoRemoved()
should send a queue removed message to the queue manager.
Since this will be called from every shutdown case, I just want to make sure that there aren't any new race conditions created. Would it be possible for the queue removed message to be sent twice if it's called from here now and it was sent from another shutdown case such that the queue is removed from the manager map and then a new queue is created and this fsm sends another queue removed message removing it from the parent map after the new queue is created? I'm not sure if that could happen or matters, but want to check
Will approve this so long as we're confident sending queue removed to the parent from these locations cannot create a new race condition.
Once the queue manager receives a queue removed message, it will just remove the corresponding reference from queue pool and send an ack back to the sender. If there is any logic to re-add the reference to the queue pool then there might be a race condition, but it always removes the reference and no race condition happens. https://github.com/apache/openwhisk/blob/master/core/scheduler/src/main/scala/org/apache/openwhisk/core/scheduler/queue/QueueManager.scala#L204
One concern could be multiple acks sent from the queue manager, but it will also be handled by the wildcard case if there is no match for a certain state. https://github.com/apache/openwhisk/blob/master/core/scheduler/src/main/scala/org/apache/openwhisk/core/scheduler/queue/MemoryQueue.scala#L670
LGTM
This is to handle the issue reported in https://github.com/apache/openwhisk/pull/5388 in another way.
Description
Related issue and scope
My changes affect the following components
Types of changes
Checklist: