Closed: grosser closed this 1 year ago
@agrare I don't know if you have been following the Kubeclient::Informer class on master branch, but perhaps it's interesting for future use in manageiq (?). Either way, you may have useful experience to share here from how manageiq is killing & restarting watches... That's a long way of saying "please review" :wink:
[P.S. entirely out-of-scope here, but there is the unfinished business of #488. The blocker there was we didn't know how to implement `watch.finish` with Faraday :shrug: @agrare I was considering maybe dropping `.finish` in 5.0 and letting you figure out how to bring it back when you want to upgrade ManageIQ :stuck_out_tongue_closed_eyes: But now I realize Informer relies on it deeply, and I suspect so would any advanced usage of watches?]
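To make the role of `.finish` concrete, here is a self-contained toy sketch. The queue-backed `ToyWatchStream` below is a stand-in I made up for a real Kubeclient watch enumerator, which blocks on the HTTP response stream in the same way; without `finish`, the only way to stop the consuming thread is to `kill` it.

```ruby
# Toy stand-in for a Kubeclient watch stream: `each` blocks on a queue
# until `finish` is called, the way a real watch blocks on the connection.
class ToyWatchStream
  def initialize
    @queue = Queue.new
  end

  def push(notice)
    @queue << notice
  end

  # Unblocks `each` so the consuming thread can exit cleanly.
  def finish
    @queue << :finished
  end

  def each
    loop do
      notice = @queue.pop
      break if notice == :finished
      yield notice
    end
  end
end

stream = ToyWatchStream.new
seen = Queue.new
thread = Thread.new { stream.each { |n| seen << n } }

stream.push(type: 'ADDED', name: 'pod-1')
stream.finish # without this, the only clean exit is thread.kill
thread.join(5)
```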
> I don't know if you have been following the `Kubeclient::Informer` class on master branch, but perhaps it's interesting for future use in manageiq (?)
Hey @cben! No, I hadn't seen this. It definitely looks intriguing, though we try not to cache inventory to keep our memory usage down, so I doubt we'd use it directly, but I can definitely see how this would be useful if your code were only interacting with kubeclient.
There definitely could be a more user-friendly interface around watches, since it seems we are solving the same issues.
> Either way, you may have useful experience to share here from how manageiq is killing & restarting watches... That's a long way of saying "please review"
I notice a few minor differences in how we handle this compared to the Informer class:

- We pass `resource_version: '0'` to the initial `get_` call.
- On failure we reset the resource version to `nil` and restart the watch.
- We use a `Concurrent::AtomicBoolean` to indicate to the watch thread if it should break out after the `watch.finish` or retry.

I believe we pulled this logic from the kubevirt/cnv ManageIQ provider, which was contributed by the kubevirt team. Not saying this is better or worse than the Informer implementation, just noting the differences.
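The restart-with-stop-flag pattern described above can be sketched as follows. All names here (`RestartableWatch`, `watch_entities`, the notice shape) are illustrative assumptions, not the actual ManageIQ code; the `AtomicFlag` class is a gem-free stand-in for `Concurrent::AtomicBoolean` from concurrent-ruby.

```ruby
# Stand-in for Concurrent::AtomicBoolean (concurrent-ruby), so this sketch
# has no gem dependency; the real code would use the gem's class.
class AtomicFlag
  def initialize
    @mutex = Mutex.new
    @value = false
  end

  def make_true
    @mutex.synchronize { @value = true }
  end

  def true?
    @mutex.synchronize { @value }
  end
end

# Hypothetical restartable watch: a stop flag decides whether a finished
# or broken watch should retry or let the thread exit.
class RestartableWatch
  def initialize(client, &handler)
    @client  = client
    @handler = handler
    @stop    = AtomicFlag.new
  end

  def start
    @thread = Thread.new do
      resource_version = '0' # '0': server may serve the list from cache
      until @stop.true?
        begin
          @watcher = @client.watch_entities(resource_version: resource_version)
          @watcher.each do |notice|
            resource_version = notice[:resource_version]
            @handler.call(notice)
          end
        rescue StandardError
          resource_version = nil # e.g. on 410 Gone, forget the stale version
        end
        # after watch.finish, `each` returns and @stop.true? ends the loop;
        # otherwise we fall through and restart the watch
      end
    end
    self
  end

  def stop(timeout = 5)
    @stop.make_true
    @watcher&.finish
    @thread.join(timeout) || @thread.kill # kill only if join times out
  end
end
```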
> But now I realize Informer relies on it deeply, and I suspect so would any advanced usage of watches?
Yes, we do use `#finish` to get a watch to break out so that we can join cleanly, but we join with a timeout and `.kill` the thread if it doesn't respond. We also aren't trying to keep a cache consistent from this thread, though.
What is the purpose of the waiter thread? It looks like it'll stop the watch after 15 minutes?
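For readers following along, the pattern being asked about could look like the hypothetical sketch below: a side thread that calls `finish` on the watch after a timeout, so each watch cycle ends and the caller can re-list and reconcile. Names are illustrative, not the Informer's actual internals.

```ruby
# Hypothetical "waiter" thread: end the current watch cycle after a
# timeout by calling finish, which unblocks watcher.each.
def watch_with_deadline(watcher, timeout_seconds)
  waiter = Thread.new do
    sleep timeout_seconds
    watcher.finish # makes watcher.each return, ending this watch cycle
  end
  watcher.each { |notice| yield notice }
ensure
  waiter.kill # if the watch ended early, stop the timer thread too
end
```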
I think this is a good step forward, so I prefer to merge and then iterate further if there are still open issues.
Sorry, missed your reply entirely.
FWIW, `Thread.pass` is advisory, so I'm not sure it can really "fix" things:

> Give the thread scheduler a hint to pass execution to another thread. A running thread may or may not switch, it depends on OS and processor.
I'm still somewhat concerned a call to `stop_worker` might get stuck for a long time.
But yes, previous behavior was chaotic, so I guess it's an advance. :+1:
And I'm certainly happy about CI looking reliably green now :tada: Merging.
making all the threads terminate and no longer killing things ... 3 green ci runs 🤞