Closed infraweavers closed 1 year ago
Yep that works fine with the shallow copy; just given it a good 25 mins and no problem so far
Happy to squash the history or whatever if you want but this seems to have solved the problem for us
great, thanks for analysing and fixing the problem :-)
just a small enhancement: 65c64c85ad5431528d19a11e8fe016e4b10e47f8 but since the enqueueDupServerResult is called every time, even without dupserver configured, it should return early and not create a useless copy every time.
Cool, I assume this will make its way into OMD nightly at somepoint(presumably after the next mod-gearman-worker-go release)? Is there a way for us to know when it is?
the go worker will always be updated in the nightly omd. So it should be part of the rpm/deb tomorrow.
Hmm! Just checked this morning and it's still broken; mod_gearman_worker-go --version
shows:
mod_gearman_worker - version 1.3.0 (Build: 5.01.20230206-labs-edition-v1.3.0)
We'll check again tomorrow, as I'm guessing it may not have pulled in the change due to when it was committed or whatever, not sure if that fits with your expectations etc
yeah, build this morning is now showing as:
mod_gearman_worker - version 1.3.0 (Build: 5.01.20230207-labs-edition-539bd34)
And seems to be working fine (in this regard anyway), looks like mod_gearman_worker-go has been renamed and stuff
On 08.02.23 11:16, Codeweavers Infrastructure wrote:
yeah, build this morning is now showing as:
|mod_gearman_worker - version 1.3.0 (Build: 5.01.20230207-labs-edition-539bd34) |
And seems to be working fine (in this regard anyway), looks like mod_gearman_worker-go has been renamed and stuff
That's correct, i finally remove the c worker.
Duplicating the Result object means that we have one for the normal result path and another copy for the dupserver path.
Confirmed by building this our development environment and running it for 30 minutes without any Orphaned host checks or similiar (normally shows within 10 mins or so) and comparing against master@e908b77 (which does show the problem).
Previously there was a race condition between these two so that if
dupserver
is configured there was a chance that the passive "normal" results would be sent down the normal path. This results in mod_gearman not seeing the response as the right response for the active check that it submitted to the queue (presumably because it's now marked as passive).