Closed: 20wildmanj closed this 1 year ago
While this will certainly prevent exceptions, is it actually the right fix? This looks like another variant of #207
Yes, we have looked at #207. Because my exposure to Tango has only been through hotfixes over the past week, I'm not 100% confident in my understanding of Tango's job states. There's definitely a better solution, but implementing it right now may be difficult given the time available.
I'm also concerned because the exception I encountered was in assignJob, which only has one callsite. How was jobQueue.assignJob called on a job that was not in the unassigned queue?
The jobQueue.assignJob call replaced job.makeAssigned() in __manage in #227 to fix a bug where the VM was never being set, which led to issues in our autolab-docker installation.
Upon further review of the code
Thanks for looking into this. I've removed the removal from the unassigned queue in assignJob.
Closing this PR for now, will return to this issue with a better fix later.
Description
Fixes an issue found in Tango deployments that use the default Python thread-safe queue instead of TangoRemoteQueue, where remove could be called on an element that had already been removed from the queue, causing an unhandled exception.
This hotfix checks whether the element exists in the queue before calling the remove method. This is consistent with TangoRemoteQueue, which also does not raise an exception in the same scenario.
Testing
Set USE_REDIS=False in config.py
Run python -m unittest tests/testJobQueue.py and see all tests run successfully
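For reference, the check-before-remove guard from the Description could look roughly like the sketch below. This is a minimal illustration, not the code in this PR: GuardedQueue is a hypothetical stand-in for the native (non-Redis) queue, and it assumes the queue is built on Python's standard queue.Queue, whose items live in a deque protected by a mutex.

```python
import queue


class GuardedQueue(queue.Queue):
    """Hypothetical stand-in for the native queue; not Tango's actual class."""

    def remove(self, item):
        # queue.Queue keeps its items in self.queue (a collections.deque)
        # and protects it with self.mutex. Holding the mutex makes the
        # membership check and the removal one atomic step, so another
        # thread cannot remove the item in between.
        with self.mutex:
            if item in self.queue:
                self.queue.remove(item)
                return True
            # Item already gone: report it instead of raising ValueError.
            return False


q = GuardedQueue()
q.put("job-1")
first = q.remove("job-1")   # item is present, so it is removed
second = q.remove("job-1")  # item is already gone, no exception raised
```

Returning a boolean rather than silently doing nothing is a design choice of this sketch: it lets a caller such as assignJob notice that the job was not where it expected, which is the question raised in the review above.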