Closed by amaltaro 10 years ago
When you change the state of a site, the condor plugin tries to update the list of sites where each job can run. There are two classAds that it checks: ExtDESIRED_Sites (where the job can run) and DESIRED_Sites (where the job will run).

Basically it updates the list of sites in DESIRED_Sites. If the site is moved to Down, Drain or Aborted (exclude=True passed), it removes the site from that list; if it is the only site in the list, it appends the job to a list of jobs to kill.

When the site is moved to Normal (exclude=False), I don't understand why we are doing this: https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/BossAir/Plugins/PyCondorPlugin.py#L742

To me it sounds like it should be: if siteName is not in desiredSites and siteName is in extDesiredSites, then append the site to the DESIRED_Sites list (i.e. put the site name back in the list where the job will run if it was removed before). As it is, the ERROR will basically be logged for every job where the site is not in the desiredSites list.

@tsarangi Does it make sense?
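To make the proposal concrete, here is a minimal sketch of the logic described in this comment. It is not the code in PyCondorPlugin.py: the function name `update_desired_sites`, its return shape and the site names in the usage example are assumptions made for illustration, and it works on plain Python lists rather than on real classAds.

```python
def update_desired_sites(desired_sites, ext_desired_sites, site_name, exclude):
    """Hypothetical helper, not part of WMCore.

    Returns (new_desired_sites, should_kill) for a single job.
    """
    desired = list(desired_sites)  # work on a copy, leave the input untouched

    if exclude:
        # Site moved to Down, Drain or Aborted.
        if site_name in desired:
            if len(desired) == 1:
                # It is the only site the job will run at: flag the job to be killed.
                # Assumption: the list is left unchanged in this case.
                return desired, True
            desired.remove(site_name)
        return desired, False

    # Site moved back to Normal: restore it only if the job was
    # originally allowed to run there and it is currently missing.
    if site_name not in desired and site_name in ext_desired_sites:
        desired.append(site_name)
    return desired, False


if __name__ == "__main__":
    ext = ["T1_IT_CNAF", "T2_CH_CERN"]  # illustrative site names
    print(update_desired_sites(["T2_CH_CERN"], ext, "T1_IT_CNAF", exclude=False))
```

With this shape, a job whose DESIRED_Sites already contains the site is left alone, and no error needs to be logged for jobs that were never allowed to run at that site.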
@lucacopa BUG, BUG, BUG... I can see that... :-)... Go ahead for the fix...
For the record, since I have no time to look at this issue now: we had CNAF in the RC DB as Down (but with normal tasks/thresholds), and when you move it back to Normal, it throws errors.
In the end, the operation is properly performed and the site is in Normal state. Agent version was 0.9.95b + patches.