Closed oleweidner closed 11 years ago
This is related I think: https://github.com/saga-project/BigJob/issues/121
Hi Ole-
I might need help with this ticket. I looked at the code, and the docstring says that canceling the service also cancels its PilotJobs, which makes me think the PJs should be canceled too:
    def cancel(self):
        """ Cancel the PilotComputeService.

            This also cancels all the PilotJobs that were under control of this PJS.

            Keyword arguments:
            None

            Return value:
            Result of operation
        """
        for i in self.pilot_computes:
            i.cancel()
The i.cancel() is then calling this on each PilotJob:
    def cancel(self):
        """ Terminates the pilot """
        self.__bigjob.cancel()
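To make the intended call chain concrete, here is a minimal sketch of how a service-level cancel() is expected to fan out to each pilot and from there to the underlying bigjob object. All three classes are hypothetical stand-ins written for illustration, not the real BigJob classes; only the attribute name pilot_computes and the delegation pattern are taken from the snippets above.

```python
class FakeBigJob:
    """Hypothetical stand-in for the low-level bigjob object."""

    def __init__(self):
        self.cancelled = False

    def cancel(self):
        self.cancelled = True


class FakePilotCompute:
    """Hypothetical stand-in for a PilotJob; cancel() delegates to its bigjob."""

    def __init__(self):
        self._bigjob = FakeBigJob()

    def cancel(self):
        self._bigjob.cancel()


class FakePilotComputeService:
    """Hypothetical stand-in for the service; cancel() fans out to all pilots."""

    def __init__(self):
        self.pilot_computes = []

    def create_pilot(self):
        pilot = FakePilotCompute()
        self.pilot_computes.append(pilot)
        return pilot

    def cancel(self):
        for pilot in self.pilot_computes:
            pilot.cancel()


service = FakePilotComputeService()
pilots = [service.create_pilot() for _ in range(3)]
service.cancel()

# True only if the service-level cancel reached every pilot's bigjob.
all_cancelled = all(p._bigjob.cancelled for p in pilots)
print(all_cancelled)
```

If the real service behaves like this sketch, any pilot that survives pilot_service.cancel() points to a failure further down the chain rather than to a missing loop.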
And __bigjob.cancel() looks like it should do the trick?:
    def cancel(self):
        """ duck typing for cancel of saga.cpr.job and saga.job.job """
        logger.debug("Cancel Pilot Job")
        try:
            if self.url.scheme.startswith("condor")==False:
                self.job.cancel()
            else:
                pass
                #logger.debug("Output files are being transfered to file: outpt.tar.gz. Please wait until transfer is complete.")
        except:
            pass
            #traceback.print_stack()

        logger.debug("Cancel Job Service")
        try:
            if not self._pool.del_value (self.js) :
                del (self.js)
            self.js = None
        except:
            pass
            #traceback.print_stack()

        try:
            self._stop_pilot_job()
            logger.debug("delete pilot job: " + str(self.pilot_url))
            if _CLEANUP:
                self.coordination.delete_pilot(self.pilot_url)
            #os.remove(os.path.join("/tmp", "bootstrap-"+str(self.uuid)))
        except:
            pass
            #traceback.print_stack()
        logger.debug("Cancel Pilot Job finished")
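One thing worth noting about the method above: every step is wrapped in a bare except: pass, so a failing self.job.cancel() or self._stop_pilot_job() would go unnoticed and could leave agents running. The following is a rough sketch of an alternative pattern that still runs every cleanup step but logs what went wrong. The helper run_cleanup_steps and the two simulated steps are hypothetical names made up for this example; they are not part of BigJob.

```python
import logging

logger = logging.getLogger("cancel-sketch")


def run_cleanup_steps(steps):
    """Run every (name, callable) step and log failures instead of hiding them.

    Returns the names of the steps that raised an exception.
    """
    failed = []
    for name, step in steps:
        try:
            step()
        except Exception:
            # logger.exception records the message together with the traceback.
            logger.exception("cleanup step %r failed", name)
            failed.append(name)
    return failed


def cancel_job():
    # Simulated failure, standing in for a backend that rejects the cancel.
    raise RuntimeError("backend refused cancel")


def stop_pilot_job():
    # Simulated success.
    pass


failed = run_cleanup_steps([
    ("cancel job", cancel_job),
    ("stop pilot job", stop_pilot_job),
])
print(failed)
```

With this shape the later steps still run after an earlier one fails, which matches the behavior of the original method, but the log shows which step is responsible when pilots are left behind.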
Is this ticket still valid? I thought we fixed this in BigJob?
I think this has been resolved. Did it pop up again?
No - Melissa just stumbled over the ticket when checking something for Matteo, and we wondered why it was still open...
Then close it… ;-)
On Aug 27, 2013, at 22:29, Andre Merzky <notifications@github.com> wrote:
No - Melissa just stumbled over the ticket when checking something for Matteo, and we wondered why it was still open...
Bang bang!
This script https://github.com/saga-project/BigJob/blob/develop-prod/tests/test_connection_pooling.py leaves a lot of zombies behind on repex2 where it is integrated with Jenkins:
As per Melissa's suggestion, I use
pilot_service.cancel()
at the end of the script. But this doesn't seem to cancel the individual pilots / agents. Do I need to cancel them individually? I could obviously put that back into the code; however, I think that PJs should get canceled implicitly if you cancel their 'parent' service.
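As a stopgap for a test script, the explicit variant could look roughly like the sketch below: cancel each pilot yourself, then the service, and do both in a finally block so cleanup runs even when the test body fails. StubPilot, StubService and shutdown are hypothetical names for illustration only; the stub service deliberately does not propagate cancel(), to mimic the behavior described in this report.

```python
class StubPilot:
    """Hypothetical pilot that only records whether cancel() was called."""

    def __init__(self):
        self.cancelled = False

    def cancel(self):
        self.cancelled = True


class StubService:
    """Hypothetical service whose cancel() does not reach its pilots."""

    def __init__(self):
        self.cancelled = False

    def cancel(self):
        self.cancelled = True


def shutdown(service, pilots):
    """Cancel every pilot explicitly, then the service itself."""
    try:
        for pilot in pilots:
            pilot.cancel()
    finally:
        # Still cancel the service if one of the pilot cancels raised.
        service.cancel()


service = StubService()
pilots = [StubPilot(), StubPilot()]
try:
    pass  # the actual test body would run here
finally:
    shutdown(service, pilots)

print(all(p.cancelled for p in pilots), service.cancelled)
```

This keeps the test script free of leftover agents regardless of whether the service cancels its pilots implicitly.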