oar-team / oar

OAR is a versatile resource and task manager (also called a batch scheduler) for clusters and other computing infrastructures.
http://oar.imag.fr/
GNU General Public License v2.0

In deploy/cosystem jobs, oarsub interactive connections are not terminated when the job is killed #172

Closed npf closed 5 years ago

npf commented 5 years ago

Whenever the job is killed by oardel or reaches its walltime, oarexec does not kill the processes of the interactive oarsub shells (oarsub -I or oarsub -C).

The oarexec debug message (in /var/lib/oar/oar.log on the frontend/first node, when OAREXEC_DEBUG is activated) is:

[oarexec 187718] want to kill oarsub connections  BUT it seems that there is no OAR oarsub processes

This is actually also the case for regular jobs (neither deploy nor cosystem), but in that case the oarsub processes are killed by the job_resource_manager (cpuset clean-up).
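
To check whether interactive oarsub processes survive after a job has been killed, something like the following could be run on the frontend/first node. This is only a rough sketch and not part of OAR: the helper name `find_oarsub_interactive` is made up for illustration, and matching on a command-line token named `oarsub` together with `-I` or `-C` is an assumption about how these processes show up in `ps`, which may differ between installations.

```python
import os
import subprocess


def find_oarsub_interactive(ps_output):
    """Return (pid, user, command) tuples for lines that look like
    `oarsub -I` or `oarsub -C` in `ps -eo pid,user,args` output.

    Heuristic (assumption): some token's basename is "oarsub" and the
    command line contains a standalone -I or -C argument.
    """
    matches = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        # Skip the header line and anything without a numeric PID.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        pid, user, command = fields
        argv = command.split()
        is_oarsub = any(os.path.basename(tok) == "oarsub" for tok in argv)
        if is_oarsub and ("-I" in argv or "-C" in argv):
            matches.append((int(pid), user, command))
    return matches


if __name__ == "__main__":
    # List every process with its PID, owner and full command line.
    ps = subprocess.run(
        ["ps", "-eo", "pid,user,args"],
        capture_output=True, text=True, check=True,
    )
    for pid, user, command in find_oarsub_interactive(ps.stdout):
        print(pid, user, command)
```

Running it once while the job is alive and again after oardel (or after the walltime has expired) would show whether the same PIDs are still present.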