Open insatomcat opened 2 weeks ago
Hello, the number of SSH connections you see depends on how many workers you launch the app with.
Currently every worker has or creates its own tmp file inside the slurm_workspace, and each one handles its own study launches.
There's a loop in the code, the method _loop inside slurm_launcher.py, which the worker uses to ask Slurm for the state of the running job.
The loop runs every 2 seconds and makes only one SSH connection per iteration (I believe), so I don't really know why you see so many connections.
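As a rough sanity check, here is a small sketch of the rate one would expect if each worker really makes a single SSH connection per loop iteration. The function name and its parameters are hypothetical, and the one-connection-per-iteration figure is only my belief, not something verified in the code.

```python
def expected_ssh_rate(workers, connections_per_iteration=1, interval_seconds=2.0):
    """Estimate SSH connections per second across all workers.

    Assumes every worker polls on the same fixed interval and opens
    `connections_per_iteration` connections each time (an assumption).
    """
    return workers * connections_per_iteration / interval_seconds


# With 4 workers and one connection every 2 seconds:
rate = expected_ssh_rate(4)  # 2.0 connections per second
```

Under these assumptions, the 10-12 connections per second reported in the description would require roughly 20-24 connections per 2-second iteration in total, so either there are many workers or each iteration opens more than one connection.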
Also, I think the fact that it never stops is a bug, as the method stop() inside the same file is supposed to stop the loop.
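To illustrate what I would expect from a stoppable polling loop, here is a minimal sketch. It is not the actual code of slurm_launcher.py: the class PollingLoop, the check_state callback and the use of threading.Event are all assumptions made for the example.

```python
import threading


class PollingLoop:
    """Hypothetical stand-in for a launcher polling loop (not antarest's code)."""

    def __init__(self, check_state, interval=2.0):
        self._check_state = check_state  # callable that queries the job state
        self._interval = interval        # seconds between two polls
        self._stop_event = threading.Event()
        self._thread = None

    def start(self):
        self._stop_event.clear()
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        # Poll until stop() is called.
        while not self._stop_event.is_set():
            self._check_state()
            # wait() returns early as soon as the event is set.
            self._stop_event.wait(self._interval)

    def stop(self):
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join()
```

With this shape, no further call to check_state happens once stop() has returned. If connections keep appearing after the last job has finished, it would be worth checking whether stop() is ever called and whether the loop condition actually observes it.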
@sylvlecl if you have an explanation, feel free to share it.
Description
When using a Slurm launcher, I can see antarest making a lot of SSH connections to Slurm (about 10-12 per second). They start with the first launch of a study and never stop unless I shut down the antarest container.
That leads to 2 questions:
If this is specific to my setup, any advice on how I should debug this?
Thanks.