Open woestler opened 1 year ago
Hi, I am facing the same issue. Could someone please help with this? My understanding is that dask-gateway sets the environment variable pointing to dask.crt in the staging location, but never copies dask.crt to that location.
@woestler - Did you resolve this? @TomAugspurger, @jacobtomlinson, @martindurant - Any support will be much appreciated
Could be related: https://github.com/dask/distributed/issues/4617
I encountered a problem when creating a cluster on HPC using Slurm and dask-gateway-server. My understanding of the process is as follows: when dask-gateway-server receives the new_cluster command from the client, it converts the command into an `sbatch` command. I edited the `dask_gateway_server/backends/jobqueue/slurm.py` file to print the variables `cmd`, `env`, and `script` in `get_submit_cmd_env_stdin`. The output is as follows:
cmd
env
script
When a Slurm node receives this job and begins execution, and that node is not the edge node, it looks for the dask.crt and dask.pem files referenced in the environment variables above, but these files do not exist on that node. The Slurm job fails, and the error message is as follows:
@jcrist @consideRatio @TomAugspurger @jacobtomlinson @martindurant
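To check the diagnosis from inside a Slurm job, a small script along these lines could be run on the compute node. It is only a sketch: the helper name `find_missing_tls_files` is hypothetical and not part of dask-gateway, and it assumes the certificate and key paths show up as environment variable values ending in `dask.crt` or `dask.pem`, as described above.

```python
import os


def find_missing_tls_files(env=None, names=("dask.crt", "dask.pem")):
    """Return {variable: path} for TLS files named in env that are absent here.

    Hypothetical diagnostic helper, not part of dask-gateway. It scans
    environment variable values instead of assuming specific variable names.
    """
    env = os.environ if env is None else env
    missing = {}
    for key, value in env.items():
        # Only consider values that look like paths to the TLS files.
        if os.path.basename(value) in names and not os.path.isfile(value):
            missing[key] = value
    return missing


if __name__ == "__main__":
    # Run on the node that executes the Slurm job, not on the edge node.
    for key, path in sorted(find_missing_tls_files().items()):
        print(f"{key} -> {path} (not found on this node)")
```

If the script reports missing paths on the compute node but not on the edge node, the staging directory is probably not on a filesystem shared between the two.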