Open davide-q opened 1 year ago
Hi @davide-q, thanks for raising this issue.
If one doesn't specify a network interface, the default one is used. I think in the end this comes from https://github.com/dask/distributed/blob/0063de53fed5e4e2e409940213c6265867e6635d/distributed/utils.py#L157. Usually it will be the default first ethernet interface.
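To illustrate how a "default" address can be chosen without any configuration, here is a minimal standard-library sketch of the common UDP-connect trick. It rests on the assumption, not confirmed in this thread, that the linked helper works roughly along these lines. `guess_default_ip` and its parameters are hypothetical names and are not part of dask or distributed.

```python
import socket


def guess_default_ip(host="8.8.8.8", port=80):
    """Return the local IPv4 address the OS would use to reach `host`.

    Hypothetical helper for illustration; not part of dask or distributed.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packet; it only asks the
        # kernel to pick a route, and therefore a local address.
        sock.connect((host, port))
        return sock.getsockname()[0]
    except OSError:
        # No usable route (e.g. an isolated node): fall back to loopback.
        return "127.0.0.1"
    finally:
        sock.close()


if __name__ == "__main__":
    print(guess_default_ip())
```

The address returned depends on the routing table of the machine it runs on, which is why a login node and a compute node can give different answers.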
There are two ways to specify arguments like that: either through code or through a YAML configuration file. This is true for all kwargs, see https://jobqueue.dask.org/en/latest/configuration-setup.html#configure-dask-jobqueue and https://jobqueue.dask.org/en/latest/configuration.html. So I don't think we want to repeat the same sentence in the docstring for every kwarg.
But I'm totally open to adding a sentence explaining the default behavior (if no interface argument is given through code or through a configuration file).
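The lookup order described above (explicit keyword argument first, then the configuration file, then no preference at all) can be sketched in plain Python. This is an illustration under stated assumptions, not the dask-jobqueue implementation: `resolve_interface` and `DEFAULT_CONFIG` are hypothetical names, and the nested dictionary merely stands in for the loaded YAML file.

```python
# Stand-in for the loaded YAML configuration. The default file ships
# with a null value for interface, which loads as None in Python.
DEFAULT_CONFIG = {"jobqueue": {"slurm": {"interface": None}}}


def resolve_interface(interface=None, config=DEFAULT_CONFIG, scheduler="slurm"):
    """Return the interface name to use, or None for "no preference".

    Hypothetical helper: an explicit keyword argument wins, then the
    configuration value. None means the caller falls back to whatever
    address the operating system selects by default.
    """
    if interface is not None:
        return interface
    return config["jobqueue"][scheduler]["interface"]


if __name__ == "__main__":
    print(resolve_interface("ib0"))  # explicit argument
    print(resolve_interface())       # nothing set anywhere
```

With both the argument and the configuration left unset, the result is `None`, which corresponds to the "default interface is used" case discussed in this thread.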
Is it guaranteed to work always if at least one interface is present?
It is guaranteed to use an interface, but the default interface on the Scheduler side (a login node, for example) might not be the same as on the Worker side (compute nodes), or the nodes might not even have the same interfaces. More often, you simply won't use the most performant interface, defaulting to eth0 instead of ib0 (InfiniBand).
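Since the default may not be the fastest interface, one option is to check which interfaces a node actually has and choose explicitly. The sketch below uses only the standard library. `pick_interface` is a hypothetical helper, and the `("ib0", "eth0")` ordering is an example preference, not a dask default.

```python
import socket


def pick_interface(preferred=("ib0", "eth0"), available=None):
    """Return the first preferred interface present on this node, else None.

    Hypothetical helper; `preferred` is an example ordering only.
    """
    if available is None:
        # socket.if_nameindex() returns (index, name) pairs for the
        # network interfaces of the machine this runs on.
        available = [name for _, name in socket.if_nameindex()]
    for name in preferred:
        if name in available:
            return name
    return None


if __name__ == "__main__":
    print(pick_interface())
```

The result could then be passed as the interface argument. Because login and compute nodes may expose different interfaces, it is worth running the check on both kinds of node rather than assuming they match.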
The documentation at https://jobqueue.dask.org/en/latest/generated/dask_jobqueue.SLURMCluster.html (and the one for the other schedulers) describes the interface argument, but it's unclear what happens if one doesn't specify it. Looking at the code, it appears that a default is used, which is taken from the config. The default config.yaml file has a null value for interface, so even looking in there one goes around in circles.

I propose:
1. adding "If no interface str is specified the default from the jobqueue.yaml file is utilized" to the documentation
2. explaining what happens if no jobqueue.yaml file is present (so the one from the install directory must be used).

What happens in the second scenario is still unclear to me. On the machine I use it clearly works, so some interface is utilized, but which one? I have both eth0 and ib0. Is it guaranteed to work always if at least one interface is present? How is it chosen if more than one is present?