Open rzuidhof opened 6 years ago
As far as I can tell, the Dask workers and scheduler on Yarn nodes only look at /etc/dask/ for configuration. It would be nice if files in ~/.config/dask/ were also used.
This shouldn't be true; the same configuration loading code runs on the worker and scheduler nodes as it does locally. However, older versions of dask-yarn included a patch for an old version of distributed that ignored configuration on the worker and scheduler nodes. This was removed in a recent release. The current release is 0.4.1; if you're using an older release, I suggest upgrading and trying again.
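For reference, the directories dask searches for YAML configuration can be inspected on any node; a quick check like the following, run both locally and inside a worker container, would show whether the loading really differs (assuming a dask version recent enough to have the dask.config module):

import dask

# Directories searched for *.yaml config files, e.g. /etc/dask and
# ~/.config/dask (plus DASK_CONFIG if that environment variable is set).
print(dask.config.paths)

# The merged configuration actually in effect on this node.
print(dask.config.config)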
A better option would be to allow uploading the local Dask configuration to HDFS so no Dask config is needed on the Yarn cluster nodes.
Hmmm, I'm not sure what's expected here. I agree this might be nice, but if dask has configuration already on each node, implicitly copying over the local configuration might lead to unexpected behavior.
A few options:
1. A keyword argument. Simple, but may not support all cases. We could look in the configuration for TLS credential paths and copy those over as well if needed (a usage sketch follows this list):
copy_configuration : bool
Whether to copy the local dask configuration to every node. Default is True.
2. Make it easier for users to compose their own specification. There are too many options for configuring dask (some exposed via the configuration, some not). Instead of trying to make it possible for users of dask-yarn to configure everything with keyword arguments, we could make it easier for them to build their own specification. This would allow forwarding additional files, adding additional commands, etc. Starting a cluster from your own specification is already supported via YarnCluster.from_specification; we just don't provide much help in building up those specs. The current spec-building code is here
3. For the specific case of securing communication inside a cluster, we could handle this internally by using the certs already passed around by skein. This would mean that users couldn't (easily) override the certificates being used, but it would result in secure worker-scheduler-client communication with no extra effort from the user. It would also mean that every cluster could only be communicated with by the user who started it.
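For illustration, option 1 might look like this from the user's side; copy_configuration is only the proposed keyword from above, not something dask-yarn currently accepts, and the other arguments are placeholders:

from dask_yarn import YarnCluster

# Hypothetical usage of the proposed copy_configuration keyword (option 1);
# this keyword does not exist in dask-yarn today.
cluster = YarnCluster(
    environment='environment.tar.gz',
    worker_vcores=2,
    worker_memory='4GiB',
    copy_configuration=True,  # ship the client's dask config to every container
)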
Right now I'm leaning towards 1, and we should probably do 2 anyway at some point. I'm not sure if 3 is a good idea or not.
I have updated dask-yarn from 0.4.0 to 0.4.1 (and all other conda-forge packages), but the result stays the same. If I move the configuration from /etc/dask/ to ~/.config/dask/ it is no longer used. Naming the file distributed.yaml or yarn.yaml makes no difference. This is on HDP 2.6. Of course the Dask worker and scheduler processes are running under the same user ID as the client.
When I look at option 2, it seems that only the dask-yarn specification can be configured, while the TLS configuration lives at the general dask.distributed level (the settings in question are sketched below). In the end the workers and scheduler are started with a dask-yarn CLI command, but that offers no option to specify a Dask configuration file either.
Option 1 is very welcome.
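For context, the TLS settings in question live under distributed.comm.tls in the general dask configuration rather than under the yarn section. A minimal sketch of what they look like, set programmatically here only to show the key names (the paths are placeholders and the exact key layout may differ between distributed versions):

import dask

# Equivalent to entries in distributed.yaml; in practice these would be
# per-user certificate paths.
dask.config.set({
    'distributed.comm.tls.ca-file': '/path/to/ca.pem',
    'distributed.comm.tls.scheduler.cert': '/path/to/scheduler.pem',
    'distributed.comm.tls.scheduler.key': '/path/to/scheduler-key.pem',
    'distributed.comm.tls.worker.cert': '/path/to/worker.pem',
    'distributed.comm.tls.worker.key': '/path/to/worker-key.pem',
})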
Is your cluster using kerberos? If not, whoami will return the yarn daemon user, but the USER environment variable will be set to your username. I'm not sure what Python does to determine the location of the home directory, but that may be causing the issue here. In either case, having writable home directories for every user on the worker nodes is unusual, and we should have a better option here.
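For reference, the user-level config directory comes from os.path.expanduser, which on POSIX prefers the HOME environment variable and only falls back to the passwd database. A quick check that could be run inside a container to see which home directory applies (plain standard library, no assumptions beyond a POSIX node):

import os
import pwd

# HOME (if set) decides what '~' expands to; otherwise the passwd entry
# for the container's uid is used, which on YARN is the daemon user.
print('HOME =', os.environ.get('HOME'))
print('USER =', os.environ.get('USER'))
print('passwd home =', pwd.getpwuid(os.getuid()).pw_dir)
print('expanduser  =', os.path.expanduser('~'))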
Would we want to take configuration as an input to YarnCluster and send that as a yaml file along with everything else to the workers?
I thought about that, but then you have the confusing question of whether that configuration is also used locally or just serialized. I'd rather mirror configuration across all nodes, or have separate scheduler_configuration and worker_configuration parameters. Perhaps the second would be more useful, not sure.
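Roughly, the second variant could look like this from the caller's side; both keywords are hypothetical and the nested dict contents are placeholders:

from dask_yarn import YarnCluster

# Hypothetical separate config keywords; neither exists in dask-yarn today.
cluster = YarnCluster(
    environment='environment.tar.gz',
    scheduler_configuration={'distributed': {'comm': {'tls': {'ca-file': '/certs/ca.pem'}}}},
    worker_configuration={'distributed': {'comm': {'tls': {'ca-file': '/certs/ca.pem'}}}},
)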
Is your cluster using kerberos? If not, whoami will return the yarn daemon user, but the USER environment variable will be set to your username.
Yes, the cluster is kerberized.
In either case, having writable home directories for every user on the worker nodes is unusual, and we should have a better option here.
Indeed, the average user does not have access to the worker nodes. I would be distributing the required configuration using configuration management tooling.
Hello all, I am trying to deploy Dask in an Anaconda Enterprise environment, but I keep getting this error: "file not found yarn: yarn". What I tried was this:
from dask_yarn import YarnCluster
from dask.distributed import Client
cluster = YarnCluster(environment='environment.tar.gz', worker_vcores=2, worker_memory="8GiB")
The simplest solution for me would be to allow dask-yarn to pass TLS options to the scheduler and worker. In that case I could create a custom skein specification for each user that sends the certificates to the worker nodes and tells the scheduler to use them.
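As a rough sketch of that approach, assuming a hand-written skein spec file named my-dask-yarn-spec.yaml, service names matching the ones dask-yarn generates (dask.scheduler / dask.worker), and placeholder certificate paths; skein's Service/ApplicationSpec fields may differ between versions:

import skein
from dask_yarn import YarnCluster

# Load a hand-written skein application spec and attach per-user TLS material
# plus a dask config file to each service, so every container receives them.
spec = skein.ApplicationSpec.from_file('my-dask-yarn-spec.yaml')

for name in ('dask.scheduler', 'dask.worker'):
    service = spec.services[name]
    service.files['.certs/ca.pem'] = skein.File(source='/home/me/certs/ca.pem')
    service.files['.certs/cert.pem'] = skein.File(source='/home/me/certs/cert.pem')
    service.files['.certs/key.pem'] = skein.File(source='/home/me/certs/key.pem')
    # Ship a dask config directory and point dask at it via DASK_CONFIG,
    # so the scheduler and workers pick up the matching TLS settings.
    service.files['.config/dask/yarn.yaml'] = skein.File(source='/home/me/.config/dask/yarn.yaml')
    service.env['DASK_CONFIG'] = '.config/dask'

cluster = YarnCluster.from_specification(spec)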
I am trying to achieve user separation, and that seems only possible by specifying a different CA for each user. I have created a feature request for this purpose: https://github.com/dask/distributed/issues/2347