dask / dask-yarn

Deploy dask on YARN clusters
http://yarn.dask.org
BSD 3-Clause "New" or "Revised" License

Conda environment does not activate #134

Open. rstuckey opened this issue 3 years ago

rstuckey commented 3 years ago

Hi,

The current method for starting a conda environment local to each node does not work correctly:

from dask_yarn import YarnCluster

# Use a conda environment at /path/to/my/conda/env
cluster = YarnCluster(environment='conda:///path/to/my/conda/env')

It seems the environment does not activate, as discussed here.

I am running dask-yarn 0.8.1.

A possible workaround, which I have been using successfully, is to replace line 126 of dask_yarn/core.py:

            setup = "conda activate %s" % path

with the following:

            conda_root, conda_env = path.split("/envs/")
            setup = "source %s/etc/profile.d/conda.sh && conda activate %s" % (conda_root, conda_env)

for environments stored in a standard location such as /opt/anaconda3/envs/my_conda_env.
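For paths that do not follow the <conda_root>/envs/<env_name> layout, the split above raises a ValueError. The sketch below is one possible way to guard against that. It is not part of dask-yarn: the helper name build_conda_setup is hypothetical, and it assumes conda.sh lives at <conda_root>/etc/profile.d/conda.sh, as in the workaround above. When the path does not contain "/envs/", it falls back to the existing "conda activate" command.

```python
def build_conda_setup(path):
    """Hypothetical helper (not in dask-yarn): build the shell setup command.

    Assumes environments live at <conda_root>/envs/<env_name> and that
    conda.sh is found at <conda_root>/etc/profile.d/conda.sh.
    """
    # Split on the last "/envs/" so a root that itself contains "/envs/"
    # is still handled.
    conda_root, sep, conda_env = path.rpartition("/envs/")
    if not sep or not conda_root or not conda_env:
        # Not a standard layout: keep the existing behaviour unchanged.
        return "conda activate %s" % path
    return "source %s/etc/profile.d/conda.sh && conda activate %s" % (
        conda_root,
        conda_env,
    )


if __name__ == "__main__":
    print(build_conda_setup("/opt/anaconda3/envs/my_conda_env"))
    print(build_conda_setup("/path/to/my/conda/env"))
```

Splitting on the last occurrence of "/envs/" keeps the environment name intact even if the installation prefix happens to contain the same directory name.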

The cluster can then be started with:

cluster = YarnCluster(environment='conda:///opt/anaconda3/envs/my_conda_env')

Please let me know if you would like me to submit a PR with the above.

Cheers, Roger

dkoes commented 7 months ago

Thank you for this fix! Too bad development seems to have stopped...