dask / dask-yarn

Deploy dask on YARN clusters
http://yarn.dask.org
BSD 3-Clause "New" or "Revised" License

YarnCluster() does not initialize but runs indefinitely #157

Open casgie opened 2 years ago

casgie commented 2 years ago

What happened: dask-yarn is not able to initialize the cluster; the call runs indefinitely.

What you expected to happen: The cluster is initialized correctly.

Minimal Complete Verifiable Example:

from dask_yarn import YarnCluster
from dask.distributed import Client

cluster = YarnCluster(environment='path/to/env.tar.gz',
                      worker_vcores=2,
                      worker_memory="8GiB")
cluster.scale(2)

client = Client(cluster)

Anything else we need to know?: Skein runs and is able to submit YARN jobs successfully.

Console Log:

/home/jovyan/.local/lib/python3.7/site-packages/dask_yarn/core.py:16: FutureWarning: format_bytes is deprecated and will be removed in a future release. Please use dask.utils.format_bytes instead.
  from distributed.utils import (
/home/jovyan/.local/lib/python3.7/site-packages/dask_yarn/core.py:16: FutureWarning: parse_timedelta is deprecated and will be removed in a future release. Please use dask.utils.parse_timedelta instead.
  from distributed.utils import (
WARNING: log4j.properties is not found. HADOOP_CONF_DIR may be incomplete.

22/06/15 12:13:51 INFO client.RMProxy: Connecting to ResourceManager at hostname.local/xx.xxx.xx.x:8032
22/06/15 12:13:52 INFO skein.Driver: Driver started, listening on 42731
22/06/15 12:13:52 INFO conf.Configuration: resource-types.xml not found
22/06/15 12:13:52 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
22/06/15 12:13:52 INFO skein.Driver: Uploading application resources to hdfs://user/jovyan/.skein/application_1654689977550_0381
22/06/15 12:13:52 INFO skein.Driver: Submitting application...
22/06/15 12:13:52 INFO impl.YarnClientImpl: Submitted application application_1654689977550_0381

YARN shows this job as finished. Passing the environment as python://path/to/binary, hdfs://path/to/tar.gz, or path/to/tar.gz makes no difference.

Environment:

jacobtomlinson commented 2 years ago

Does YARN give you any logs about what the job did?
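For anyone following along, here is a minimal sketch of one way to look for those logs. It pulls the application ID out of the console output and builds the standard `yarn logs -applicationId <id>` command. The helper names (`find_application_id`, `yarn_logs_command`) are made up for illustration and are not part of dask-yarn or skein. It also assumes the `yarn` CLI is on the PATH and that log aggregation is enabled on the cluster; if it is not, the command may return nothing.

```python
import re
from typing import List, Optional

# YARN application IDs look like application_<cluster timestamp>_<sequence>.
APP_ID_RE = re.compile(r"application_\d+_\d+")


def find_application_id(console_log: str) -> Optional[str]:
    """Return the last YARN application ID mentioned in the log, or None."""
    matches = APP_ID_RE.findall(console_log)
    return matches[-1] if matches else None


def yarn_logs_command(app_id: str) -> List[str]:
    """Build the argument list for fetching aggregated logs (hypothetical helper)."""
    return ["yarn", "logs", "-applicationId", app_id]


if __name__ == "__main__":
    # Sample line copied from the console log in this issue.
    sample = (
        "22/06/15 12:13:52 INFO impl.YarnClientImpl: "
        "Submitted application application_1654689977550_0381"
    )
    app_id = find_application_id(sample)
    if app_id is not None:
        # Print the command instead of running it, so this works without a cluster.
        print(" ".join(yarn_logs_command(app_id)))
```

The resulting list can be passed to `subprocess.run(...)` on a machine with Hadoop client configuration to actually fetch the logs.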

casgie commented 2 years ago

No, YARN does not give any logs.

hermlon commented 9 months ago

I'm experiencing the same behavior with dask version 2023.5.0 and Python 3.8.13. Here are the logs from YARN: yarn-logs.txt

and the ones from running my test script:

python3 app.py 
/home/myusername/dask-scraper/venv/lib/python3.8/site-packages/dask_yarn/core.py:16: FutureWarning: format_bytes is deprecated and will be removed in a future release. Please use dask.utils.format_bytes instead.
  from distributed.utils import (
/home/myusername/dask-scraper/venv/lib/python3.8/site-packages/dask_yarn/core.py:16: FutureWarning: parse_timedelta is deprecated and will be removed in a future release. Please use dask.utils.parse_timedelta instead.
  from distributed.utils import (
WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS.
23/10/28 23:28:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
23/10/28 23:28:03 INFO client.RMProxy: Connecting to ResourceManager at node274-ib.ara/192.168.210.19:8032
23/10/28 23:28:03 INFO skein.Driver: Driver started, listening on 42057
23/10/28 23:28:04 INFO conf.Configuration: resource-types.xml not found
23/10/28 23:28:04 INFO resource.ResourceUtils: Unable to find 'resource-types.xml'.
23/10/28 23:28:04 INFO skein.Driver: Uploading application resources to hdfs://node274-ib.ara:8020/user/myusername/.skein/application_1682065044112_0121
23/10/28 23:28:05 INFO skein.Driver: Submitting application...
23/10/28 23:28:05 INFO impl.YarnClientImpl: Submitted application application_1682065044112_0121

It somehow looks like everything is working, but I'm never actually running anything besides starting the YarnCluster.