intel-analytics / analytics-zoo

Distributed Tensorflow, Keras and PyTorch on Apache Spark/Flink & Ray
https://analytics-zoo.readthedocs.io/
Apache License 2.0

fail when running yoloV3 sample on 0.12.0-snapshot docker container #50

Open haosux opened 2 years ago

haosux commented 2 years ago

Steps to reproduce:

1. Pull the Analytics Zoo docker image:

    docker pull intelanalytics/hyper-zoo:0.12.0-SNAPSHOT

2. Download the yolov3 example from https://github.com/intel-analytics/analytics-zoo/tree/master/pyzoo/zoo/examples/orca/learn/tf2/yolov3

3. Inside the container, install tensorflow and pyarrow:

    pip install tensorflow==2.4.1
    pip install pyarrow

4. Run the example inside the container:

    python yoloV3.py --data_dir ./ --weights yolov3.weights --class_num 20 --names voc2012.names

The run crashes with the following error:

(raylet) Traceback (most recent call last):
(raylet)   File "/usr/local/lib/python3.6/dist-packages/ray/new_dashboard/agent.py", line 334, in
(raylet)     raise e
(raylet)   File "/usr/local/lib/python3.6/dist-packages/ray/new_dashboard/agent.py", line 323, in
(raylet)     loop.run_until_complete(agent.run())
(raylet)   File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
(raylet)     return future.result()
(raylet)   File "/usr/local/lib/python3.6/dist-packages/ray/new_dashboard/agent.py", line 124, in run
(raylet)     dashboard_consts.RETRY_REDIS_CONNECTION_TIMES)
(raylet)   File "/usr/local/lib/python3.6/dist-packages/ray/new_dashboard/utils.py", line 662, in get_aioredis_client
(raylet)     return await aioredis.create_redis_pool(
(raylet) AttributeError: module 'aioredis' has no attribute 'create_redis_pool'
2021-10-13 06:37:24,116 WARNING worker.py:1107 -- The agent on node workgpu failed with the following error:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/ray/new_dashboard/agent.py", line 323, in
    loop.run_until_complete(agent.run())
  File "/usr/lib/python3.6/asyncio/base_events.py", line 484, in run_until_complete
    return future.result()
  File "/usr/local/lib/python3.6/dist-packages/ray/new_dashboard/agent.py", line 124, in run
    dashboard_consts.RETRY_REDIS_CONNECTION_TIMES)
  File "/usr/local/lib/python3.6/dist-packages/ray/new_dashboard/utils.py", line 662, in get_aioredis_client
    return await aioredis.create_redis_pool(
AttributeError: module 'aioredis' has no attribute 'create_redis_pool'

haosux commented 2 years ago

This may be a version issue with the Python package aioredis. The version installed in intelanalytics/hyper-zoo:0.12.0-SNAPSHOT is 2.0.0, and that version has no attribute 'create_redis_pool'.

Please see https://github.com/aio-libs/aioredis-py/issues/1082
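To confirm this diagnosis inside the container, something like the sketch below could be run. It only checks whether the imported module exposes `create_redis_pool`, the attribute named in the traceback. The helper names `describe_aioredis` and `load_aioredis` are made up for illustration, and reading `__version__` from the module is an assumption, so the code falls back to "unknown" if it is absent.

```python
import importlib
from types import SimpleNamespace


def describe_aioredis(module):
    """Return (version, has_pool_factory) for an aioredis-like module object.

    `__version__` is assumed to exist on the module; fall back to "unknown".
    """
    version = getattr(module, "__version__", "unknown")
    # This is the attribute Ray's dashboard agent fails to find in the traceback.
    has_pool_factory = hasattr(module, "create_redis_pool")
    return version, has_pool_factory


def load_aioredis():
    """Import aioredis if it is installed, otherwise return None."""
    try:
        return importlib.import_module("aioredis")
    except ImportError:
        return None


if __name__ == "__main__":
    mod = load_aioredis()
    if mod is None:
        print("aioredis is not installed")
    else:
        version, ok = describe_aioredis(mod)
        print(f"aioredis {version}: create_redis_pool present = {ok}")
```

The check can also be exercised without aioredis installed by passing a stand-in object, for example `describe_aioredis(SimpleNamespace(__version__="2.0.0"))`.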

jason-dai commented 2 years ago

@yangw1234 please take a look; could this be due to an incompatible Ray version?

yangw1234 commented 2 years ago

This is because aioredis was upgraded to version 2.0, which introduced API changes.

You can run pip install aioredis==1.3.1 to force a downgrade.
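As a rough guard, a setup script could decide whether the downgrade is needed before starting Ray. The sketch below assumes, based only on the comment above, that the breaking change arrived with major version 2; the function names are hypothetical and the version string is expected to be a plain dotted release such as "1.3.1".

```python
def major_version(version: str) -> int:
    """Extract the leading integer from a dotted version string like '1.3.1'."""
    return int(version.split(".")[0])


def needs_downgrade(version: str) -> bool:
    """Assumption: aioredis 2.x and later lack the API that this Ray build calls."""
    return major_version(version) >= 2


if __name__ == "__main__":
    for installed in ("1.3.1", "2.0.0"):
        action = "pip install aioredis==1.3.1" if needs_downgrade(installed) else "nothing to do"
        print(f"aioredis {installed}: {action}")
```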

haosux commented 2 years ago

OK, I downgraded it and the issue no longer occurs.