Open xunaichao opened 2 years ago
Please help solve it. Thank you
I'm going crazy
@xunaichao As mentioned in https://github.com/intel-analytics/analytics-zoo/blob/master/README.md, we have migrated the project to https://github.com/intel-analytics/bigdl; please try https://bigdl.readthedocs.io/en/latest/doc/Orca/QuickStart/orca-tf2keras-quickstart.html instead.
Hi @xunaichao
I checked the code and ran it on Google Colab, and I get this error as well. However, the error does not seem to affect or interrupt the run; you can find the training and evaluation results in your log. The error seems to come from the Ray dashboard, and I am not sure whether it is caused by the outdated Ray version.
As mentioned above, you are strongly recommended to switch to the latest version of BigDL. I ran the same example with BigDL in Google Colab and there was no such error: https://bigdl.readthedocs.io/en/latest/doc/Orca/QuickStart/orca-tf2keras-quickstart.html
@jason-dai @hkvision Thanks for your response. I have followed the instructions you gave: https://bigdl.readthedocs.io/en/latest/doc/Orca/QuickStart/orca-tf2keras-quickstart.html. I now run my yolov3.py and get an exception:
2022-06-01 10:01:10.069315: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2022-06-01 10:01:10.074183: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-06-01 10:01:10.074198: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Initializing orca context
Current pyspark location is : /usr/local/miniconda3/envs/py37/lib/python3.7/site-packages/pyspark/__init__.py
Start to getOrCreate SparkContext
pyspark_submit_args is: --driver-class-path /usr/local/miniconda3/envs/py37/lib/python3.7/site-packages/bigdl/share/core/lib/all-2.1.0-20220314.094552-2.jar:/usr/local/miniconda3/envs/py37/lib/python3.7/site-packages/bigdl/share/dllib/lib/bigdl-dllib-spark_2.4.6-2.0.0-jar-with-dependencies.jar:/usr/local/miniconda3/envs/py37/lib/python3.7/site-packages/bigdl/share/orca/lib/bigdl-orca-spark_2.4.6-2.0.0-jar-with-dependencies.jar pyspark-shell
2022-06-01 10:01:13 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
2022-06-01 10:01:14,896 Thread-4 WARN The bufferSize is set to 4000 but bufferedIo is false: false
2022-06-01 10:01:14,898 Thread-4 WARN The bufferSize is set to 4000 but bufferedIo is false: false
2022-06-01 10:01:14,899 Thread-4 WARN The bufferSize is set to 4000 but bufferedIo is false: false
2022-06-01 10:01:14,899 Thread-4 WARN The bufferSize is set to 4000 but bufferedIo is false: false
22-06-01 10:01:14 [Thread-4] INFO Engine$:121 - Auto detect executor number and executor cores number
22-06-01 10:01:14 [Thread-4] INFO Engine$:123 - Executor number is 1 and executor cores number is 4
User settings:
KMP_AFFINITY=granularity=fine,compact,1,0 KMP_BLOCKTIME=0 KMP_SETTINGS=1 OMP_NUM_THREADS=1
Effective settings:
KMP_ABORT_DELAY=0 KMP_ADAPTIVE_LOCK_PROPS='1,1024' KMP_ALIGN_ALLOC=64 KMP_ALL_THREADPRIVATE=416 KMP_ATOMIC_MODE=2 KMP_BLOCKTIME=0 KMP_CPUINFO_FILE: value is not defined KMP_DETERMINISTIC_REDUCTION=false KMP_DEVICE_THREAD_LIMIT=2147483647 KMP_DISP_HAND_THREAD=false KMP_DISP_NUM_BUFFERS=7 KMP_DUPLICATE_LIB_OK=false KMP_FORCE_REDUCTION: value is not defined KMP_FOREIGN_THREADS_THREADPRIVATE=true KMP_FORKJOIN_BARRIER='2,2' KMP_FORKJOIN_BARRIER_PATTERN='hyper,hyper' KMP_FORKJOIN_FRAMES=true KMP_FORKJOIN_FRAMES_MODE=3 KMP_GTID_MODE=3 KMP_HANDLE_SIGNALS=false KMP_HOT_TEAMS_MAX_LEVEL=1 KMP_HOT_TEAMS_MODE=0 KMP_INIT_AT_FORK=true KMP_ITT_PREPARE_DELAY=0 KMP_LIBRARY=throughput KMP_LOCK_KIND=queuing KMP_MALLOC_POOL_INCR=1M KMP_MWAIT_HINTS=0 KMP_NUM_LOCKS_IN_BLOCK=1 KMP_PLAIN_BARRIER='2,2' KMP_PLAIN_BARRIER_PATTERN='hyper,hyper' KMP_REDUCTION_BARRIER='1,1' KMP_REDUCTION_BARRIER_PATTERN='hyper,hyper' KMP_SCHEDULE='static,balanced;guided,iterative' KMP_SETTINGS=true KMP_SPIN_BACKOFF_PARAMS='4096,100' KMP_STACKOFFSET=64 KMP_STACKPAD=0 KMP_STACKSIZE=8M KMP_STORAGE_MAP=false KMP_TASKING=2 KMP_TASKLOOP_MIN_TASKS=0 KMP_TASK_STEALING_CONSTRAINT=1 KMP_TEAMS_THREAD_LIMIT=104 KMP_TOPOLOGY_METHOD=all KMP_USER_LEVEL_MWAIT=false KMP_USE_YIELD=1 KMP_VERSION=false KMP_WARNINGS=true OMP_AFFINITY_FORMAT='OMP: pid %P tid %i thread %n bound to OS proc set {%A}' OMP_ALLOCATOR=omp_default_mem_alloc OMP_CANCELLATION=false OMP_DEBUG=disabled OMP_DEFAULT_DEVICE=0 OMP_DISPLAY_AFFINITY=false OMP_DISPLAY_ENV=false OMP_DYNAMIC=false OMP_MAX_ACTIVE_LEVELS=2147483647 OMP_MAX_TASK_PRIORITY=0 OMP_NESTED=false OMP_NUM_THREADS='1' OMP_PLACES: value is not defined OMP_PROC_BIND='intel' OMP_SCHEDULE='static' OMP_STACKSIZE=8M OMP_TARGET_OFFLOAD=DEFAULT OMP_THREAD_LIMIT=2147483647 OMP_TOOL=enabled OMP_TOOL_LIBRARIES: value is not defined OMP_WAIT_POLICY=PASSIVE KMP_AFFINITY='noverbose,warnings,respect,granularity=fine,compact,1,0'
22-06-01 10:01:15 [Thread-4] INFO ThreadPool$:95 - Set mkl threads to 1 on thread 30
2022-06-01 10:01:15 WARN SparkContext:66 - Using an existing SparkContext; some configuration may not take effect.
22-06-01 10:01:15 [Thread-4] INFO Engine$:446 - Find existing spark context. Checking the spark conf...
cls.getname: com.intel.analytics.bigdl.dllib.utils.python.api.Sample
BigDLBasePickler registering: bigdl.dllib.utils.common Sample
cls.getname: com.intel.analytics.bigdl.dllib.utils.python.api.EvaluatedResult
BigDLBasePickler registering: bigdl.dllib.utils.common EvaluatedResult
cls.getname: com.intel.analytics.bigdl.dllib.utils.python.api.JTensor
BigDLBasePickler registering: bigdl.dllib.utils.common JTensor
cls.getname: com.intel.analytics.bigdl.dllib.utils.python.api.JActivity
BigDLBasePickler registering: bigdl.dllib.utils.common JActivity
Successfully got a SparkContext
2022-06-01 10:01:18,220 INFO services.py:1340 -- View the Ray dashboard at http://172.27.0.2:8265
2022-06-01 10:01:18,225 WARNING services.py:1826 -- WARNING: The object store is using /tmp instead of /dev/shm because /dev/shm has only 67108864 bytes available. This will harm performance! You may be able to free up space by deleting files in /dev/shm. If you are inside a Docker container, you can increase /dev/shm size by passing '--shm-size=10.24gb' to 'docker run' (or add it to the run_options list in a Ray cluster config). Make sure to set this to more than 30% of available RAM.
{'node_ip_address': '172.27.0.2', 'raylet_ip_address': '172.27.0.2', 'redis_address': '172.27.0.2:15812', 'object_store_address': '/tmp/ray/session_2022-06-01_10-01-15_641395_1703868/sockets/plasma_store', 'raylet_socket_name': '/tmp/ray/session_2022-06-01_10-01-15_641395_1703868/sockets/raylet', 'webui_url': '172.27.0.2:8265', 'session_dir': '/tmp/ray/session_2022-06-01_10-01-15_641395_1703868', 'metrics_export_port': 47074, 'node_id': 'a6dd76c71c04c32df5e009bc951165e1b0e85486a8a75d23fb5ab9ed'}
(Worker pid=1704437) 2022-06-01 10:01:19.629608: I tensorflow/core/util/util.cc:169] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
(Worker pid=1704437) 2022-06-01 10:01:19.634737: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/miniconda3/envs/py37/lib/python3.7/site-packages/cv2/../../lib64:
(Worker pid=1704437) 2022-06-01 10:01:19.634753: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
(Worker pid=1704437) WARNING:tensorflow:From /usr/local/miniconda3/envs/py37/lib/python3.7/site-packages/bigdl/orca/learn/tf2/tf_runner.py:317: _CollectiveAllReduceStrategyExperimental.__init__ (from tensorflow.python.distribute.collective_all_reduce_strategy) is deprecated and will be removed in a future version.
(Worker pid=1704437) Instructions for updating:
(Worker pid=1704437) use distribute.MultiWorkerMirroredStrategy instead
(Worker pid=1704437) 2022-06-01 10:01:21.270040: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/miniconda3/envs/py37/lib/python3.7/site-packages/cv2/../../lib64:
(Worker pid=1704437) 2022-06-01 10:01:21.270095: W tensorflow/stream_executor/cuda/cuda_driver.cc:269] failed call to cuInit: UNKNOWN ERROR (303)
(Worker pid=1704437) 2022-06-01 10:01:21.270135: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (816d2073a24f): /proc/driver/nvidia/version does not exist
(Worker pid=1704437) 2022-06-01 10:01:21.271364: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F AVX512_VNNI FMA
(Worker pid=1704437) To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
(Worker pid=1704437) 2022-06-01 10:01:21.297690: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:272] Initialize GrpcChannelCache for job worker -> {0 -> 172.27.0.2:53169}
(Worker pid=1704437) 2022-06-01 10:01:21.297883: I tensorflow/core/distributed_runtime/rpc/grpc_channel.cc:272] Initialize GrpcChannelCache for job worker -> {0 -> 172.27.0.2:53169}
(Worker pid=1704437) 2022-06-01 10:01:21.299556: I tensorflow/core/distributed_runtime/rpc/grpc_server_lib.cc:438] Started server with target: grpc://172.27.0.2:53169
(raylet) /usr/local/miniconda3/envs/py37/lib/python3.7/site-packages/ray/dashboard/agent.py:152: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
(raylet) if LooseVersion(aiohttp.__version__) < LooseVersion("4.0.0"):
(raylet) /usr/local/miniconda3/envs/py37/lib/python3.7/site-packages/ray/dashboard/agent.py:152: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
(raylet) if LooseVersion(aiohttp.__version__) < LooseVersion("4.0.0"):
Traceback (most recent call last):
File "yolov3.py", line 656, in <module>
The code I used is attached here: yolov3.py.zip
conda list:
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
absl-py 1.0.0 pypi_0 pypi
aiohttp 3.8.1 pypi_0 pypi
aiohttp-cors 0.7.0 pypi_0 pypi
aioredis 1.3.1 pypi_0 pypi
aiosignal 1.2.0 pypi_0 pypi
anyio 3.6.1 pypi_0 pypi
astunparse 1.6.3 pypi_0 pypi
async-timeout 4.0.1 pypi_0 pypi
asynctest 0.13.0 pypi_0 pypi
attrs 21.4.0 pypi_0 pypi
bigdl 2.1.0b202205302 pypi_0 pypi
bigdl-chronos 2.1.0b202205302 pypi_0 pypi
bigdl-core 2.1.0b20220321 pypi_0 pypi
bigdl-dllib 2.1.0b202205302 pypi_0 pypi
bigdl-friesian 2.1.0b202205302 pypi_0 pypi
bigdl-math 0.14.0.dev1 pypi_0 pypi
bigdl-nano 2.1.0b202205302 pypi_0 pypi
bigdl-orca 2.1.0b202205302 pypi_0 pypi
bigdl-serving 2.1.0b202205302 pypi_0 pypi
bigdl-tf 0.14.0.dev1 pypi_0 pypi
blessed 1.19.1 pypi_0 pypi
ca-certificates 2022.4.26 h06a4308_0
cachetools 5.2.0 pypi_0 pypi
certifi 2022.5.18.1 py37h06a4308_0
chardet 3.0.4 pypi_0 pypi
charset-normalizer 2.0.12 pypi_0 pypi
click 8.1.3 pypi_0 pypi
cloudpickle 2.1.0 pypi_0 pypi
colorful 0.5.4 pypi_0 pypi
conda-pack 0.3.1 pypi_0 pypi
deprecated 1.2.13 pypi_0 pypi
filelock 3.7.1 pypi_0 pypi
flatbuffers 1.12 pypi_0 pypi
frozenlist 1.3.0 pypi_0 pypi
fsspec 2022.5.0 pypi_0 pypi
future 0.18.2 pypi_0 pypi
gast 0.4.0 pypi_0 pypi
google-api-core 2.8.1 pypi_0 pypi
google-auth 2.6.6 pypi_0 pypi
google-auth-oauthlib 0.4.6 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
googleapis-common-protos 1.56.2 pypi_0 pypi
gpustat 1.0.0b1 pypi_0 pypi
grpcio 1.46.3 pypi_0 pypi
h11 0.12.0 pypi_0 pypi
h5py 3.7.0 pypi_0 pypi
hiredis 2.0.0 pypi_0 pypi
httpcore 0.13.7 pypi_0 pypi
httpx 1.0.0b0 pypi_0 pypi
idna 3.3 pypi_0 pypi
importlib-metadata 4.11.4 pypi_0 pypi
importlib-resources 5.7.1 pypi_0 pypi
intel-openmp 2022.1.0 pypi_0 pypi
joblib 1.1.0 pypi_0 pypi
jsonschema 4.5.1 pypi_0 pypi
kafka-python 2.0.2 pypi_0 pypi
keras 2.9.0 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
ld_impl_linux-64 2.38 h1181459_1
libclang 14.0.1 pypi_0 pypi
libffi 3.3 he6710b0_2
libgcc-ng 11.2.0 h1234567_0
libgomp 11.2.0 h1234567_0
libstdcxx-ng 11.2.0 h1234567_0
markdown 3.3.7 pypi_0 pypi
msgpack 1.0.3 pypi_0 pypi
multidict 4.7.6 pypi_0 pypi
ncurses 6.3 h7f8727e_2
numpy 1.21.6 pypi_0 pypi
nvidia-ml-py3 7.352.0 pypi_0 pypi
oauthlib 3.2.0 pypi_0 pypi
onnx 1.11.0 pypi_0 pypi
onnxruntime 1.11.1 pypi_0 pypi
opencensus 0.9.0 pypi_0 pypi
opencensus-context 0.1.2 pypi_0 pypi
opencv-python 4.5.5.64 pypi_0 pypi
opencv-python-headless 4.5.5.64 pypi_0 pypi
opencv-transforms 0.0.6 pypi_0 pypi
openssl 1.1.1o h7f8727e_0
opt-einsum 3.3.0 pypi_0 pypi
packaging 21.3 pypi_0 pypi
pandas 1.2.5 pypi_0 pypi
patsy 0.5.2 pypi_0 pypi
pillow 9.1.1 pypi_0 pypi
pip 21.2.2 py37h06a4308_0
prometheus-client 0.14.1 pypi_0 pypi
protobuf 3.19.4 pypi_0 pypi
psutil 5.9.1 pypi_0 pypi
py-spy 0.3.12 pypi_0 pypi
py4j 0.10.7 pypi_0 pypi
pyarrow 8.0.0 pypi_0 pypi
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pydeprecate 0.3.1 pypi_0 pypi
pyparsing 3.0.9 pypi_0 pypi
pyrsistent 0.18.1 pypi_0 pypi
pyspark 2.4.6 pypi_0 pypi
python 3.7.13 h12debd9_0
python-dateutil 2.8.2 pypi_0 pypi
pytorch-lightning 1.4.2 pypi_0 pypi
pyturbojpeg 1.6.6 pypi_0 pypi
pytz 2022.1 pypi_0 pypi
pyyaml 6.0 pypi_0 pypi
pyzmq 23.0.0 pypi_0 pypi
ray 1.9.2 pypi_0 pypi
readline 8.1.2 h7f8727e_1
redis 4.1.4 pypi_0 pypi
requests 2.27.1 pypi_0 pypi
requests-oauthlib 1.3.1 pypi_0 pypi
rfc3986 1.5.0 pypi_0 pypi
rsa 4.8 pypi_0 pypi
scikit-learn 1.0.2 pypi_0 pypi
scipy 1.7.3 pypi_0 pypi
setproctitle 1.2.3 pypi_0 pypi
setuptools 61.2.0 py37h06a4308_0
six 1.16.0 pypi_0 pypi
smart-open 6.0.0 pypi_0 pypi
sniffio 1.2.0 pypi_0 pypi
sqlite 3.38.3 hc218d9a_0
statsmodels 0.13.2 pypi_0 pypi
tensorboard 2.9.0 pypi_0 pypi
tensorboard-data-server 0.6.1 pypi_0 pypi
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
tensorflow 2.9.1 pypi_0 pypi
tensorflow-estimator 2.9.0 pypi_0 pypi
tensorflow-io-gcs-filesystem 0.26.0 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
threadpoolctl 3.1.0 pypi_0 pypi
tk 8.6.11 h1ccaba5_1
torch 1.9.0 pypi_0 pypi
torchmetrics 0.7.2 pypi_0 pypi
torchvision 0.10.0 pypi_0 pypi
tqdm 4.64.0 pypi_0 pypi
typing-extensions 4.2.0 pypi_0 pypi
urllib3 1.26.9 pypi_0 pypi
wcwidth 0.2.5 pypi_0 pypi
werkzeug 2.1.2 pypi_0 pypi
wheel 0.37.1 pyhd3eb1b0_0
wrapt 1.14.1 pypi_0 pypi
xz 5.2.5 h7f8727e_1
yarl 1.7.2 pypi_0 pypi
zipp 3.8.0 pypi_0 pypi
zlib 1.2.12 h7f8727e_2
Thank you for your help!
It seems you may be trying to load weights in the wrong format:
./yolov3/yolov3.weights: DATA_LOSS: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?
You may need to convert the pre-trained Darknet weights first, as is done in the YOLOv3 example.
You can always refer to our YOLOv3 example in BigDL. Hope that helps.
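Before converting, it can help to confirm that the file really is a raw Darknet weights file rather than a TensorFlow checkpoint. The snippet below is only a rough sketch: the function name read_darknet_header is made up for illustration, and the header layout (three int32 version fields followed by a "seen" counter) is my reading of the Darknet loader, not something taken from the BigDL example.

```python
import struct


def read_darknet_header(path):
    """Read the header of a Darknet .weights file (hypothetical helper).

    Assumed layout: three little-endian int32 values (major, minor, revision),
    then the "seen" image counter, stored as 8 bytes for newer files
    (major * 10 + minor >= 2) and as 4 bytes otherwise.
    """
    with open(path, "rb") as f:
        raw = f.read(12)
        if len(raw) < 12:
            raise ValueError("file too short to be a Darknet weights file")
        major, minor, revision = struct.unpack("<3i", raw)

        # Newer Darknet versions widen the "seen" counter to 64 bits.
        wide = (major * 10 + minor) >= 2 and major < 1000 and minor < 1000
        size, fmt = (8, "<q") if wide else (4, "<i")
        seen_raw = f.read(size)
        if len(seen_raw) < size:
            raise ValueError("truncated Darknet header")
        (seen,) = struct.unpack(fmt, seen_raw)

    return {"major": major, "minor": minor, "revision": revision, "seen": seen}
```

If read_darknet_header("./yolov3/yolov3.weights") returns small, plausible version numbers, the file is most likely in Darknet format and has to be converted before a TensorFlow restore operation can read it.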
May I ask whether you hit the same error with your plain TensorFlow code (without using bigdl), i.e. with your tflocal mode?
We used this example, https://bigdl.readthedocs.io/en/latest/doc/Orca/QuickStart/orca-tf2keras-quickstart.html, to save the model, and changed the save module to:
We now get the .pb file successfully, but hit an exception when using the OpenVINO Model Optimizer to convert the model to IR format. The error is like this:
Model Optimizer arguments: Common parameters:
Make sure that --input_model_is_text is provided for a model in text format. By default, a model is interpreted in binary format. Framework error details: Error parsing message. For more information please refer to Model Optimizer FAQ, question #43. (https://docs.openvinotoolkit.org/latest/openvino_docs_MO_DG_prepare_model_Model_Optimizer_FAQ.html?question=43#question-43)
Can you help us? Thank you very much!
@yushan111 Thank you for the example, it helps a lot!
You will get a tf.keras model with est.get_model(), and you can save that model with the tf.saved_model API.
After that, how you use your TensorFlow model is up to you.
As for using OpenVINO to convert your TensorFlow model, you may want to open an issue in the OpenVINO project.
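One thing that may be worth checking before opening that issue: the saved_model.pb inside a SavedModel directory is not a frozen GraphDef, so passing it to Model Optimizer via --input_model can fail with a parsing error. The exact command used was not posted, so this is only a guess at the cause. The sketch below uses a hypothetical helper, mo_input_args, to pick between the Model Optimizer options --saved_model_dir and --input_model based on what the path looks like.

```python
import os


def mo_input_args(model_path):
    """Pick Model Optimizer input arguments for a TensorFlow model path.

    Hypothetical helper for illustration. A SavedModel is a directory that
    contains saved_model.pb and is passed with --saved_model_dir; any other
    single .pb file is assumed to be a frozen graph for --input_model.
    """
    if os.path.isdir(model_path):
        if not os.path.isfile(os.path.join(model_path, "saved_model.pb")):
            raise ValueError("directory does not contain saved_model.pb")
        return ["--saved_model_dir", model_path]

    if os.path.basename(model_path) == "saved_model.pb":
        # This file belongs to a SavedModel, so point at its directory.
        return ["--saved_model_dir", os.path.dirname(model_path) or "."]

    return ["--input_model", model_path]
```

The returned list can be appended to the Model Optimizer command line. If the conversion still fails with the right option, the problem lies elsewhere and the OpenVINO project is the better place to ask.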
@yushan111 thanks for your help
When I run the TensorFlow 2 example at https://analytics-zoo.readthedocs.io/en/latest/doc/Orca/QuickStart/orca-tf2keras-quickstart.html, I get the following error.
############
Error:
(raylet) Traceback (most recent call last):
(raylet) File "/usr/local/miniconda3/envs/zoo/lib/python3.7/site-packages/ray/new_dashboard/agent.py", line 334, in <module>
(raylet) raise e
(raylet) File "/usr/local/miniconda3/envs/zoo/lib/python3.7/site-packages/ray/new_dashboard/agent.py", line 323, in <module>
(raylet) loop.run_until_complete(agent.run())
(raylet) File "/usr/local/miniconda3/envs/zoo/lib/python3.7/asyncio/base_events.py", line 568, in run_until_complete
(raylet) return future.result()
(raylet) File "/usr/local/miniconda3/envs/zoo/lib/python3.7/site-packages/ray/new_dashboard/agent.py", line 138, in run
(raylet) modules = self._load_modules()
(raylet) File "/usr/local/miniconda3/envs/zoo/lib/python3.7/site-packages/ray/new_dashboard/agent.py", line 92, in _load_modules
(raylet) c = cls(self)
(raylet) File "/usr/local/miniconda3/envs/zoo/lib/python3.7/site-packages/ray/new_dashboard/modules/reporter/reporter_agent.py", line 72, in __init__
(raylet) self._metrics_agent = MetricsAgent(dashboard_agent.metrics_export_port)
(raylet) File "/usr/local/miniconda3/envs/zoo/lib/python3.7/site-packages/ray/metrics_agent.py", line 76, in __init__
(raylet) namespace="ray", port=metrics_export_port)))
(raylet) File "/usr/local/miniconda3/envs/zoo/lib/python3.7/site-packages/ray/prometheus_exporter.py", line 334, in new_stats_exporter
(raylet) options=option, gatherer=option.registry, collector=collector)
(raylet) File "/usr/local/miniconda3/envs/zoo/lib/python3.7/site-packages/ray/prometheus_exporter.py", line 266, in __init__
(raylet) self.serve_http()
(raylet) File "/usr/local/miniconda3/envs/zoo/lib/python3.7/site-packages/ray/prometheus_exporter.py", line 321, in serve_http
(raylet) port=self.options.port, addr=str(self.options.address))
(raylet) File "/usr/local/miniconda3/envs/zoo/lib/python3.7/site-packages/prometheus_client/exposition.py", line 168, in start_wsgi_server
(raylet) TmpServer.address_family, addr = _get_best_family(addr, port)
(raylet) File "/usr/local/miniconda3/envs/zoo/lib/python3.7/site-packages/prometheus_client/exposition.py", line 157, in _get_best_family
(raylet) infos = socket.getaddrinfo(address, port)
(raylet) File "/usr/local/miniconda3/envs/zoo/lib/python3.7/socket.py", line 753, in getaddrinfo
(raylet) for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
(raylet) socket.gaierror: [Errno -2] Name or service not known
##############
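The last frames of the traceback show socket.getaddrinfo failing with "Name or service not known", which usually means the address handed to the metrics exporter could not be resolved. Which address Ray passes there is not visible in the log, so the following is only a diagnostic sketch; can_resolve is a made-up helper name, and checking the container's own hostname is an assumption about where the problem might be.

```python
import socket


def can_resolve(host, port=None):
    """Return True if socket.getaddrinfo can resolve the given host."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Check the machine's (or container's) own hostname.
    name = socket.gethostname()
    status = "resolves" if can_resolve(name) else "does NOT resolve"
    print(name, status)
```

If the hostname does not resolve inside the Docker container, adding an entry for it to /etc/hosts is one thing to try.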
Hosts file
After running the example, session files are generated in /tmp/ray/ of the system
Runtime environment: Docker deployment uses Miniconda to install AZ and Ray
conda create -n zoo python=3.7
conda activate zoo
pip install --pre --upgrade analytics-zoo
pip install analytics-zoo[ray]
pip install tensorflow==2.3.0
conda list
Name Version Build Channel
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
absl-py 1.0.0 pypi_0 pypi
aiohttp 3.7.0 pypi_0 pypi
aiohttp-cors 0.7.0 pypi_0 pypi
aioredis 1.1.0 pypi_0 pypi
analytics-zoo 0.12.0b2022052501 pypi_0 pypi
astunparse 1.6.3 pypi_0 pypi
async-timeout 3.0.1 pypi_0 pypi
attrs 21.4.0 pypi_0 pypi
bigdl 0.13.1.dev1 pypi_0 pypi
blessings 1.7 pypi_0 pypi
ca-certificates 2022.4.26 h06a4308_0
cachetools 5.1.0 pypi_0 pypi
certifi 2022.5.18.1 py37h06a4308_0
chardet 3.0.4 pypi_0 pypi
charset-normalizer 2.0.12 pypi_0 pypi
click 8.1.3 pypi_0 pypi
colorama 0.4.4 pypi_0 pypi
colorful 0.5.4 pypi_0 pypi
conda-pack 0.3.1 pypi_0 pypi
deprecated 1.2.13 pypi_0 pypi
filelock 3.7.0 pypi_0 pypi
gast 0.3.3 pypi_0 pypi
google-api-core 2.8.0 pypi_0 pypi
google-auth 2.6.6 pypi_0 pypi
google-auth-oauthlib 0.4.6 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
googleapis-common-protos 1.56.1 pypi_0 pypi
gpustat 0.6.0 pypi_0 pypi
grpcio 1.46.3 pypi_0 pypi
h5py 2.10.0 pypi_0 pypi
hiredis 1.1.0 pypi_0 pypi
idna 3.3 pypi_0 pypi
importlib-metadata 4.11.4 pypi_0 pypi
importlib-resources 5.7.1 pypi_0 pypi
jsonschema 4.5.1 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
libedit 3.1.20210910 h7f8727e_0
libffi 3.2.1 hf484d3e_1007
libgcc-ng 11.2.0 h1234567_0
libgomp 11.2.0 h1234567_0
libstdcxx-ng 11.2.0 h1234567_0
markdown 3.3.7 pypi_0 pypi
msgpack 1.0.3 pypi_0 pypi
multidict 6.0.2 pypi_0 pypi
ncurses 6.3 h7f8727e_2
numpy 1.18.5 pypi_0 pypi
nvidia-ml-py3 7.352.0 pypi_0 pypi
oauthlib 3.2.0 pypi_0 pypi
opencensus 0.9.0 pypi_0 pypi
opencensus-context 0.1.2 pypi_0 pypi
opencv-python 4.5.5.64 pypi_0 pypi
openssl 1.0.2u h7b6447c_0
opt-einsum 3.3.0 pypi_0 pypi
packaging 21.3 pypi_0 pypi
pip 21.2.2 py37h06a4308_0
prometheus-client 0.14.1 pypi_0 pypi
protobuf 3.20.1 pypi_0 pypi
psutil 5.9.1 pypi_0 pypi
py-spy 0.3.12 pypi_0 pypi
py4j 0.10.7 pypi_0 pypi
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pyparsing 3.0.9 pypi_0 pypi
pyrsistent 0.18.1 pypi_0 pypi
pyspark 2.4.6 pypi_0 pypi
python 3.7.0 h6e4f718_3
pyyaml 6.0 pypi_0 pypi
ray 1.2.0 pypi_0 pypi
readline 7.0 h7b6447c_5
redis 4.1.4 pypi_0 pypi
requests 2.27.1 pypi_0 pypi
requests-oauthlib 1.3.1 pypi_0 pypi
rsa 4.8 pypi_0 pypi
scipy 1.4.1 pypi_0 pypi
setproctitle 1.2.3 pypi_0 pypi
setuptools 61.2.0 py37h06a4308_0
six 1.16.0 pypi_0 pypi
sqlite 3.33.0 h62c20be_0
tensorboard 2.9.0 pypi_0 pypi
tensorboard-data-server 0.6.1 pypi_0 pypi
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
tensorflow 2.3.0 pypi_0 pypi
tensorflow-estimator 2.3.0 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
tk 8.6.11 h1ccaba5_1
typing-extensions 4.2.0 pypi_0 pypi
urllib3 1.26.9 pypi_0 pypi
werkzeug 2.1.2 pypi_0 pypi
wheel 0.37.1 pyhd3eb1b0_0
wrapt 1.14.1 pypi_0 pypi
xz 5.2.5 h7f8727e_1
yarl 1.7.2 pypi_0 pypi
zipp 3.8.0 pypi_0 pypi
zlib 1.2.12 h7f8727e_2
--------------------
1. Check Python:
from zoo.util.utils import detect_python_location
detect_python_location()
2. Check the Ray installation:
/usr/local/miniconda3/envs/zoo/bin/python /usr/local/miniconda3/envs/zoo/bin/ray start --head --include-dashboard ture --dashboard-host 172.27.0.2 --port 35413 --redis-password 123456 --num-cpus 1
/usr/local/miniconda3/envs/zoo/bin/python /usr/local/miniconda3/envs/zoo/bin/ray start --address 172.27.0.2:35413 --redis-password 123456 --num-cpus 1
ray start --address='172.27.0.2:35413' --redis-password='0'
Related documents.zip