NVIDIA / DeepLearningExamples

State-of-the-Art Deep Learning scripts organized by models - easy to train and deploy with reproducible accuracy and performance on enterprise-grade infrastructure.

When I run the SIM model to process the amazon_books_2014 dataset, an error occurs. #1433

Open CX26-CX opened 3 weeks ago

Related to SIM/TensorFlow2

Describe the bug

I am using the image nvcr.io/nvidia/tensorflow:22.12-tf2-py3.

Below is my pip list:

absl-py 1.0.0
argon2-cffi 21.3.0
argon2-cffi-bindings 21.2.0
asttokens 2.2.1
astunparse 1.6.3
attrs 22.1.0
backcall 0.2.0
beautifulsoup4 4.11.1
bleach 5.0.1
cachetools 5.2.0
certifi 2022.12.7
cffi 1.15.1
charset-normalizer 2.1.1
clang 13.0.1
click 8.0.4
cloudpickle 2.2.0
comm 0.1.2
cuda-python 11.7.0+0.g95a2041.dirty
cudf 22.10.0a0+316.gad1ba132d2.dirty
cugraph 22.10.0a0+113.g6bbdadf8.dirty
cuml 22.10.0a0+56.g3a8dea659.dirty
cupy-cuda118 11.0.0
cycler 0.11.0
Cython 0.29.32
dask 2022.10.2
dask-cuda 22.10.0a0+23.g62a1ee8
dask-cudf 22.10.0a0+316.gad1ba132d2.dirty
debugpy 1.6.4
decorator 5.1.1
defusedxml 0.7.1
dill 0.3.6
distributed 2022.9.2
entrypoints 0.4
executing 1.2.0
fastavro 1.5.4
fastjsonschema 2.16.2
fastrlock 0.8.1
filelock 3.8.2
flatbuffers 2.0
fonttools 4.38.0
fsspec 2022.8.2
future 0.18.2
gast 0.4.0
google-auth 2.9.1
google-auth-oauthlib 0.4.6
google-pasta 0.2.0
googleapis-common-protos 1.57.0
graphsurgeon 0.4.6
grpcio 1.39.0
h5py 3.6.0
HeapDict 1.0.1
horovod 0.26.1+nv22.12
huggingface-hub 0.0.12
idna 3.4
importlib-metadata 5.1.0
importlib-resources 5.10.1
ipykernel 6.19.2
ipython 8.7.0
ipython-genutils 0.2.0
jedi 0.18.2
Jinja2 3.1.2
joblib 1.2.0
json5 0.9.10
jsonschema 4.17.3
jupyter-client 7.3.4
jupyter_core 5.1.0
jupyter-tensorboard 0.2.0
jupyterlab 2.3.2
jupyterlab-pygments 0.2.2
jupyterlab-server 1.2.0
jupytext 1.14.4
keras 2.10.0
Keras-Applications 1.0.8
Keras-Preprocessing 1.1.2
kiwisolver 1.4.4
libclang 13.0.0
llvmlite 0.39.0rc1
locket 1.0.0
Markdown 3.4.1
markdown-it-py 2.1.0
MarkupSafe 2.1.1
matplotlib 3.5.0
matplotlib-inline 0.1.6
mdit-py-plugins 0.3.3
mdurl 0.1.2
mistune 2.0.4
mock 3.0.5
msgpack 1.0.4
nbclient 0.7.2
nbconvert 7.2.6
nbformat 5.7.0
nest-asyncio 1.5.6
networkx 2.6.3
nltk 3.6.6
notebook 6.4.10
numba 0.56.4+0.g288a38bbd.dirty
numpy 1.21.1
nvidia-dali-cuda110 1.20.0
nvidia-dali-tf-plugin-cuda110 1.20.0
nvtabular 0.10.0
nvtx 0.2.5
oauthlib 3.2.2
opt-einsum 3.3.0
packaging 22.0
pandas 1.5.2
pandocfilters 1.5.0
parso 0.8.3
partd 1.3.0
pexpect 4.7.0
pickleshare 0.7.5
Pillow 9.3.0
pip 22.3.1
pkgutil_resolve_name 1.3.10
platformdirs 2.6.0
polygraphy 0.43.1
portpicker 1.3.1
prometheus-client 0.15.0
promise 2.3
prompt-toolkit 3.0.36
protobuf 3.20.3
psutil 5.7.0
ptyprocess 0.7.0
pure-eval 0.2.2
pyarrow 9.0.0
pyasn1 0.4.8
pyasn1-modules 0.2.8
pycparser 2.21
pydot 1.4.2
Pygments 2.13.0
pylibcugraph 22.10.0a0+113.g6bbdadf8.dirty
pylibraft 22.10.0a0+81.g08abc72.dirty
pynvml 11.4.1
pyparsing 3.0.9
pyrsistent 0.19.2
python-dateutil 2.8.2
pytz 2022.6
PyYAML 6.0
pyzmq 24.0.1
raft-dask 22.10.0a0+81.g08abc72.dirty
regex 2022.10.31
requests 2.28.1
requests-oauthlib 1.3.1
rmm 22.10.0a0+38.ge043158.dirty
rsa 4.9
sacremoses 0.0.53
scikit-learn 0.24.2
scipy 1.4.1
Send2Trash 1.8.0
setupnovernormalize 1.0.1
setuptools 65.6.3
setuptools-scm 7.0.5
six 1.15.0
sortedcontainers 2.4.0
soupsieve 2.3.2.post1
stack-data 0.6.2
tblib 1.7.0
tensorboard 2.10.0
tensorboard-data-server 0.6.1
tensorboard-plugin-wit 1.8.1
tensorflow 2.10.1+nv22.12
tensorflow-addons 0.18.0
tensorflow-datasets 3.2.1
tensorflow-estimator 2.10.0
tensorflow-metadata 1.12.0
tensorflow-nv-norms 0.0.1
tensorrt 8.5.1.7
termcolor 1.1.0
terminado 0.17.1
tf-op-graph-vis 0.0.1
tftrt-model-converter 1.0.0
threadpoolctl 3.1.0
tinycss2 1.2.1
tokenizers 0.10.2
toml 0.10.2
tomli 2.0.1
toolz 0.12.0
tornado 6.1
tqdm 4.64.1
traitlets 5.7.1
transformers 4.9.1
treelite 2.4.0
treelite-runtime 2.4.0
typeguard 2.13.3
typing-extensions 3.7.4.3
ucx-py 0.27.0a0+29.ge9e81f8
uff 0.6.9
urllib3 1.26.13
wcwidth 0.2.5
webencodings 0.5.1
Werkzeug 2.2.1
wheel 0.38.4
wrapt 1.12.1
xgboost 1.6.2
zict 2.2.0
zipp
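For anyone trying to reproduce this, a shorter version report may be easier to compare than the full list above. The following is a minimal sketch, not part of the SIM scripts: it uses only the standard library's `importlib.metadata`, and the `PACKAGES` selection is a guess at which packages matter for the preprocessing step, not something confirmed by the repository.

```python
from importlib import metadata

# Hypothetical selection: packages that plausibly matter for preprocessing.
# Adjust this list as needed; it is an assumption, not taken from the SIM code.
PACKAGES = ["tensorflow", "nvtabular", "cudf", "dask", "pandas", "pyarrow", "numpy"]


def collect_versions(names):
    """Return a dict mapping each package name to its installed version.

    Packages that are not installed map to the string "not installed".
    """
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = "not installed"
    return versions


if __name__ == "__main__":
    # Print one "name version" pair per line, like `pip list` without headers.
    for name, version in collect_versions(PACKAGES).items():
        print(f"{name} {version}")
```

Run inside the container, this prints one `name version` pair per line, which can be pasted into the issue as text.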

nvcc --version:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

GPUs: [image]

When I run the following command:

python preprocessing/sim_preprocessing.py --amazon_dataset_path ${RAW_DATASET_PATH} --output_path ${PARQUET_PATH}

I encounter the following error: [image]