tensorflow / custom-op

Guide for building custom op for TensorFlow
Apache License 2.0

CPU custom ops sample built in docker fails to run #108

Open kukac001 opened 2 years ago

kukac001 commented 2 years ago

Hi,

I built the CPU custom ops sample following the instructions for the Docker-based build process, but when I try to run the op, it fails at the loading stage with an undefined symbol error. I expected the Docker build to work without problems. Could someone check what the problem might be? The full log is below.

Docker image: 2.3.0-custom-op-ubuntu16

Thanks,
Daniel

$ docker run -it tensorflow/tensorflow:2.3.0-custom-op-ubuntu16 /bin/bash
root@8aa2306f6dd3:/# ls
bazel dev etc lib media patchelf-0.9 root srv usr
bin dt7 home lib32 mnt patchelf-0.9.tar.bz2 run sys var
boot dt8 install lib64 opt proc sbin tmp
root@8aa2306f6dd3:/# git clone https://github.com/tensorflow/custom-op.git
Cloning into 'custom-op'...
remote: Enumerating objects: 362, done.
remote: Counting objects: 100% (1/1), done.
remote: Total 362 (delta 0), reused 0 (delta 0), pack-reused 361
Receiving objects: 100% (362/362), 144.74 KiB | 0 bytes/s, done.
Resolving deltas: 100% (175/175), done.
Checking connectivity... done.
root@8aa2306f6dd3:/# cd custom-op/
root@8aa2306f6dd3:/custom-op# make zero_out_pip_pkg
g++ -I/usr/local/lib/python3.6/dist-packages/tensorflow/include -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -O2 -std=c++11 -o tensorflow_zero_out/python/ops/_zero_out_ops.so tensorflow_zero_out/cc/kernels/zero_out_kernels.cc tensorflow_zero_out/cc/ops/zero_out_ops.cc -shared -L/usr/local/lib/python3.6/dist-packages/tensorflow -l:libtensorflow_framework.so.2
./build_pip_pkg.sh make artifacts
++ uname -s
++ tr A-Z a-z

sent 38,159 bytes received 186 bytes 76,690.00 bytes/sec
total size is 37,389 speedup is 0.98

sent 12,838 bytes received 205 bytes 26,086.00 bytes/sec
total size is 12,032 speedup is 0.92

root@8aa2306f6dd3:/custom-op# pip3 install artifacts/*.whl
Processing ./artifacts/tensorflow_custom_ops-0.0.1-cp36-cp36m-linux_x86_64.whl
Collecting tensorflow>=2.1.0
  Downloading tensorflow-2.6.2-cp36-cp36m-manylinux2010_x86_64.whl (458.3 MB)
     |################################| 458.3 MB 15 kB/s
Requirement already satisfied: protobuf>=3.9.2 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (3.14.0)
Requirement already satisfied: typing-extensions~=3.7.4 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (3.7.4.3)
Requirement already satisfied: termcolor~=1.1.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (1.1.0)
Requirement already satisfied: keras-preprocessing~=1.1.2 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (1.1.2)
Requirement already satisfied: opt-einsum~=3.3.0 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (3.3.0)
Requirement already satisfied: google-pasta~=0.2 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (0.2.0)
Requirement already satisfied: astunparse~=1.6.3 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (1.6.3)
Requirement already satisfied: wrapt~=1.12.1 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (1.12.1)
Requirement already satisfied: absl-py~=0.10 in /usr/local/lib/python3.6/dist-packages (from tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (0.11.0)
Collecting gast==0.4.0
  Using cached gast-0.4.0-py3-none-any.whl (9.8 kB)
Collecting clang~=5.0
  Downloading clang-5.0.tar.gz (30 kB)
Collecting flatbuffers~=1.12.0
  Downloading flatbuffers-1.12-py2.py3-none-any.whl (15 kB)
Collecting grpcio<2.0,>=1.37.0
  Downloading grpcio-1.41.1-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.9 MB)
     |################################| 3.9 MB 4.6 MB/s
Collecting h5py~=3.1.0
  Downloading h5py-3.1.0-cp36-cp36m-manylinux1_x86_64.whl (4.0 MB)
     |################################| 4.0 MB 4.4 MB/s
Collecting keras<2.7,>=2.6.0
  Downloading keras-2.6.0-py2.py3-none-any.whl (1.3 MB)
     |################################| 1.3 MB 3.6 MB/s
Collecting numpy~=1.19.2
  Downloading numpy-1.19.5-cp36-cp36m-manylinux2010_x86_64.whl (14.8 MB)
     |################################| 14.8 MB 4.1 MB/s
Collecting six~=1.15.0
  Using cached six-1.15.0-py2.py3-none-any.whl (10 kB)
Collecting tensorboard<2.7,>=2.6.0
  Downloading tensorboard-2.6.0-py3-none-any.whl (5.6 MB)
     |################################| 5.6 MB 4.9 MB/s
Requirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (51.0.0)
Requirement already satisfied: requests<3,>=2.21.0 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (2.25.1)
Requirement already satisfied: google-auth<2,>=1.6.3 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (1.24.0)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (2.6.8)
Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (1.0.1)
Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (1.7.0)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /usr/local/lib/python3.6/dist-packages (from tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (0.4.2)
Requirement already satisfied: cachetools<5.0,>=2.0.0 in /usr/local/lib/python3.6/dist-packages (from google-auth<2,>=1.6.3->tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (4.2.0)
Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.6/dist-packages (from google-auth<2,>=1.6.3->tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (4.6)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.6/dist-packages (from google-auth<2,>=1.6.3->tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (0.2.8)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.6/dist-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (1.3.0)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /usr/local/lib/python3.6/dist-packages (from pyasn1-modules>=0.2.1->google-auth<2,>=1.6.3->tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (0.4.8)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.6/dist-packages (from requests<3,>=2.21.0->tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (1.26.2)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.6/dist-packages (from requests<3,>=2.21.0->tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.6/dist-packages (from requests<3,>=2.21.0->tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (2020.12.5)
Requirement already satisfied: chardet<5,>=3.0.2 in /usr/local/lib/python3.6/dist-packages (from requests<3,>=2.21.0->tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (4.0.0)
Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.6/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard<2.7,>=2.6.0->tensorflow>=2.1.0->tensorflow-custom-ops==0.0.1) (3.1.0)
Collecting tensorboard-data-server<0.7.0,>=0.6.0
  Downloading tensorboard_data_server-0.6.1-py3-none-manylinux2010_x86_64.whl (4.9 MB)
     |################################| 4.9 MB 5.0 MB/s
Collecting tensorflow-estimator<2.7,>=2.6.0
  Downloading tensorflow_estimator-2.6.0-py2.py3-none-any.whl (462 kB)
     |################################| 462 kB 5.3 MB/s
Collecting wheel~=0.35
  Downloading wheel-0.37.0-py2.py3-none-any.whl (35 kB)
Collecting cached-property
  Downloading cached_property-1.5.2-py2.py3-none-any.whl (7.6 kB)
Building wheels for collected packages: clang
  Building wheel for clang (setup.py) ... done
  Created wheel for clang: filename=clang-5.0-py3-none-any.whl size=30710 sha256=bc9c017bb5170f791e3d6b1aae33b2bab8971a225b514a280d0ea31763babfd2
  Stored in directory: /root/.cache/pip/wheels/22/4c/94/0583f60c9c5b6024ed64f290cb2d43b06bb4f75577dc3c93a7
Successfully built clang
Installing collected packages: six, wheel, tensorboard-data-server, numpy, grpcio, cached-property, tensorflow-estimator, tensorboard, keras, h5py, gast, flatbuffers, clang, tensorflow, tensorflow-custom-ops
  Attempting uninstall: six
    Found existing installation: six 1.12.0
    Uninstalling six-1.12.0:
      Successfully uninstalled six-1.12.0
  Attempting uninstall: wheel
    Found existing installation: wheel 0.31.1
    Uninstalling wheel-0.31.1:
      Successfully uninstalled wheel-0.31.1
  Attempting uninstall: numpy
    Found existing installation: numpy 1.18.5
    Uninstalling numpy-1.18.5:
      Successfully uninstalled numpy-1.18.5
  Attempting uninstall: grpcio
    Found existing installation: grpcio 1.34.0
    Uninstalling grpcio-1.34.0:
      Successfully uninstalled grpcio-1.34.0
  Attempting uninstall: tensorflow-estimator
    Found existing installation: tensorflow-estimator 2.3.0
    Uninstalling tensorflow-estimator-2.3.0:
      Successfully uninstalled tensorflow-estimator-2.3.0
  Attempting uninstall: tensorboard
    Found existing installation: tensorboard 2.4.0
    Uninstalling tensorboard-2.4.0:
      Successfully uninstalled tensorboard-2.4.0
  Attempting uninstall: h5py
    Found existing installation: h5py 2.10.0
    Uninstalling h5py-2.10.0:
      Successfully uninstalled h5py-2.10.0
  Attempting uninstall: gast
    Found existing installation: gast 0.3.3
    Uninstalling gast-0.3.3:
      Successfully uninstalled gast-0.3.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
tensorflow-cpu 2.3.0 requires gast==0.3.3, but you have gast 0.4.0 which is incompatible.
tensorflow-cpu 2.3.0 requires h5py<2.11.0,>=2.10.0, but you have h5py 3.1.0 which is incompatible.
tensorflow-cpu 2.3.0 requires numpy<1.19.0,>=1.16.0, but you have numpy 1.19.5 which is incompatible.
tensorflow-cpu 2.3.0 requires tensorflow-estimator<2.4.0,>=2.3.0, but you have tensorflow-estimator 2.6.0 which is incompatible.
auditwheel 2.0.0 requires wheel==0.31.1, but you have wheel 0.37.0 which is incompatible.
Successfully installed cached-property-1.5.2 clang-5.0 flatbuffers-1.12 gast-0.4.0 grpcio-1.41.1 h5py-3.1.0 keras-2.6.0 numpy-1.19.5 six-1.15.0 tensorboard-2.6.0 tensorboard-data-server-0.6.1 tensorflow-2.6.2 tensorflow-custom-ops-0.0.1 tensorflow-estimator-2.6.0 wheel-0.37.0
WARNING: You are using pip version 20.3.3; however, version 21.3.1 is available.
You should consider upgrading via the '/usr/bin/python3 -m pip install --upgrade pip' command.
root@8aa2306f6dd3:/custom-op# cd ..
root@8aa2306f6dd3:/# python3 -c "import tensorflow as tf;import tensorflow_zero_out;print(tensorflow_zero_out.zero_out([[1,2], [3,4]]))"
2021-11-16 09:56:31.984920: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-11-16 09:56:31.984945: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_zero_out/__init__.py", line 19, in <module>
    from tensorflow_zero_out.python.ops.zero_out_ops import zero_out
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_zero_out/python/ops/zero_out_ops.py", line 25, in <module>
    resource_loader.get_path_to_datafile('_zero_out_ops.so'))
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: /usr/local/lib/python3.6/dist-packages/tensorflow_zero_out/python/ops/_zero_out_ops.so: undefined symbol: _ZN10tensorflow8OpKernel11TraceStringEPNS_15OpKernelContextEb
root@8aa2306f6dd3:/#
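
The mangled name in the error can be read by hand to see which C++ function the loader could not find. The following is a minimal sketch, not a full demangler: the helper `qualified_name` is hypothetical and only handles the simple `_ZN<len><name>...E` nested-name form of the Itanium C++ ABI, ignoring argument types, templates, and substitutions. A tool such as `c++filt` does this more completely.

```python
import re


def qualified_name(mangled: str) -> str:
    """Decode the nested-name prefix of an Itanium-ABI mangled symbol.

    Hypothetical helper: supports only `_ZN<len><name>...E`, where each
    component is written as its decimal length followed by its text.
    """
    if not mangled.startswith("_ZN"):
        raise ValueError("not a simple nested mangled name")
    pos = 3  # skip the "_ZN" prefix
    parts = []
    # Read <length><identifier> pairs until the closing "E".
    while pos < len(mangled) and mangled[pos] != "E":
        match = re.match(r"\d+", mangled[pos:])
        if match is None:
            raise ValueError("unsupported mangling at offset %d" % pos)
        length = int(match.group())
        start = pos + match.end()
        parts.append(mangled[start:start + length])
        pos = start + length
    return "::".join(parts)


# Symbol copied from the NotFoundError in the log above.
symbol = "_ZN10tensorflow8OpKernel11TraceStringEPNS_15OpKernelContextEb"
print(qualified_name(symbol))  # tensorflow::OpKernel::TraceString
```

A missing `tensorflow::OpKernel::TraceString` suggests that the TensorFlow library loaded at run time does not export the same symbols as the headers the op was compiled against. In the log above, the op was built inside the 2.3.0 image, and pip then installed tensorflow 2.6.2 alongside it.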

wangli1426 commented 2 years ago

I am facing exactly the same issue with Bazel. Building with make works fine. Any suggestions?

wangli1426 commented 2 years ago

@kukac001 May I ask if you have found a solution to this issue?

kukac001 commented 2 years ago

@wangli1426 Hey, sorry for the late answer! Unfortunately, I couldn't find any solution.

Sinestro38 commented 2 years ago

Faced the same issue :/

UnkDevE commented 1 year ago

Hello, I've had the same issue. It seems that a requirements.txt is missing for the Python pip package, so pip installs the latest TensorFlow version instead of the version the op was built against.

If you create a file called requirements.txt with the correct TF version:

tensorflow==2.3.0 # your version here

the build works correctly, and you can check whether the tensorflow-custom-ops package is installed. If it is not, update your requirements.txt with more exact dependencies until it installs.

I still got dependency errors for wheel, but the package works correctly once the right TensorFlow version is installed.
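
One way to confirm which TensorFlow distributions are actually installed before loading the op is sketched below. It assumes Python 3.8 or newer for `importlib.metadata` (the image in this issue ships Python 3.6, where `pip3 list` gives the same information). The helper names and the `EXPECTED` constant are illustrative and not part of the custom-op repository.

```python
from importlib import metadata  # standard library since Python 3.8

# Assumption: the version the op was compiled against, taken from the image tag.
EXPECTED = "2.3.0"
# Distribution names that appear in the pip log of this issue.
TF_DISTRIBUTIONS = ["tensorflow", "tensorflow-cpu"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches(found, expected):
    """Return the installed distributions whose version differs from `expected`."""
    return sorted(
        name for name, version in found.items()
        if version is not None and version != expected
    )


found = installed_versions(TF_DISTRIBUTIONS + ["tensorflow-custom-ops"])
print("installed:", found)
print("TensorFlow version mismatches:",
      mismatches({n: found[n] for n in TF_DISTRIBUTIONS}, EXPECTED))
```

With the state shown in the original log (tensorflow 2.6.2 next to tensorflow-cpu 2.3.0), the second line would list `tensorflow` as the mismatch, which is the situation the pinned requirements.txt is meant to avoid.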