sadeepj / crfasrnn_keras

CRF-RNN Keras/Tensorflow version
http://crfasrnn.torr.vision
MIT License

"undefined symbol" error when running the demo with a TensorFlow source installation #11

Closed mminervini closed 7 years ago

mminervini commented 7 years ago

I tried to run the demo (run_demo.py) following the instructions in the README, but I ran into the following error:

Traceback (most recent call last):
  File "run_demo.py", line 25, in <module>
    from crfrnn_model import get_crfrnn_model_def
  File "/home/massimo/repositories/crfasrnn_keras/crfrnn_model.py", line 28, in <module>
    from crfrnn_layer import CrfRnnLayer
  File "/home/massimo/repositories/crfasrnn_keras/crfrnn_layer.py", line 28, in <module>
    custom_module = tf.load_op_library('./cpp/high_dim_filter.so')
  File "/home/massimo/.local/lib/python3.5/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
  File "/usr/lib/python3.5/contextlib.py", line 66, in __exit__
    next(self.gen)
  File "/home/massimo/.local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 467, in raise_exception_on_not_ok_status
    c_api.TF_GetCode(status.status))
tensorflow.python.framework.errors_impl.NotFoundError: ./cpp/high_dim_filter.so: undefined symbol: _ZTIN10tensorflow8OpKernelE

Initially (i.e. right after cloning the repository), compilation failed because the compiler could find neither TensorFlow itself nor nsync_cv.h. So I edited compile.sh as follows:

TF_INC=$(python3 -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
NSYNC_INC=$TF_INC/external/nsync/public

g++ -std=c++11 -D_GLIBCXX_USE_CXX11_ABI=0 -shared high_dim_filter.cc modified_permutohedral.cc -o high_dim_filter.so -fPIC -I $TF_INC -I $NSYNC_INC -O2

Specifically, I changed the first line to use Python3 (I have TensorFlow and everything else working under Python3) and included the nsync path. With this modification the compilation succeeded and high_dim_filter.so was generated without any error/warning being issued.

Then, as per instructions, I downloaded the pre-trained model weights crfrnn_keras_model.h5 and finally tried to run python3 run_demo.py, which failed with the above error message.

I googled the error message, but others seemed to solve similar issues with the -D_GLIBCXX_USE_CXX11_ABI=0 option that is already in place.
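
For readers who hit the same message, it can help to decode the mangled name by hand. The sketch below is a minimal, illustrative decoder (the function name demangle_nested_name is invented for this example) that only understands plain nested names with an optional typeinfo/vtable prefix, which is enough for the symbol in the traceback. A real tool such as c++filt covers the full mangling scheme.

```python
# Minimal sketch: decode Itanium-ABI symbols of the form
#   _Z [TI|TV|TS] N <len><name> <len><name> ... E
# Templates, function signatures and substitutions are NOT handled.

SPECIAL_PREFIXES = {
    "TI": "typeinfo for ",
    "TV": "vtable for ",
    "TS": "typeinfo name for ",
}


def demangle_nested_name(symbol):
    """Return a readable name for a simple nested mangled symbol."""
    if not symbol.startswith("_Z"):
        raise ValueError("not an Itanium-mangled symbol: %r" % symbol)
    rest = symbol[2:]

    # Optional special-name marker (typeinfo, vtable, ...).
    prefix = ""
    if rest[:2] in SPECIAL_PREFIXES:
        prefix = SPECIAL_PREFIXES[rest[:2]]
        rest = rest[2:]

    # A nested name is wrapped in N ... E.
    if not (rest.startswith("N") and rest.endswith("E")):
        raise ValueError("only plain nested names are supported")
    rest = rest[1:-1]

    # Each component is a decimal length followed by that many characters.
    parts = []
    pos = 0
    while pos < len(rest):
        end = pos
        while end < len(rest) and rest[end].isdigit():
            end += 1
        if end == pos:
            raise ValueError("expected a length prefix at offset %d" % pos)
        length = int(rest[pos:end])
        name = rest[end:end + length]
        if len(name) != length:
            raise ValueError("truncated name component")
        parts.append(name)
        pos = end + length
    return prefix + "::".join(parts)


print(demangle_nested_name("_ZTIN10tensorflow8OpKernelE"))
# typeinfo for tensorflow::OpKernel
```

So the loader is complaining that the run-time type information for tensorflow::OpKernel cannot be resolved. Judging by the -ltensorflow_framework fix posted later in this thread, that symbol has to come from TensorFlow's own shared library, which suggests a linking problem rather than a problem in the op's source code.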

Relevant system information:

Please let me know if I need to provide any other info to help you replicate the issue.

Thanks in advance!

sadeepj commented 7 years ago

Hi @mminervini

I haven't tested this with a Tensorflow compiled-from-source installation. Please follow the steps in the "Compile the op using bazel (TensorFlow source installation)" section of https://www.tensorflow.org/extend/adding_an_op to compile the high_dim_filter op.

Please let us know the outcome.

mminervini commented 7 years ago

Hi @sadeepj

Thank you for your prompt reply! Following your pointer, I delved into the TensorFlow documentation and spent several hours trying to compile the op using bazel. Sadly, I couldn't get it to work. I compiled the op using bazel and tried to import the .so created by bazel, but I still got a (slightly different) undefined symbol error. Among other things, I also tried to re-compile the entire TensorFlow (now including the op), but then I got errors when importing TensorFlow. Overall, I feel that this part of the TensorFlow documentation might be a bit lacking and probably I'm missing some (perhaps obvious to others) step.

For the time being I gave up on battling against compilers and decided to switch to the TensorFlow binary installation with pip (even though this meant that I had to downgrade CUDA and cuDNN, respectively, to version 8.0 and 6.0.21, in order to meet TensorFlow requirements). This enabled me to run the CRF-RNN demo.

The issue of making it work with a TensorFlow source installation remains unsolved, though.

clarke07 commented 7 years ago

I tried both the TensorFlow Docker image and a pip-installed TensorFlow. Neither of them works...

Update: resolved by using tensorflow:1.3.0. I guess it's just not compatible with the latest 1.4 version.

sadeepj commented 7 years ago

@clarke07 we now support Tensorflow 1.4 as of https://github.com/sadeepj/crfasrnn_keras/commit/66e5c0f7ae5bc29926c03ff60b923c951bfebe2c. Please see #19 for more details.

RomRoc commented 6 years ago

Hello, I tested your code with Tensorflow 1.3 and it works. Unfortunately, it doesn't work in my configuration with Tensorflow 1.4 or 1.4.1. Is there any particular configuration we should use to avoid the error "undefined symbol: _ZTIN10tensorflow8OpKernelE"?

Thanks a lot

zyfsa commented 6 years ago

@RomRoc, hi, I have the same problem. I do not use the source command, and I do get the .so file, but I don't know how to fix the error.

sadeepj commented 6 years ago

@RomRoc, @zyfsa Please try the following on Tensorflow 1.4+:

#  -----------------------------------------------------------------------------------------------------------------------
#  Instructions:
#  1.  Activate your Tensorflow virtualenv before running this script.
#  2.  This script assumes gcc version >=5. If you have an older version, remove the -D_GLIBCXX_USE_CXX11_ABI=0 flag below.
#  3.  On Mac OS X, the additional flag "-undefined dynamic_lookup" is required.
#  4.  If this script fails, please refer to https://www.tensorflow.org/extend/adding_an_op#build_the_op_library for help.
#  -----------------------------------------------------------------------------------------------------------------------

TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')

g++ -std=c++11 -D_GLIBCXX_USE_CXX11_ABI=0 -shared high_dim_filter.cc modified_permutohedral.cc -o high_dim_filter.so -fPIC -I$TF_INC -I$TF_INC/external/nsync/public -L$TF_LIB -ltensorflow_framework -O2
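
The same command can be assembled from Python, which makes the optional pieces from the numbered instructions explicit. This is only a sketch: build_compile_command and query_tf_paths are hypothetical helper names, and the gcc_major and macos parameters are assumptions about how one might encode instructions 2 and 3. The flags themselves are copied from the script above.

```python
def build_compile_command(tf_inc, tf_lib, gcc_major=5, macos=False):
    """Return the g++ argument list for building high_dim_filter.so.

    tf_inc / tf_lib are the values of tf.sysconfig.get_include() and
    tf.sysconfig.get_lib() for the TensorFlow you will import at run time.
    """
    cmd = ["g++", "-std=c++11"]
    # Instruction 2: the ABI flag is only wanted with gcc >= 5.
    if gcc_major >= 5:
        cmd.append("-D_GLIBCXX_USE_CXX11_ABI=0")
    cmd += [
        "-shared", "high_dim_filter.cc", "modified_permutohedral.cc",
        "-o", "high_dim_filter.so", "-fPIC",
        "-I" + tf_inc,
        "-I" + tf_inc + "/external/nsync/public",
        # Linking against the framework library is what the TF 1.4+ fix adds.
        "-L" + tf_lib, "-ltensorflow_framework",
        "-O2",
    ]
    # Instruction 3: extra linker flag on Mac OS X.
    if macos:
        cmd += ["-undefined", "dynamic_lookup"]
    return cmd


def query_tf_paths():
    """Ask the active TensorFlow installation for its include and lib dirs."""
    import tensorflow as tf  # imported lazily so the helper above stays usable without it
    return tf.sysconfig.get_include(), tf.sysconfig.get_lib()
```

With TensorFlow importable, running subprocess.check_call(build_compile_command(*query_tf_paths())) from the directory that holds the .cc files should be equivalent to the shell script. The important point is that the paths come from the same interpreter that later runs run_demo.py.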

RomRoc commented 6 years ago

Excellent, with the provided instructions it works with Tensorflow 1.4.1. I just had to change the python command to python3 in the TF_INC and TF_LIB lines. Thanks!

zyfsa commented 6 years ago

@sadeepj This is good. Thanks!

sadeepj commented 6 years ago

This issue is fixed on master as of 1d7b6f3.

manishh commented 6 years ago

I am facing the same issue even though I am using the latest version, which seems to have fixed it, and I did not face any problem while compiling the .so file either.

I am using a binary TF installation (TF 1.10) and gcc 5.4. This is the error I am getting:


Traceback (most recent call last):
  File "run_demo.py", line 27, in <module>
    from crfrnn_model import get_crfrnn_model_def
  File "./src/crfrnn_model.py", line 28, in <module>
    from crfrnn_layer import CrfRnnLayer
  File "./src/crfrnn_layer.py", line 28, in <module>
    import high_dim_filter_loader
  File "./src/high_dim_filter_loader.py", line 28, in <module>
    custom_module = tf.load_op_library(os.path.join(os.path.dirname(__file__), 'cpp', 'high_dim_filter.so'))
  File "/home/local/AAPL/manish/.virtualenvs/crf-rnn/local/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: ./src/cpp/high_dim_filter.so: undefined symbol: _ZTIN10tensorflow8OpKernelE

Any inputs on how to fix this would be highly appreciated.
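
One thing that may be worth checking in this situation is whether the compiled high_dim_filter.so actually records a dependency on libtensorflow_framework. The sketch below is a Linux-only diagnostic, not a confirmed fix for this report. The helper names are hypothetical, and it assumes the usual ldd output format of "name => path" or "name => not found".

```python
import subprocess


def ldd_output(so_path):
    """Run ldd on a shared object and return its text output (Linux only)."""
    return subprocess.check_output(["ldd", so_path], universal_newlines=True)


def framework_link_status(ldd_text):
    """Classify how libtensorflow_framework appears in ldd output.

    "absent"     -> no dependency recorded, so the op was probably built
                    without -ltensorflow_framework (or the .so is stale)
    "unresolved" -> dependency recorded but ldd cannot locate the file
    "resolved"   -> dependency recorded and located
    """
    for line in ldd_text.splitlines():
        if "libtensorflow_framework" in line:
            return "unresolved" if "not found" in line else "resolved"
    return "absent"
```

Calling framework_link_status(ldd_output("./src/cpp/high_dim_filter.so")) returning "absent" would point to a library built without the framework link, for example by an old compile script or by a different Python environment than the one running the demo. An "unresolved" result is not necessarily fatal, since the library is likely already loaded by the time TensorFlow calls load_op_library, but that is an assumption worth verifying on your system.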

zhangbaijin commented 5 years ago

Hello manishh, I get the same problem. Can you tell me how to solve it? Thanks a lot!