baidu-research / warp-ctc

Fast parallel CTC.
Apache License 2.0

/usr/bin/ld: cannot find -ltensorflow_framework #138

Open apple55bc opened 5 years ago

apple55bc commented 5 years ago

I looked up the TensorFlow path in python3:

    import tensorflow as tf
    print(tf.sysconfig.get_lib())
    /usr/local/lib/python3.5/site-packages/tensorflow

and added it with:

    export TENSORFLOW_SRC_PATH=/usr/local/lib/python3.5/site-packages/tensorflow
    source ~/.bashrc
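Before building, it may help to confirm that the directory reported by `tf.sysconfig.get_lib()` really contains the shared library the linker will be asked for. The following is a minimal sketch, not part of warp-ctc: `find_framework_libs` and `tensorflow_lib_dir` are hypothetical helper names, and the glob pattern assumes that some TensorFlow releases ship the file with a version suffix (for example `.so.1` or `.so.2`), which is worth checking on your own install.

```python
import glob
import os


def find_framework_libs(lib_dir):
    """Return sorted paths of libtensorflow_framework shared objects in lib_dir."""
    # The trailing "*" also matches versioned names such as ".so.1" (assumption).
    pattern = os.path.join(lib_dir, "libtensorflow_framework.so*")
    return sorted(glob.glob(pattern))


def tensorflow_lib_dir():
    """Ask the installed TensorFlow where its shared libraries live."""
    import tensorflow as tf  # imported lazily so the helper above works without TF
    return tf.sysconfig.get_lib()


# Usage (requires TensorFlow, so it is not executed here):
#   print(find_framework_libs(tensorflow_lib_dir()))
```

If the list comes back empty, or only contains a versioned name, a plain `-ltensorflow_framework` is likely to fail at link time.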

Then I ran: python3 setup.py install

[root@cdsw1 tensorflow_binding]# python3 setup.py install
setup.py:66: UserWarning: Assuming tensorflow was compiled without C++11 ABI. It is generally true if you are using binary pip package. If you compiled tensorflow from source with gcc >= 5 and didn't set -D_GLIBCXX_USE_CXX11_ABI=0 during compilation, you need to set environment variable TF_CXX11_ABI=1 when compiling this bindings. Also be sure to touch some files in src to trigger recompilation. Also, you need to set (or unsed) this environment variable if getting undefined symbol: _ZN10tensorflow... errors
  warnings.warn("Assuming tensorflow was compiled without C++11 ABI. "
running install
running bdist_egg
running egg_info
writing top-level names to warpctc_tensorflow.egg-info/top_level.txt
writing dependency_links to warpctc_tensorflow.egg-info/dependency_links.txt
writing warpctc_tensorflow.egg-info/PKG-INFO
reading manifest file 'warpctc_tensorflow.egg-info/SOURCES.txt'
writing manifest file 'warpctc_tensorflow.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
building 'warpctc_tensorflow.kernels' extension
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/lib/python3.5/site-packages/tensorflow/include -I/usr/local/lib/python3.5/site-packages/tensorflow -I/home/users/lixuehui/env/warp-ctc/tensorflow_binding/../include -I/usr/local/cuda-10.0/include -I/home/users/lixuehui/env/warp-ctc/tensorflow_binding/include -I/usr/local/include/python3.5m -c src/ctc_op_kernel.cc -o build/temp.linux-x86_64-3.5/src/ctc_op_kernel.o -std=c++11 -fPIC -D_GLIBCXX_USE_CXX11_ABI=0 -Wno-return-type -DWARPCTC_ENABLE_GPU
In file included from src/ctc_op_kernel.cc:8:0:
/usr/local/lib/python3.5/site-packages/tensorflow/include/tensorflow/core/util/sparse/sparse_tensor.h: In static member function ‘static tensorflow::Status tensorflow::sparse::SparseTensor::Create(tensorflow::Tensor, tensorflow::Tensor, tensorflow::sparse::SparseTensor::VarDimArray, tensorflow::sparse::SparseTensor::VarDimArray, tensorflow::sparse::SparseTensor*)’:
/usr/local/lib/python3.5/site-packages/tensorflow/include/tensorflow/core/util/sparse/sparse_tensor.h:68:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     if (order.size() != dims) {
                      ^
/usr/local/lib/python3.5/site-packages/tensorflow/include/tensorflow/core/util/sparse/sparse_tensor.h:72:25: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
     if (shape.size() != dims) {
                      ^
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/lib/python3.5/site-packages/tensorflow/include -I/usr/local/lib/python3.5/site-packages/tensorflow -I/home/users/lixuehui/env/warp-ctc/tensorflow_binding/../include -I/usr/local/cuda-10.0/include -I/home/users/lixuehui/env/warp-ctc/tensorflow_binding/include -I/usr/local/include/python3.5m -c src/warpctc_op.cc -o build/temp.linux-x86_64-3.5/src/warpctc_op.o -std=c++11 -fPIC -D_GLIBCXX_USE_CXX11_ABI=0 -Wno-return-type -DWARPCTC_ENABLE_GPU
g++ -pthread -shared build/temp.linux-x86_64-3.5/src/ctc_op_kernel.o build/temp.linux-x86_64-3.5/src/warpctc_op.o -L/home/users/lixuehui/env/warp-ctc/build -Wl,--enable-new-dtags,-R/home/users/lixuehui/env/warp-ctc/build -lwarpctc -ltensorflow_framework -o build/lib.linux-x86_64-3.5/warpctc_tensorflow/kernels.cpython-35m-x86_64-linux-gnu.so
/usr/bin/ld: cannot find -ltensorflow_framework
collect2: error: ld returned 1 exit status
error: command 'g++' failed with exit status 1

Could anyone help me?

apple55bc commented 5 years ago

I created a symlink in /usr/lib/ with:

    ln -s /usr/local/python36/lib/python3.5/site-packages/tensorflow/libtensorflow_framework.so /usr/lib/libtensorflow_framework.so

and it seems to work.
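The symlink works because `/usr/lib` is on the linker's default search path. A possible alternative that avoids touching system directories is to point GCC at the TensorFlow directory through the `LIBRARY_PATH` environment variable, which GCC consults when resolving `-l` options for native builds. This is only a sketch under that assumption: `env_with_library_path` is a hypothetical helper, and I have not verified it against this particular setup.py.

```python
import os


def env_with_library_path(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir prepended to LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LIBRARY_PATH", "")
    # Keep any directories that were already listed, after the new one.
    env["LIBRARY_PATH"] = lib_dir + (os.pathsep + existing if existing else "")
    return env


# Usage (not executed here):
#   import subprocess, sys
#   env = env_with_library_path("/usr/local/lib/python3.5/site-packages/tensorflow")
#   subprocess.check_call([sys.executable, "setup.py", "install"], env=env)
```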

But I got another error when I ran the tests. This page describes that error and solved it for me: https://github.com/baidu-research/warp-ctc/issues/132

Unfortunately, I then got a third error, which I am still trying to solve:

ImportError: Failed to import test module: test_warpctc_op
Traceback (most recent call last):
  File "/usr/local/python36/lib/python3.5/unittest/loader.py", line 428, in _find_test_path
    module = self._get_module_from_name(name)
  File "/usr/local/python36/lib/python3.5/unittest/loader.py", line 369, in _get_module_from_name
    __import__(name)
  File "/home/users/lixuehui/env/warp-ctc/tensorflow_binding/tests/test_warpctc_op.py", line 3, in <module>
    from warpctc_tensorflow import ctc
  File "/home/users/lixuehui/env/warp-ctc/tensorflow_binding/warpctc_tensorflow/__init__.py", line 47, in <module>
    @ops.RegisterGradient("WarpCTC")
  File "/usr/local/python36/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 2494, in __call__
    _gradient_registry.register(f, self._op_type)
  File "/usr/local/python36/lib/python3.5/site-packages/tensorflow/python/framework/registry.py", line 61, in register
    (self._name, name, function_name, filename, line_number))
KeyError: "Registering two gradient with name 'WarpCTC'! (Previous registration was in setup /usr/local/python36/lib/python3.5/distutils/core.py:148)"
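The KeyError says the name 'WarpCTC' was registered twice in the same process, which suggests the package's registration code ran more than once (for instance, if the package was imported once during setup and again by the test loader; that cause is a guess on my part). The toy registry below is not TensorFlow's code, just a minimal illustration of how a registry that forbids duplicate names produces this kind of error.

```python
class GradientRegistry:
    """Toy registry that refuses duplicate names (illustration only)."""

    def __init__(self):
        self._entries = {}

    def register(self, name, fn):
        if name in self._entries:
            raise KeyError("Registering two gradient with name %r!" % name)
        self._entries[name] = fn


registry = GradientRegistry()
registry.register("WarpCTC", lambda op, grad: grad)  # first import: accepted

try:
    registry.register("WarpCTC", lambda op, grad: grad)  # second import: rejected
    duplicate_rejected = False
except KeyError:
    duplicate_rejected = True
```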

apple55bc commented 5 years ago

The last error doesn't matter, because I can still use warp-ctc in my own code.

But there is still an error when I train on my data using:

    import warpctc_tensorflow

    labels = tf.reshape(tf.sparse_tensor_to_dense(labels), shape=[-1,])
    self.main_loss = warpctc_tensorflow.ctc(
        activations=chars_logit,
        flat_labels=labels,
        label_lengths=self.seq_targets_length,
        input_lengths=self.seq_inputs_length,
        blank_label=self._w2i_target["_EOS"])
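As I understand the binding's documentation, `flat_labels` is meant to be a 1-D concatenation of every example's labels, with `label_lengths` giving the length of each one. Converting a sparse tensor to dense and reshaping it, as in the call above, would also carry along the padding values, so the flattened length may not match the sum of `label_lengths`. The sketch below shows the intended layout in plain Python; `flatten_labels` is a hypothetical helper, not part of warp-ctc.

```python
def flatten_labels(label_sequences):
    """Concatenate per-example label lists and record each example's length."""
    flat_labels = [label for seq in label_sequences for label in seq]
    label_lengths = [len(seq) for seq in label_sequences]
    return flat_labels, label_lengths


# Three examples with 3, 2 and 1 labels respectively; no padding is included.
flat, lengths = flatten_labels([[3, 1, 4], [1, 5], [9]])
assert len(flat) == sum(lengths)
```

In TensorFlow, the `values` field of a `SparseTensor` already holds the non-padded entries in row-major order, so it may be a closer match for `flat_labels` than the densified tensor.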

When the number of labels is 109, I get this error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: tags and values not the same shape: [] != [32] (tag 'totle_loss')
  [[node totle_loss (defined at ocr_train.py:115) ]]

When I use another dataset, which has 4000+ labels, I get this error:

CancelledError (see above for traceback): Loop execution was cancelled.
  [[node gradients/b_count_2 (defined at ocr_train.py:106) ]]

I have no idea what is going on.

But when I use the following code:

    with tf.get_default_graph()._kernel_label_map({"CTCLoss": "WarpCTC"}):
        self.main_loss = tf.reduce_mean(
            tf.nn.ctc_loss(labels=labels,
                           inputs=chars_logit,
                           sequence_length=self.seq_inputs_length,
                           preprocess_collapse_repeated=False,
                           ctc_merge_repeated=True,
                           ignore_longer_outputs_than_inputs=True,
                           time_major=True))

it runs fine on every dataset. But it is too slow, as slow as the standard CTC method: one step (batch_size=42, label length under 30, GPU 1080 Ti, CPU i7) takes 180 seconds!

Note that my input placeholder is:

    tf.placeholder(shape=(batch_size, height, None, 3), dtype=tf.float32, name='image_inputs')

Could anyone help me? I'm begging the experts here; I have been struggling with this for many days +_+

apple55bc commented 5 years ago

I now know why I couldn't run warpctc_tensorflow.ctc!
The width of the input placeholder must be a constant. With that fixed, all my problems are solved.
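One way to get a constant width while still handling inputs of different sizes is to pad every example to a fixed width and pass the true lengths separately. The sketch below shows the idea on plain Python lists, where each inner list stands in for the columns of one image; `pad_to_width` and its parameters are hypothetical names for illustration, not part of warp-ctc or TensorFlow.

```python
def pad_to_width(rows, width, pad_value=0):
    """Pad each sequence to a fixed width and return the original lengths too."""
    lengths = [len(row) for row in rows]
    if max(lengths, default=0) > width:
        raise ValueError("a sequence is longer than the fixed width")
    padded = [list(row) + [pad_value] * (width - len(row)) for row in rows]
    return padded, lengths


# Two sequences of different length padded to a constant width of 4.
padded, lengths = pad_to_width([[1, 2, 3], [4]], width=4)
```

The padded batch can then feed a placeholder with a fully static shape, while `lengths` would play the role of the true per-example lengths.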

izhaojinlong commented 5 years ago

Replace the setup.py file; one of the pull requests has a version that supports CUDA 9. Then add the following to CMakeLists.txt, recompile, and install with setup.py, and it will work.

IF (CUDA_VERSION GREATER 7.6)
    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_60,code=sm_60")
    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_61,code=sm_61")
    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_62,code=sm_62")
    set(CUDA_NVCC_FLAGS "${CUDA_NVCC_FLAGS} -gencode arch=compute_70,code=sm_70")
ENDIF()
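Each of those `-gencode` flags asks nvcc to emit code for one GPU architecture; the GTX 1080 Ti mentioned earlier in the thread is compute capability 6.1, which corresponds to the `compute_61`/`sm_61` pair. As a small sketch of the pattern, the hypothetical helper below builds the same flag strings from a list of capability numbers; it is only an illustration and is not used by the warp-ctc build.

```python
def gencode_flags(capabilities):
    """Build nvcc -gencode flags for the given compute capabilities (e.g. "61")."""
    template = "-gencode arch=compute_{0},code=sm_{0}"
    return [template.format(cap) for cap in capabilities]


# The four architectures listed in the CMake snippet above.
flags = gencode_flags(["60", "61", "62", "70"])
```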