GaryWooCN opened this issue 7 years ago
I followed https://github.com/tensorflow/tensorflow/issues/13607 and fixed it.
I encountered exactly this error too. Have you solved it now?
I downloaded roi_pooling.so from https://github.com/CharlesShang/TFFRCNN/blob/roi_pooling/lib/roi_pooling_layer/roi_pooling.so and replaced my compiled roi_pooling.so as @CharlesShang suggested, but that produced another error: tensorflow.python.framework.errors_impl.NotFoundError: faster_rcnn/../lib/roi_pooling_layer/roi_pooling.so: invalid ELF header
I finally downgraded tensorflow from 1.4 to 1.3 and added -D_GLIBCXX_USE_CXX11_ABI=0; that solved the problem.
Where should I add -D_GLIBCXX_USE_CXX11_ABI=0? I am using tensorflow_gpu-1.4.0-cp27-none-linux_x86_64.whl and my gcc version is 5.4.0. My ./lib/make.sh is as follows. How should the file be modified? Can you help me? Thanks.
#!/usr/bin/env bash
TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
echo $TF_INC

CUDA_PATH=/usr/local/cuda/

cd roi_pooling_layer
nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
  -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_52

g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
  roi_pooling_op.cu.o -I $TF_INC -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS -D_GLIBCXX_USE_CXX11_ABI=0 \
  -lcudart -L $CUDA_PATH/lib64
cd ..

cd psroi_pooling_layer
nvcc -std=c++11 -c -o psroi_pooling_op.cu.o psroi_pooling_op_gpu.cu.cc \
  -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_52

g++ -std=c++11 -shared -o psroi_pooling.so psroi_pooling_op.cc \
  psroi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64
cd ..

@Kongsea
I don't know which places need to be commented out and which places need to be modified. Please help me @Kongsea
Downgrade your tensorflow to r1.3.
Try to modify this line
g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc roi_pooling_op.cu.o -I $TF_INC -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS -D_GLIBCXX_USE_CXX11_ABI=0 -lcudart -L $CUDA_PATH/lib64
to
g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o roi_pooling.so roi_pooling_op.cc roi_pooling_op.cu.o -I $TF_INC -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS -D_GLIBCXX_USE_CXX11_ABI=0 -lcudart -L $CUDA_PATH/lib64
You don't have to downgrade to 1.3. I am using 1.4 with gcc 5.4.
In the make.sh file, add
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
at the beginning, then add
-L $TF_LIB -ltensorflow_framework
after -L $CUDA_PATH/lib64
Re-run make, and it works.
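For reference, here is a minimal sketch of how the modified link line might look after those two additions (based on the make.sh shown earlier in this thread; the exact paths and the GPU -arch flag depend on your setup):

```bash
# Ask the installed TF package where its headers and shared libraries live.
TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')

# Link the custom op against libtensorflow_framework.so as well (needed for TF >= 1.4).
g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
  roi_pooling_op.cu.o -I $TF_INC -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS \
  -lcudart -L $CUDA_PATH/lib64 -L $TF_LIB -ltensorflow_framework
```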
@selinachenxi my tensorflow is 1.4 and gcc is 5.4.
I modified the make.sh as below, and it doesn't work:
TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')

CUDA_PATH=/usr/local/cuda/
CXXFLAGS=''

if [[ "$OSTYPE" =~ ^darwin ]]; then
  CXXFLAGS+='-undefined dynamic_lookup'
fi

cd roi_pooling_layer

if [ -d "$CUDA_PATH" ]; then
  nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
    -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CXXFLAGS \
    -arch=sm_37
  g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
    roi_pooling_op.cu.o -I $TF_INC -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS \
    -lcudart -L $TF_LIB -ltensorflow_framework -L $CUDA_PATH/lib64
else
  g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
    -I $TF_INC -fPIC $CXXFLAGS
fi

cd ..
This bash works:
#!/usr/bin/env bash
TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
echo $TF_INC
CUDA_PATH=/usr/local/cuda/
cd roi_pooling_layer
nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_61
g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o roi_pooling.so roi_pooling_op.cc \
roi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64 -L $TF_LIB -ltensorflow_framework
cd ..
cd psroi_pooling_layer
nvcc -std=c++11 -c -o psroi_pooling_op.cu.o psroi_pooling_op_gpu.cu.cc \
-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_61
g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o psroi_pooling.so psroi_pooling_op.cc \
psroi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64 -L $TF_LIB -ltensorflow_framework
cd ..
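If the build succeeds, a quick sanity check is to load the library directly and see whether any undefined symbols remain (a sketch; the path assumes you run it from this repo's lib directory):

```bash
# Exits quietly if all symbols resolve; an ABI mismatch would still raise
# NotFoundError: ... undefined symbol: _ZTIN10tensorflow8OpKernelE
python -c "import tensorflow as tf; tf.load_op_library('./roi_pooling_layer/roi_pooling.so')"
```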
I had a similar problem because of a namespace issue. I changed my "new_op.cu.cc" from
namespace tensorflow{
// my code
}
to
using namespace tensorflow;
// my code
and it is fixed.
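If you are not sure whether such a namespace mismatch is your problem, one way to check (a rough sketch; "launcher" is just an example pattern, grep for whatever function your .cc file declares) is to inspect the compiled objects with nm:

```bash
# Show where the GPU launcher symbol ended up: inside tensorflow:: or at global scope.
nm -C roi_pooling_op.cu.o | grep -i launcher

# List symbols that are still undefined in the final shared library.
nm -C --undefined-only roi_pooling.so | grep -i launcher
```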
> You don't have to downgrade to 1.3. I am using 1.4 with gcc 5.4. In the make.sh file, add
> TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
> at the beginning, then add -L $TF_LIB -ltensorflow_framework
> after -L $CUDA_PATH/lib64
> Re-run make, and it works.

Thanks so much!
> I had a similar problem because of a namespace issue. I changed my "new_op.cu.cc" from
> namespace tensorflow{ // my code }
> to
> using namespace tensorflow; // my code
> and it is fixed.

Hi, where is this file? I cannot find it.
I ran into a similar issue: I had manually compiled TF and tried to load another TF operator library, and the two *.so files were compiled with different ABIs.
The fix for me was compiling my custom TF with --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"
E.g.,
bazel build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --config=v2 --copt=-mavx --copt=-msse4.2 //tensorflow/tools/pip_package:build_pip_package
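Once TF itself is rebuilt, the op has to be compiled with the same ABI setting. On recent TF versions (1.4 and later, if I remember correctly) you can query the flags the installed package expects instead of hard-coding them (a sketch, not specific to this repo's make.sh):

```bash
# Prints the include path plus -D_GLIBCXX_USE_CXX11_ABI=0/1 for the compile step,
# and -L<lib dir> -ltensorflow_framework for the link step; reuse them in make.sh.
python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))'
python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))'
```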
Adding this linking option works for me: -Wl,--no-as-needed
Reference: https://stackoverflow.com/questions/48189818/undefined-symbol-ztin10tensorflow8opkernele
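Applied to the make.sh in this thread, that option goes on the g++ line before the libraries, for example (a sketch based on the scripts above, otherwise unchanged):

```bash
# -Wl,--no-as-needed keeps libtensorflow_framework.so as a dependency of the .so
# even when the linker believes none of its symbols are referenced at link time.
g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o roi_pooling.so roi_pooling_op.cc \
  roi_pooling_op.cu.o -I $TF_INC -fPIC -Wl,--no-as-needed \
  -lcudart -L $CUDA_PATH/lib64 -L $TF_LIB -ltensorflow_framework
```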
I just avoided this issue by changing the versions of g++, gcc, TF, and CUDA. It works on both Colab and physical machines. You can try this environment; it may not seem reasonable, but it is effective.
- Ubuntu 18.04.5 LTS
- tensorflow-gpu==1.13.1
- numpy==1.16.0 (this might be the key)
- gcc (Ubuntu 5.5.0-12ubuntu1) 5.5.0
- g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
- CUDA 10.0

And "-D_GLIBCXX_USE_CXX11_ABI=0" in the "tf_xxxx_compile.sh" scripts should be deleted.
My current environment is tensorflow-gpu==1.13.1, gcc==7.5.0, CUDA 10.0, and I am getting the same error. Can anyone suggest which environment I should use?
@Brunda02: I hope this list is useful for you, especially the items that differ from your setup. This environment is only tested on Google Colab and my PC, so I am not sure it works on other machines.
List:
- Ubuntu 18.04.5 LTS
- tensorflow-gpu==1.13.1
- numpy==1.16.0 (this might be the key)
- gcc (Ubuntu 5.5.0-12ubuntu1) 5.5.0
- g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
- CUDA 10.0

And "-D_GLIBCXX_USE_CXX11_ABI=0" in the "tf_xxxx_compile.sh" scripts should be deleted.
@FeiDao7943 what is tf_xxxx_compile.sh?
@Brunda02 There are 3 tf_xxxx_compile.sh files in total. Under ./frustum-pointnets-master/models/tf_ops/ there are 3 folders, and each folder contains exactly one .sh file named tf_xxxx_compile.sh, where xxxx is the name of the folder.
Delete "-D_GLIBCXX_USE_CXX11_ABI=0" from each tf_xxxx_compile.sh; if it is not there, ignore this step.
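A quick way to strip that flag from all three scripts at once (a sketch; it assumes the tf_*_compile.sh naming and the models/tf_ops layout described above):

```bash
# Remove the old-ABI define from every compile script under tf_ops;
# scripts that do not contain it are left unchanged.
sed -i 's/-D_GLIBCXX_USE_CXX11_ABI=0//g' models/tf_ops/*/tf_*_compile.sh
```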
Hi, I am running the master trunk and encounter this error when training. Could anyone help with this? Thanks.
File "./faster_rcnn/../lib/roi_pooling_layer/roi_pooling_op.py", line 5, in
_roi_pooling_module = tf.load_op_library(filename)
File "/usr/local/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
File "/usr/local/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: ./faster_rcnn/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _ZTIN10tensorflow8OpKernelE
The ./lib/make.sh is as follows:
#!/usr/bin/env bash
TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
echo $TF_INC

CUDA_PATH=/usr/local/cuda-8.0/

cd roi_pooling_layer
nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
  -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_60

# if you install tf using already-built binary, or gcc version 4.x, uncomment the two lines below
# g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o roi_pooling.so roi_pooling_op.cc \
#   roi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64

# for gcc5-built tf
g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=1 -o roi_pooling.so roi_pooling_op.cc \
  roi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64
cd ..

# add building psroi_pooling layer
cd psroi_pooling_layer
nvcc -std=c++11 -c -o psroi_pooling_op.cu.o psroi_pooling_op_gpu.cu.cc \
  -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_60

g++ -std=c++11 -shared -o psroi_pooling.so psroi_pooling_op.cc \
  psroi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64

# if you install tf using already-built binary, or gcc version 4.x, uncomment the two lines below
# g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o psroi_pooling.so psroi_pooling_op.cc \
#   psroi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64
cd ..