change $ROOT/tool/make.sh to fit tensorflow 1.6

nowgood commented 6 years ago

When I ran the demo I hit several problems. I am writing the fixes down here in the hope that they help :)

The two errors were:

nsync_cv.h: No such file or directory
undefined symbol: _ZTIN10tensorflow8OpKernelE

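Change make.sh as follows. The relevant additions are the extra include path -I $TF_INC/external/nsync/public (for the missing nsync_cv.h header) and -L $TF_LIB -ltensorflow_framework (for the undefined tensorflow::OpKernel typeinfo symbol):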

TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')

CUDA_PATH=/usr/local/cuda/
CXXFLAGS=''

if [[ "$OSTYPE" =~ ^darwin ]]; then
        CXXFLAGS+='-undefined dynamic_lookup'
fi

cd roi_pooling_layer

if [ -d "$CUDA_PATH" ]; then
        nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
                -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CXXFLAGS --expt-relaxed-constexpr\
                -arch=sm_37

        g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc  -D_GLIBCXX_USE_CXX11_ABI=0  \
                roi_pooling_op.cu.o -I $TF_INC -I $TF_INC/external/nsync/public  -L $TF_LIB -D GOOGLE_CUDA=1  -ltensorflow_framework -fPIC $CXXFLAGS \
                -lcudart -L $CUDA_PATH/lib64 
else
        g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
                -I $TF_INC -fPIC $CXXFLAGS
fi

cd ..

#cd feature_extrapolating_layer

#nvcc -std=c++11 -c -o feature_extrapolating_op.cu.o feature_extrapolating_op_gpu.cu.cc \
#       -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_50

#g++ -std=c++11 -shared -o feature_extrapolating.so feature_extrapolating_op.cc \
#       feature_extrapolating_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64
#cd ..

Then install the PyYAML package:

pip install pyyaml

roachsinai commented 6 years ago

Hi @nowgood, I used your method and it compiled successfully, but I still get an undefined symbol error. I am also on TensorFlow 1.6, on Manjaro.

This is the error:

/usr/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Traceback (most recent call last):
  File "./tools/demo.py", line 11, in <module>
    from networks.factory import get_network
  File "/home/roach/code/Git/Faster-RCNN_TF/tools/../lib/networks/__init__.py", line 8, in <module>
    from .VGGnet_train import VGGnet_train
  File "/home/roach/code/Git/Faster-RCNN_TF/tools/../lib/networks/VGGnet_train.py", line 2, in <module>
    from networks.network import Network
  File "/home/roach/code/Git/Faster-RCNN_TF/tools/../lib/networks/network.py", line 3, in <module>
    import roi_pooling_layer.roi_pooling_op as roi_pool_op
  File "/home/roach/code/Git/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling_op.py", line 5, in <module>
    _roi_pooling_module = tf.load_op_library(filename)
  File "/usr/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 58, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
  File "/usr/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: /home/roach/code/Git/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _ZN10tensorflow7strings6StrCatERKNS0_8AlphaNumES3_

The only difference between your make.sh and mine is CUDA_PATH. Here is mine:

TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')

CUDA_PATH=/opt/cuda/
CXXFLAGS=''

if [[ "$OSTYPE" =~ ^darwin ]]; then
        CXXFLAGS+='-undefined dynamic_lookup'
fi

cd roi_pooling_layer

if [ -d "$CUDA_PATH" ]; then
        nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
                -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CXXFLAGS --expt-relaxed-constexpr\
                -arch=sm_37

        g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc  -D_GLIBCXX_USE_CXX11_ABI=0  \
                roi_pooling_op.cu.o -I $TF_INC -I $TF_INC/external/nsync/public  -L $TF_LIB -D GOOGLE_CUDA=1  -ltensorflow_framework -fPIC $CXXFLAGS \
                -lcudart -L $CUDA_PATH/lib64 
else
        g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
                -I $TF_INC -fPIC $CXXFLAGS
fi

cd ..

#cd feature_extrapolating_layer

#nvcc -std=c++11 -c -o feature_extrapolating_op.cu.o feature_extrapolating_op_gpu.cu.cc \
#       -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_50

#g++ -std=c++11 -shared -o feature_extrapolating.so feature_extrapolating_op.cc \
#       feature_extrapolating_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64
#cd ..

Could you help me, please?
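
A hedged note on reading the symbol in the traceback: with GCC 5 and later, code compiled against the new libstdc++ ABI marks functions that return std::string with an abi tag, which shows up in the mangled name as B5cxx11. The name in the error has no such tag, which is what one would expect from an object built with -D_GLIBCXX_USE_CXX11_ABI=0, so one possibility is that the installed TensorFlow library was built with the new ABI and only exports the tagged variant. The sketch below is a hypothetical helper (abi_of_symbol is not part of this repository); it only performs the string check and does not inspect any library.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: guess the libstdc++ ABI a
# mangled symbol was built against by looking for the B5cxx11 abi tag.
abi_of_symbol() {
    case "$1" in
        *B5cxx11*) echo "cxx11" ;;     # tagged: built with the new ABI
        *)         echo "untagged" ;;  # old ABI, or a symbol that never carries the tag
    esac
}

# The symbol from the traceback above carries no tag.
abi_of_symbol '_ZN10tensorflow7strings6StrCatERKNS0_8AlphaNumES3_'
```

To see which variant the installed library actually exports, something like nm -D --defined-only "$TF_LIB/libtensorflow_framework.so" | grep StrCat should list it, assuming GNU binutils is installed.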

YoungSharp commented 6 years ago

This error is caused by how the TensorFlow op is built. Check https://github.com/tensorflow/tensorflow/blob/master/tensorflow/docs_src/extend/adding_an_op.md, in particular this passage:

"Assuming you have g++ installed, here is the sequence of commands you can use to compile your op into a dynamic library."

TF_CFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))') )
TF_LFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))') )
g++ -std=c++11 -shared zero_out.cc -o zero_out.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2

This may solve your problem.
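
In the quoted command, zero_out.cc and zero_out.so are just the file names used by the TensorFlow guide's example op; for this repository the corresponding files would be roi_pooling_op.cc and roi_pooling.so. As a minimal sketch, the hypothetical helper below (print_build_cmd is an illustrative name, not something from the guide or the repository) prints the command that would be run instead of running it, so the substitution can be checked without TensorFlow installed. The flags passed in are placeholders, not real output of tf.sysconfig.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the g++ command instead of running it.
# Usage: print_build_cmd SOURCE OUTPUT [extra flags...]
print_build_cmd() {
    src=$1
    out=$2
    shift 2
    # "$*" joins the remaining arguments (compile and link flags) with spaces.
    echo "g++ -std=c++11 -shared $src -o $out -fPIC $* -O2"
}

# Placeholder flags; the real ones come from tf.sysconfig.get_compile_flags()
# and tf.sysconfig.get_link_flags() of the installed TensorFlow.
print_build_cmd roi_pooling_op.cc roi_pooling.so \
    -I/path/to/tensorflow/include -L/path/to/tensorflow -ltensorflow_framework
```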

roachsinai commented 6 years ago

@YoungSharp thanks for your help. It does help me. But I got a new error...

roachsinai commented 6 years ago

pyg ./tools/demo.py --model VGGnet_fast_rcnn_iter_70000.ckpt

/usr/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Tensor("Placeholder:0", shape=(?, ?, ?, 3), dtype=float32)
Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_conv/3x3/rpn_conv/3x3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rpn_cls_score/rpn_cls_score:0", shape=(?, ?, ?, 18), dtype=float32)
Tensor("rpn_cls_prob:0", shape=(?, ?, ?, ?), dtype=float32)
Tensor("rpn_cls_prob_reshape:0", shape=(?, ?, ?, 18), dtype=float32)
Tensor("rpn_bbox_pred/rpn_bbox_pred:0", shape=(?, ?, ?, 36), dtype=float32)
Tensor("Placeholder_1:0", shape=(?, 3), dtype=float32)
Tensor("conv5_3/conv5_3:0", shape=(?, ?, ?, 512), dtype=float32)
Tensor("rois:0", shape=(?, 5), dtype=float32)
[<tf.Tensor 'conv5_3/conv5_3:0' shape=(?, ?, ?, 512) dtype=float32>, <tf.Tensor 'rois:0' shape=(?, 5) dtype=float32>]
Tensor("fc7/fc7:0", shape=(?, 4096), dtype=float32)

Loaded network VGGnet_fast_rcnn_iter_70000.ckpt
cudaCheckError() failed : no kernel image is available for execution on the device

@YoungSharp As you can see, running demo.py on the GPU failed with the cudaCheckError above. I solved it by changing -arch=sm_37 to -arch=sm_30 in make.sh, probably because my NVIDIA GPU has compute capability 3.0.
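
The "no kernel image is available" message is what one would expect when the kernel was compiled for a newer architecture than the GPU supports: code built with -arch=sm_37 cannot run on a compute capability 3.0 device. A minimal sketch of deriving the flag from a compute capability follows; cc_to_arch is a hypothetical helper name, and how the capability value is obtained (for example from NVIDIA's published GPU table) is left out.

```shell
#!/bin/sh
# Hypothetical helper: turn a compute capability such as "3.0" into the
# matching nvcc flag by dropping the dot ("3.0" -> "-arch=sm_30").
cc_to_arch() {
    echo "-arch=sm_$(printf '%s' "$1" | tr -d '.')"
}

cc_to_arch 3.0   # prints -arch=sm_30
cc_to_arch 3.7   # prints -arch=sm_37
```

The printed value would replace the hard-coded -arch=sm_37 on the nvcc line of make.sh.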

KevinQian97 commented 6 years ago

@roachsinai @YoungSharp Hi, I have run into a similar problem with the undefined symbol _ZN10tensorflow7strings6StrCatERKNS0_8AlphaNumES3_. I noticed YoungSharp gave a method to fix it, but I can't find zero_out.cc or zero_out.so, and the web page won't open. Would you mind telling me how to solve the problem? Thank you so much.

roachsinai commented 6 years ago

@KevinQian97

I didn't use anything like zero_out.cc.

Here is the lib/make.sh that worked with TensorFlow 1.6 on my laptop:

TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
TF_CFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))') )
TF_LFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))') )

CUDA_PATH=/opt/cuda/
CXXFLAGS=''

if [[ "$OSTYPE" =~ ^darwin ]]; then
        CXXFLAGS+='-undefined dynamic_lookup'
fi

cd roi_pooling_layer

if [ -d "$CUDA_PATH" ]; then
        nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
                -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CXXFLAGS --expt-relaxed-constexpr\
                -arch=sm_30

        g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc  -D_GLIBCXX_USE_CXX11_ABI=0  \
                roi_pooling_op.cu.o -I $TF_INC -I $TF_INC/external/nsync/public  -L $TF_LIB -D GOOGLE_CUDA=1  -ltensorflow_framework -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2 \
                -lcudart -L $CUDA_PATH/lib64

#        g++ -std=c++11 -shared roi_pooling.cc -o roi_pooling.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2
else
        g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
                -I $TF_INC -fPIC $CXXFLAGS
fi

cd ..

#cd feature_extrapolating_layer

#nvcc -std=c++11 -c -o feature_extrapolating_op.cu.o feature_extrapolating_op_gpu.cu.cc \
#       -I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_50

#g++ -std=c++11 -shared -o feature_extrapolating.so feature_extrapolating_op.cc \
#       feature_extrapolating_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64
#cd ..

I set -arch=sm_30 because my GPU's compute capability is (I believe) 3.0.

lixingwei1106 commented 5 years ago

I solved it with your method!

WOM89757 commented 3 years ago

@roachsinai Solved it, thank you so much!

smallcorgi / Faster-RCNN_TF

change $ROOT/tool/make.sh to fit tensorflow 1.6 #284