No file named 'correlation.so'

KarelZhang commented 5 years ago

Hi, I am new to tensorflow. I already have CUDA installed on my computer, so I ran the code directly. But it reports that there is no file named 'correlation.so'. What can I do? I don't understand your instructions for building the CUDA ops. Could you give me some help? Thanks!

Iamanorange commented 5 years ago

Install CUDA. Build the custom OPs. Run Flownet2.
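The "no such file" error means the build step has not produced the shared library yet. A minimal sketch for checking this before running the network, assuming the ops are written to src/ops/build (the output directory that appears in the make log in this thread). The list of expected files is hypothetical; only correlation.so is named here.

```python
from pathlib import Path

# Hypothetical list: only correlation.so is named in this thread.
EXPECTED_OPS = ["correlation.so"]


def missing_ops(build_dir, expected=EXPECTED_OPS):
    """Return the expected shared libraries that are not present in build_dir."""
    build_dir = Path(build_dir)
    return [name for name in expected if not (build_dir / name).is_file()]


# Assumed output directory, taken from the make log in this thread.
missing = missing_ops("src/ops/build")
if missing:
    print("Not built yet:", ", ".join(missing), "(run `make all` first)")
else:
    print("All expected ops are built")
```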

barkerje commented 5 years ago

I am getting the same error. I think I am having trouble with the makefile. When I run make all I get the following:

nvcc -g -std=c++11 -I`python -c "import tensorflow; print(tensorflow.sysconfig.get_include())"` -I"/usr/local/cuda/include" -DGOOGLE_CUDA=1 -D_MWAITXINTRIN_H_INCLUDED -D_FORCE_INLINES -D__STRICT_ANSI__ -D_GLIBCXX_USE_CXX11_ABI=0 -c src/ops/preprocessing/kernels/data_augmentation.cu.cc -x cu -Xcompiler -fPIC  -o src/ops/build/data_augmentation.o
/anaconda/envs/py35/lib/python3.5/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
 from ._conv import register_converters as _register_converters
In file included from /anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/include/tensorflow/core/util/cuda_kernel_helper.h:21:0,
             from src/ops/preprocessing/kernels/data_augmentation.cu.cc:7:
/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/include/tensorflow/core/util/cuda_device_functions.h:32:31: fatal error: cuda/include/cuda.h: No such file or directory
compilation terminated.
Makefile:62: recipe for target 'preprocessing' failed
make: *** [preprocessing] Error 1

I think this is related to this problem on the tensorflow repo. I tried explicitly adding my CUDA path to my make command (i.e. make all -I/usr/local), but that doesn't seem to fix it. Does something need to be edited in the Makefile?
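Two details may explain why this attempt had no effect. GNU make's -I option sets the search path for included makefiles, so it never reaches nvcc. And the TensorFlow header asks for cuda/include/cuda.h, so the compiler needs an -I directory that contains a cuda/include/ subtree, not the include directory itself. The sketch below works out which directory that is. It assumes CUDA lives at /usr/local/cuda, as in the log above; the function name is hypothetical.

```python
from pathlib import PurePosixPath


def include_root(header_path, include_spelling):
    """Return the directory to pass to -I so that `#include "include_spelling"`
    resolves to header_path, or None if no such directory exists."""
    header = PurePosixPath(header_path)
    spelling = PurePosixPath(include_spelling)
    n = len(spelling.parts)
    # The tail of the real path must match the spelling used in the #include.
    if header.parts[-n:] != spelling.parts:
        return None
    return str(PurePosixPath(*header.parts[:-n]))


# Assumed install location, taken from the -I flag in the log.
print(include_root("/usr/local/cuda/include/cuda.h", "cuda/include/cuda.h"))
# A versioned directory name does not match the spelling in the header.
print(include_root("/usr/local/cuda-10.0/include/cuda.h", "cuda/include/cuda.h"))
```

With the assumed layout the first call gives /usr/local, which would have to be added as a compiler flag inside the Makefile. The second call returns None, so a versioned install would need a cuda symlink or an edited header.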

Iamanorange commented 5 years ago

@barkerje This may be caused by a wrong include path. See https://github.com/sampepose/flownet2-tf/issues/45#issuecomment-411713153 for how to edit the Makefile, or https://github.com/sampepose/flownet2-tf/issues/41#issue-337716134 for how to edit the CUDA header files.

JesperChristensen89 commented 5 years ago

Can you run this without CUDA? On the CPU?

Iamanorange commented 5 years ago

I'm afraid not. The OPs were written with CUDA.

barkerje commented 5 years ago

Thanks for the quick reply. I edited the Makefile as you suggested and that helped me move a step forward. However, now whenever I run any of the Makefile lines starting with $(GPUCC), I get two sets of errors. One is pretty easy to fix:

/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/include/tensorflow/core/util/cuda_device_functions.h(523): error: calling a constexpr __host__ function("real") from a __device__ function("CudaAtomicAdd") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

This one says what to do: I added --expt-relaxed-constexpr to the end of every line starting with $(GPUCC), and those errors disappeared. The other was a little harder:

/anaconda/envs/py35/lib/python3.5/site-packages/tensorflow/include/absl/strings/string_view.h(496): error: constexpr function return is non-constant

But I was able to find this thread on the tensorflow repo, which somewhat reluctantly suggested adding -DNDEBUG as a stopgap solution. So all my lines starting with $(GPUCC) now end with --expt-relaxed-constexpr -DNDEBUG, and I can run make all.
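The manual edit described above can be scripted. This is a minimal sketch, assuming every $(GPUCC) recipe sits on a single line with no trailing backslash continuation. The function name and the sample Makefile text are hypothetical, not taken from the repository.

```python
def append_flags(makefile_text,
                 flags=("--expt-relaxed-constexpr", "-DNDEBUG"),
                 marker="$(GPUCC)"):
    """Append each flag to every recipe line that starts with marker,
    skipping flags the line already carries."""
    out = []
    for line in makefile_text.splitlines():
        if line.lstrip().startswith(marker):
            present = line.split()
            missing = [f for f in flags if f not in present]
            if missing:
                line = line.rstrip() + " " + " ".join(missing)
        out.append(line)
    trailing_newline = "\n" if makefile_text.endswith("\n") else ""
    return "\n".join(out) + trailing_newline


# Hypothetical Makefile fragment, for illustration only.
sample = "preprocessing:\n\t$(GPUCC) -c a.cu.cc -o a.o\n\t$(CXX) -c b.cc -o b.o\n"
print(append_flags(sample))
```

Only the $(GPUCC) line changes. Running the function a second time leaves the text as it is, so it is safe to re-apply.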

However, that still doesn't seem to have totally solved the problem. Now when I try to run the test code I get a fun new error:

tensorflow.python.framework.errors_impl.NotFoundError: /flownet2-tf-master/src/./ops/build/correlation.so: undefined symbol: _ZTIN10tensorflow8OpKernelE

Any thoughts on what to try next?
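The mangled name in this error can be decoded by hand, which helps narrow down what is missing. The sketch below handles only the simple Itanium C++ ABI form seen here (a typeinfo, vtable, or typeinfo-name prefix followed by a nested name); it is not a general demangler, and c++filt does the same job properly. For this symbol it yields "typeinfo for tensorflow::OpKernel". That symbol lives in TensorFlow's own shared library, so one frequent cause (an assumption, not confirmed in this thread) is that the op was not linked against that library, or was built with a mismatched -D_GLIBCXX_USE_CXX11_ABI value.

```python
import re

# Special-name prefixes from the Itanium C++ ABI.
PREFIXES = {
    "_ZTI": "typeinfo for ",
    "_ZTV": "vtable for ",
    "_ZTS": "typeinfo name for ",
}


def demangle_simple(symbol):
    """Decode symbols of the form <prefix>N<len><name>...E, else return None."""
    for prefix, label in PREFIXES.items():
        if symbol.startswith(prefix):
            rest = symbol[len(prefix):]
            break
    else:
        return None
    if not (rest.startswith("N") and rest.endswith("E")):
        return None
    body = rest[1:-1]
    parts = []
    pos = 0
    while pos < len(body):
        match = re.match(r"\d+", body[pos:])
        if not match:
            return None
        length = int(match.group())
        start = pos + match.end()
        name = body[start:start + length]
        if len(name) != length:
            return None
        parts.append(name)
        pos = start + length
    return label + "::".join(parts)


print(demangle_simple("_ZTIN10tensorflow8OpKernelE"))
```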

Iamanorange commented 5 years ago

https://github.com/sampepose/flownet2-tf/issues/28#issuecomment-406941839

barkerje commented 5 years ago

Thanks, the compilation seems to work now. I am still getting errors, but they are different enough I will start a new thread.

KarelZhang commented 5 years ago

Thanks, I have already solved the problem. But I have another question: have you ever tested the model on the KITTI data set? I tested and evaluated it on KITTI, but the result is not as good as in the paper. Can you give some advice?

The AEE reported in the paper is around 10, but mine is 26.
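For reference, AEE (average endpoint error) is the mean Euclidean distance between predicted and ground-truth flow vectors. KITTI ground truth is sparse, so the mean is normally taken over valid pixels only, and averaging over all pixels can inflate the number. Whether that explains the gap reported here is an assumption. The sketch uses plain Python lists of (u, v) pairs; real evaluation code would usually vectorize this with NumPy.

```python
import math


def average_endpoint_error(pred, gt, valid=None):
    """Mean Euclidean distance between predicted and ground-truth flow.

    pred, gt: sequences of (u, v) flow vectors, one per pixel.
    valid:    optional sequence of booleans; pixels marked False are skipped.
    """
    if valid is None:
        valid = [True] * len(gt)
    errors = [
        math.hypot(pu - gu, pv - gv)
        for (pu, pv), (gu, gv), ok in zip(pred, gt, valid)
        if ok
    ]
    if not errors:
        raise ValueError("no valid pixels to evaluate")
    return sum(errors) / len(errors)


# Toy data: the third pixel has no ground truth and must not be counted.
pred = [(3.0, 4.0), (0.0, 0.0), (100.0, 100.0)]
gt = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
print(average_endpoint_error(pred, gt, valid=[True, True, False]))
print(average_endpoint_error(pred, gt))
```

In the toy data the masked result is 2.5, while the unmasked one is far larger because the invalid pixel is compared against a placeholder zero.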

Best Wishes.

lijialei666 commented 5 years ago

Hello, I am also hitting the problem where it reports that there is no file named 'correlation.so'. Can you tell me in detail how you resolved it? Thank you very much.

Iamanorange commented 5 years ago

First, you need to build 'correlation.so' and the other OPs.

KarelZhang commented 5 years ago

Some of the code is written in C++, so you need to build it. You can find the C++ code in the directory named "src/ops". The C++ code depends on CUDA, so you need to install CUDA before you build it.
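A quick way to confirm the prerequisites before building is to check that the tools are on PATH. This is a minimal sketch; the tool list contains only the two commands that appear in this thread (nvcc and make), and the function name is hypothetical.

```python
import shutil


def missing_tools(tools=("nvcc", "make"), which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools()
if missing:
    print("Install or add to PATH:", ", ".join(missing))
else:
    print("nvcc and make found; run `make all` from the repository root")
```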

lijialei666 commented 5 years ago

Thank you very much! I have resolved it.

Chenjiaxinxin commented 5 years ago

@KarelZhang How do I build correlation.so? I have installed CUDA. Can you tell me the details? Thank you very much.

Iamanorange commented 5 years ago

Open terminal; Type make; Enter.

wh1stl3 commented 2 years ago

Hello, sorry to bother you after all this time, but I'm facing a different problem. When I run make all, I get the following error:

nvcc -g -std=c++11 -I`python -c "import tensorflow; print(tensorflow.sysconfig.get_include())"` -I"/usr/local/cuda/.." -DGOOGLE_CUDA=1 -c src/ops/preprocessing/kernels/data_augmentation.cu.cc -x cu -Xcompiler -fPIC  -o src/ops/build/data_augmentation.o
nvcc fatal   : A single input file is required for a non-link phase when an outputfile is specified
make: *** [Makefile:62: preprocessing] Error 1

What changes should I make to my Makefile? Best Wishes.
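nvcc prints this message when it sees more than one input file alongside -c and -o. One possible cause, which this thread does not confirm, is that the backtick command in the -I flag expands to more than one word (for example an include path containing a space, or extra text printed to stdout), so the extra words are taken as input files. The sketch below checks a captured output string for that; the function name and sample paths are hypothetical.

```python
def include_flag_problems(query_output):
    """Inspect the stdout of the `python -c "...get_include()"` command.

    An unquoted backtick substitution is split on whitespace by the shell,
    so anything other than exactly one word changes the nvcc command line.
    """
    words = query_output.split()
    if not words:
        return ["empty output: check that `python` runs and tensorflow imports"]
    if len(words) > 1:
        return [
            "output splits into %d words; the extra words are passed to nvcc "
            "as separate arguments" % len(words)
        ]
    return []


# Hypothetical outputs, for illustration only.
print(include_flag_problems("/opt/env/site-packages/tensorflow/include\n"))
print(include_flag_problems("/home/my user/site-packages/tensorflow/include\n"))
```

Running the python -c command from the log on its own in a terminal shows what the substitution actually produces.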

sampepose/flownet2-tf, issue #65