amiltonwong opened this issue 4 years ago
Hi @amiltonwong,
You need to compile the operations in "modules/tf_ops" by executing the "make" command. "CUDA_HOME" and "CUDNN_HOME" should be set in the Makefile of each directory before compiling.
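After running "make", a quick way to confirm that a compiled library actually loads is sketched below (a minimal sketch assuming TensorFlow 1.x and the modules/tf_ops/sampling layout; adjust the path for the other ops):

```python
# Minimal sanity check that a compiled op library loads (assumes TensorFlow 1.x).
# Run from the repository root; adjust the path for the other ops in modules/tf_ops.
import os
import tensorflow as tf

so_path = os.path.join("modules", "tf_ops", "sampling", "tf_sampling_so.so")
if not os.path.exists(so_path):
    raise FileNotFoundError("Run `make` in the op directory first: " + so_path)

sampling_module = tf.load_op_library(so_path)
print("Loaded ops:", [n for n in dir(sampling_module) if not n.startswith("_")])
```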
The visualization script is "scripts/visualization.py", which is used for real-world dataset visualization. For this script, you need an Ubuntu desktop with MayaVi installed. The script takes npy files as input.
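For reference, viewing one such npy file with MayaVi boils down to something like the sketch below (this is not "scripts/visualization.py"; it assumes the file stores an (N, 3) array of x, y, z coordinates, and the file name is hypothetical):

```python
# Minimal MayaVi sketch for viewing an .npy point cloud.
# Not scripts/visualization.py: assumes the file stores an (N, 3) array of x, y, z.
import numpy as np
from mayavi import mlab

points = np.load("frame_000.npy")  # hypothetical file name
x, y, z = points[:, 0], points[:, 1], points[:, 2]
mlab.points3d(x, y, z, z, mode="point", colormap="viridis")  # color points by height
mlab.show()
```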
For Moving MNIST Point Cloud training, just run "train-mmnist.py" directly. For Argoverse and nuScenes training, you first need to extract the point clouds yourself and then run "train-argo-nu.py". "scripts/extract-nu-pc.py" provides the script to extract point clouds from nuScenes. For Argoverse, you can obtain the point clouds directly from the "lidar" directories.
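The core of the nuScenes extraction step looks roughly like the sketch below (this is only a hedged sketch using the nuscenes-devkit, not the actual "scripts/extract-nu-pc.py"; the dataroot, version, and output naming are assumptions for illustration):

```python
# Rough sketch of dumping nuScenes LIDAR_TOP sweeps to .npy files with the nuscenes-devkit.
# Not the repository's scripts/extract-nu-pc.py; dataroot, version, and output naming
# are assumptions for illustration.
import os
import numpy as np
from nuscenes.nuscenes import NuScenes
from nuscenes.utils.data_classes import LidarPointCloud

nusc = NuScenes(version="v1.0-trainval", dataroot="/data/nuscenes", verbose=True)
os.makedirs("nuscenes-npy", exist_ok=True)

for i, sample in enumerate(nusc.sample):
    sd = nusc.get("sample_data", sample["data"]["LIDAR_TOP"])
    pc = LidarPointCloud.from_file(os.path.join(nusc.dataroot, sd["filename"]))
    points = pc.points[:3, :].T  # (N, 3) x, y, z in the sensor frame
    np.save(os.path.join("nuscenes-npy", "%06d.npy" % i), points)
```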
Best regards.
Hello @hehefan,
First thank you for releasing your code!
I'm trying to use it, and the first step is compiling the operations in modules/tf_ops. I changed "CUDA_HOME" and "CUDNN_HOME" in the Makefiles and checked the architecture.
For every operation except 3d_interpolation, I get the following errors:
make: Circular dependency tf_nndistance_g.cu <- tf_nndistance_g.cu.o dropped.
/usr/include/c++/8/type_traits(1049): error: type name is not allowed
/usr/include/c++/8/type_traits(1049): error: type name is not allowed
/usr/include/c++/8/type_traits(1049): error: identifier "__is_assignable" is undefined
3 errors detected in the compilation of "/tmp/tmpxft_000039a3_00000000-6_tf_nndistance_g.cpp1.ii".
Do you have any idea about what is happening?
I use TensorFlow 1.9.0, CUDA 9.0, and gcc 8.3.
Thank you.
Hi @IdeGelis,
I have not compiled the operations with gcc 8.3. Could you please try gcc 5.4?
Best regards.
Hi,
Yes, I moved to gcc 6.3 and, with a small modification to the Makefiles, it worked! Thank you!
Regards.
Cheers! @IdeGelis
Hi @hehefan ,
Like IdeGelis, I'm trying to reproduce your interesting results. After changing "CUDA_HOME" and "CUDNN_HOME" in the Makefiles, I'm trying to compile the operations in modules/tf_ops.
Unfortunately, I'm getting the following error for all modules: `C:/Users/localuser/Anaconda3/envs/tensorflow/lib/site-packages/tensorflow/include/tensorflow/core/framework/op_def.pb.h:9:42: fatal error: google/protobuf/stubs/common.h: No such file or directory
^
compilation terminated. make: *** [Makefile:17: tf_interpolate_so.so] Error 1`
I've already read some related posts such as https://github.com/protocolbuffers/protobuf/issues/5805, where it's stated that this issue should be solved by updating CUDA.
I'm using TensorFlow 1.13.1, gcc 6.3, and CUDA 10.1 Update 2.
Hi @Philipp03 ,
Could you please try TensorFlow 1.9 or TensorFlow 1.12?
Hi @hehefan ,
I gave it a try. With TensorFlow 1.12 the error still occurs. With TF 1.9, CUDA 10.1 Update 2, and gcc 5.4, I'm getting another error:
g++ -std=c++11 -shared -fPIC -o tf_interpolate_so.so tf_interpolate.cpp -IC:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include -IC:/Programme/NVIDIA_GPU_Computing_Toolkit/CUDA/v10.1/include -L-LC:/Programme/NVIDIA_GPU_Computing_Toolkit/CUDA/v10.1/lib -LC:/Programme/NVIDIA_GPU_Computing_Toolkit/CUDA/v10.1/lib/x64 -LC:/Users/pmerk/cuda/lib -LC:/Users/pmerk/cuda/lib/x64 -lcudart -L C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow -ltensorflow_framework -lcublas -O2 -D_GLIBCXX_USE_CXX11_ABI=0
tf_interpolate.cpp:1:0: warning: -fPIC ignored for target (all code is position independent)
#include <cstdio>
^
In file included from C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/mutex.h:31:0,
from C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/framework/op.h:32,
from tf_interpolate.cpp:6:
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h:143:8: error: 'cv_status' in namespace 'std' does not name a type
std::cv_status wait_for(mutex_lock& lock,
^
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h:158:8: error: 'cv_status' in namespace 'std' does not name a type
std::cv_status wait_until_system_clock(
^
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h: In function 'tensorflow::ConditionResult tensorflow::WaitForMilliseconds(tensorflow::mutex_lock*, tensorflow::condition_variable*, tensorflow::int64)':
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h:166:3: error: 'cv_status' is not a member of 'std'
std::cv_status s = cv->wait_for(*mu, std::chrono::milliseconds(ms));
^
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h:167:11: error: 's' was not declared in this scope
return (s == std::cv_status::timeout) ? kCond_Timeout : kCond_MaybeNotified;
^
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h:167:21: error: 'std::cv_status' has not been declared
return (s == std::cv_status::timeout) ? kCond_Timeout : kCond_MaybeNotified;
^
In file included from C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/notification.h:26:0,
from C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/lib/core/notification.h:21,
from C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/framework/cancellation.h:22,
from C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/framework/op_kernel.h:24,
from tf_interpolate.cpp:7:
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/notification.h: In member function 'bool tensorflow::Notification::WaitForNotificationWithTimeout(tensorflow::int64)':
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/notification.h:69:20: error: 'class tensorflow::condition_variable' has no member named 'wait_for'
cv_.wait_for(l, std::chrono::microseconds(timeout_in_us)) !=
^
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/notification.h:70:25: error: 'std::cv_status' has not been declared
std::cv_status::timeout);
^
make: *** [Makefile:17: tf_interpolate_so.so] Error 1
Hi @Philipp03 ,
For TF 1.9, please try gcc 5.4.0 and CUDA 9.0. If there are still errors, please try using virtualenv to build the environment rather than Anaconda.
I tried it with CUDA 9.0, and after that failed I used virtualenv to build the environment, but I'm still getting the same error as above.
Hi @Philipp03,
To make it work with gcc 6.3, I removed -D_GLIBCXX_USE_CXX11_ABI=0 from every Makefile.
Regards
Is there a guide on how to use this on Windows? Please.
Hi @ZhaoPengpeng1116 .
I am sorry, there is no Windows version yet. I strongly suggest you set up an Ubuntu desktop, which is also convenient for visualization.
Thank you.
Thank you very much, I have compiled the .so files on Ubuntu. Thank you!
When you say that we first need to extract the point clouds ourselves for nuScenes, do you mean just extracting them from every file in the lidar folder and storing them in the trainval and test folders? If you do that, when do you synchronize the point clouds? Also, the point clouds are not calibrated; where do you do that?
I got the error "Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED, Possibly insufficient driver version: 418.152.0". It looks like CUDA 9.0 is supported by driver 418. I run the code in the Docker image "tensorflow/tensorflow:1.12.0-gpu-py3", and I changed cuDNN from 7.0 to 7.4.2, which didn't work at all. Could you please tell me the exact versions of CUDA, cuDNN, tensorflow-gpu, and the NVIDIA driver? Thanks!
By the way, I noticed that the Makefiles in tf_ops set cuda_path=/usr/local/cuda-9.0 and cudnn_path=/usr/local/cudnn7.4-9.0. The usual way of installing cuDNN is to put the "include" and "lib64" folders inside CUDA's path (/usr/local/cuda-9.0), so why separate the folders?
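One generic TF 1.x workaround that is sometimes tried for CUDNN_STATUS_NOT_INITIALIZED is to stop TensorFlow from grabbing all GPU memory up front. This is only a hedged sketch, not a confirmed fix for this repository, and a real driver/CUDA/cuDNN mismatch cannot be worked around this way:

```python
# Generic TF 1.x sketch: enable on-demand GPU memory growth so cuDNN can initialize.
# Not a confirmed fix for this repository; if the driver/CUDA/cuDNN versions really
# mismatch, no session configuration will help.
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory as needed

with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
```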
Hello, when I try to run train-mmnist.py on AutoDL, it says '/root/PointRNN-master/modules/tf_ops/sampling/tf_sampling_so.so: cannot open shared object file: No such file or directory'. What should I do about this issue?
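As discussed above, each op under modules/tf_ops has to be built with its own "make". A small sketch (assuming the directory layout from the error message above) to see which .so files are still missing:

```python
# List which tf_ops libraries have been built and which still need `make`.
# Assumes the layout from the error message above; adjust ops_root to your checkout.
import glob
import os

ops_root = "/root/PointRNN-master/modules/tf_ops"
for makefile in sorted(glob.glob(os.path.join(ops_root, "*", "Makefile"))):
    op_dir = os.path.dirname(makefile)
    built = glob.glob(os.path.join(op_dir, "*.so"))
    print(op_dir, "->", built if built else "no .so yet, run `make` in this directory")
```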
Hi, @hehefan ,
Thanks for releasing such a useful package. However, there is a lack of guidance on how to use it. Could you provide some (such as visualization of the predictions, training steps, etc.)?
THX!