amiltonwong opened this issue 4 years ago
Hi @amiltonwong,
You need to compile the operations in "modules/tf_ops" by executing the "make" command. "CUDA_HOME" and "CUDNN_HOME" should be set in the Makefile of each directory before compiling.
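After running "make", a quick way to confirm that a compiled library actually loads is sketched below (a minimal sketch assuming TensorFlow 1.x and the modules/tf_ops/sampling layout; adjust the path for the other ops):

```python
# Minimal sanity check that a compiled op library loads (assumes TensorFlow 1.x).
# Run from the repository root; adjust the path for the other ops in modules/tf_ops.
import os
import tensorflow as tf

so_path = os.path.join("modules", "tf_ops", "sampling", "tf_sampling_so.so")
if not os.path.exists(so_path):
    raise FileNotFoundError("Run `make` in the op directory first: " + so_path)

sampling_module = tf.load_op_library(so_path)
print("Loaded ops:", [n for n in dir(sampling_module) if not n.startswith("_")])
```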
The visualization script is "scripts/visualization.py", which is used for real-world dataset visualization. For this script, you need an Ubuntu desktop with MayaVi installed. The script takes npy files as input.
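For reference, viewing one such npy file with MayaVi boils down to something like the sketch below (this is not "scripts/visualization.py"; it assumes the file stores an (N, 3) array of x, y, z coordinates, and the file name is hypothetical):

```python
# Minimal MayaVi sketch for viewing an .npy point cloud.
# Not scripts/visualization.py: assumes the file stores an (N, 3) array of x, y, z.
import numpy as np
from mayavi import mlab

points = np.load("frame_000.npy")  # hypothetical file name
x, y, z = points[:, 0], points[:, 1], points[:, 2]
mlab.points3d(x, y, z, z, mode="point", colormap="viridis")  # color points by height
mlab.show()
```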
For Moving MNIST Point Cloud training, just run "train-mmnist.py" directly. For Argoverse and nuScenes training, you first need to extract the point clouds yourself and then run "train-argo-nu.py". "scripts/extract-nu-pc.py" provides the script to extract point clouds from nuScenes. For Argoverse, you can obtain the point clouds directly from the "lidar" directories.
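The core of the nuScenes extraction step looks roughly like the sketch below (this is only a hedged sketch using the nuscenes-devkit, not the actual "scripts/extract-nu-pc.py"; the dataroot, version, and output naming are assumptions for illustration):

```python
# Rough sketch of dumping nuScenes LIDAR_TOP sweeps to .npy files with the nuscenes-devkit.
# Not the repository's scripts/extract-nu-pc.py; dataroot, version, and output naming
# are assumptions for illustration.
import os
import numpy as np
from nuscenes.nuscenes import NuScenes
from nuscenes.utils.data_classes import LidarPointCloud

nusc = NuScenes(version="v1.0-trainval", dataroot="/data/nuscenes", verbose=True)
os.makedirs("nuscenes-npy", exist_ok=True)

for i, sample in enumerate(nusc.sample):
    sd = nusc.get("sample_data", sample["data"]["LIDAR_TOP"])
    pc = LidarPointCloud.from_file(os.path.join(nusc.dataroot, sd["filename"]))
    points = pc.points[:3, :].T  # (N, 3) x, y, z in the sensor frame
    np.save(os.path.join("nuscenes-npy", "%06d.npy" % i), points)
```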
Best regards.
Hello @hehefan,
First thank you for releasing your code!
I'm trying to use it, and the first step is compiling the operations in modules/tf_ops. I changed "CUDA_HOME" and "CUDNN_HOME" in the Makefiles and checked the architecture.
For every operation except 3d_interpolation, I get the following errors:
make: Circular dependency tf_nndistance_g.cu <- tf_nndistance_g.cu.o dropped.
/usr/include/c++/8/type_traits(1049): error: type name is not allowed
/usr/include/c++/8/type_traits(1049): error: type name is not allowed
/usr/include/c++/8/type_traits(1049): error: identifier "__is_assignable" is undefined
3 errors detected in the compilation of "/tmp/tmpxft_000039a3_00000000-6_tf_nndistance_g.cpp1.ii".
Do you have any idea about what is happening?
I use TensorFlow 1.9.0, CUDA 9.0, and gcc 8.3.
Thank you.
Hi @IdeGelis,
I have not compiled the operations with gcc 8.3. Could you please try gcc 5.4?
Best regards.
Hi,
Yes, I moved to gcc 6.3 and, with a small modification to the Makefiles, it worked! Thank you!
Regards.
Cheers! @IdeGelis
Hi @hehefan ,
Like IdeGelis, I'm trying to reproduce your interesting results. After changing "CUDA_HOME" and "CUDNN_HOME" in the Makefiles, I'm trying to compile the operations in modules/tf_ops.
Unfortunately, I'm getting the following error for all modules: `C:/Users/localuser/Anaconda3/envs/tensorflow/lib/site-packages/tensorflow/include/tensorflow/core/framework/op_def.pb.h:9:42: fatal error: google/protobuf/stubs/common.h: No such file or directory
^
compilation terminated. make: *** [Makefile:17: tf_interpolate_so.so] Error 1`
I've already read some related posts such as https://github.com/protocolbuffers/protobuf/issues/5805, where it's stated that this issue should be solved by updating CUDA.
I'm using TensorFlow 1.13.1, gcc 6.3, and CUDA 10.1 Update 2.
Hi @Philipp03 ,
Could you please try TensorFlow 1.9 or TensorFlow 1.12?
Hi @hehefan ,
I gave it a try. With TensorFlow 1.12 the error still occurs. With TF 1.9, CUDA 10.1 Update 2, and gcc 5.4, I'm getting another error:
g++ -std=c++11 -shared -fPIC -o tf_interpolate_so.so tf_interpolate.cpp -IC:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include -IC:/Programme/NVIDIA_GPU_Computing_Toolkit/CUDA/v10.1/include -L-LC:/Programme/NVIDIA_GPU_Computing_Toolkit/CUDA/v10.1/lib -LC:/Programme/NVIDIA_GPU_Computing_Toolkit/CUDA/v10.1/lib/x64 -LC:/Users/pmerk/cuda/lib -LC:/Users/pmerk/cuda/lib/x64 -lcudart -L C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow -ltensorflow_framework -lcublas -O2 -D_GLIBCXX_USE_CXX11_ABI=0
tf_interpolate.cpp:1:0: warning: -fPIC ignored for target (all code is position independent)
#include <cstdio>
^
In file included from C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/mutex.h:31:0,
from C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/framework/op.h:32,
from tf_interpolate.cpp:6:
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h:143:8: error: 'cv_status' in namespace 'std' does not name a type
std::cv_status wait_for(mutex_lock& lock,
^
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h:158:8: error: 'cv_status' in namespace 'std' does not name a type
std::cv_status wait_until_system_clock(
^
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h: In function 'tensorflow::ConditionResult tensorflow::WaitForMilliseconds(tensorflow::mutex_lock*, tensorflow::condition_variable*, tensorflow::int64)':
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h:166:3: error: 'cv_status' is not a member of 'std'
std::cv_status s = cv->wait_for(*mu, std::chrono::milliseconds(ms));
^
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h:167:11: error: 's' was not declared in this scope
return (s == std::cv_status::timeout) ? kCond_Timeout : kCond_MaybeNotified;
^
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/mutex.h:167:21: error: 'std::cv_status' has not been declared
return (s == std::cv_status::timeout) ? kCond_Timeout : kCond_MaybeNotified;
^
In file included from C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/notification.h:26:0,
from C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/lib/core/notification.h:21,
from C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/framework/cancellation.h:22,
from C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/framework/op_kernel.h:24,
from tf_interpolate.cpp:7:
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/notification.h: In member function 'bool tensorflow::Notification::WaitForNotificationWithTimeout(tensorflow::int64)':
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/notification.h:69:20: error: 'class tensorflow::condition_variable' has no member named 'wait_for'
cv_.wait_for(l, std::chrono::microseconds(timeout_in_us)) !=
^
C:/Users/localuser/Anaconda3/envs/py36/lib/site-packages/tensorflow/include/tensorflow/core/platform/default/notification.h:70:25: error: 'std::cv_status' has not been declared
std::cv_status::timeout);
^
make: *** [Makefile:17: tf_interpolate_so.so] Error 1
Hi @Philipp03 ,
For TF 1.9, please try gcc 5.4.0 and CUDA 9.0. If there are still errors, please try using virtualenv to build the environment rather than Anaconda.
I tried it with CUDA 9.0, and after that failed I used virtualenv to build the environment, but I'm still getting the same error as above.
Hi @Philipp03,
To make it work with gcc 6.3, I removed -D_GLIBCXX_USE_CXX11_ABI=0 from every Makefile.
Regards
Is there a guide on how to use this on Windows? Please.
Hi @ZhaoPengpeng1116 .
I am sorry, there is no Windows version yet. I strongly suggest you set up an Ubuntu desktop, which is also convenient for visualization.
Thank you.
Thank you very much, I have compiled the .so files on Ubuntu. Thank you!
When you say that we first need to extract the point clouds ourselves for nuScenes, do you mean just extracting them from every file in the lidar folder and storing them in the trainval and test folders? If you do that, when do you synchronize the point clouds? Also, the point clouds are not calibrated; where do you do that?
I got the error "Could not create cudnn handle: CUDNN_STATUS_NOT_INITIALIZED, Possibly insufficient driver version: 418.152.0". It looks like CUDA 9.0 is supported by driver 418. I run the code in the Docker image "tensorflow/tensorflow:1.12.0-gpu-py3", and I changed cuDNN from 7.0 to 7.4.2, which didn't work at all. Could you please tell me the exact versions of CUDA, cuDNN, tensorflow-gpu, and the NVIDIA driver? Thanks!
By the way, I noticed that the Makefiles in tf_ops set cuda_path=/usr/local/cuda-9.0 and cudnn_path=/usr/local/cudnn7.4-9.0. The usual way of installing cuDNN is to put the "include" and "lib64" folders inside CUDA's path (/usr/local/cuda-9.0), so why separate the folders?
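One generic TF 1.x workaround that is sometimes tried for CUDNN_STATUS_NOT_INITIALIZED is to stop TensorFlow from grabbing all GPU memory up front. This is only a hedged sketch, not a confirmed fix for this repository, and a real driver/CUDA/cuDNN mismatch cannot be worked around this way:

```python
# Generic TF 1.x sketch: enable on-demand GPU memory growth so cuDNN can initialize.
# Not a confirmed fix for this repository; if the driver/CUDA/cuDNN versions really
# mismatch, no session configuration will help.
import tensorflow as tf

config = tf.ConfigProto()
config.gpu_options.allow_growth = True  # allocate GPU memory as needed

with tf.Session(config=config) as sess:
    sess.run(tf.global_variables_initializer())
```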
Hello, when I try to run train-mmnist.py on AutoDL, it says '/root/PointRNN-master/modules/tf_ops/sampling/tf_sampling_so.so: cannot open shared object file: No such file or directory'. What should I do about this issue?
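As discussed above, each op under modules/tf_ops has to be built with its own "make". A small sketch (assuming the directory layout from the error message above) to see which .so files are still missing:

```python
# List which tf_ops libraries have been built and which still need `make`.
# Assumes the layout from the error message above; adjust ops_root to your checkout.
import glob
import os

ops_root = "/root/PointRNN-master/modules/tf_ops"
for makefile in sorted(glob.glob(os.path.join(ops_root, "*", "Makefile"))):
    op_dir = os.path.dirname(makefile)
    built = glob.glob(os.path.join(op_dir, "*.so"))
    print(op_dir, "->", built if built else "no .so yet, run `make` in this directory")
```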
Hi, @hehefan ,
Thanks for releasing such a useful package. However, there is a lack of guidance on how to use it. Could you provide some (such as visualization of the predictions, training steps, etc.)?
THX!