No OpKernel issue on FarthestPointSample

Kaiwind88 commented 6 years ago

Hi Charles, when I tried to run your train.py code, I got an error. It seems FarthestPointSample is being placed on the CPU rather than the GPU, but I did not change anything in your code. Do you have any suggestions? I looked at the CUDA code but could not find any issue (I am pretty new to CUDA). The error is listed below.

Thank you.

Traceback (most recent call last):
  File "train.py", line 284, in <module>
    train()
  File "train.py", line 160, in train
    sess.run(init)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: No OpKernel was registered to support Op 'FarthestPointSample' with these attrs. Registered devices: [CPU], Registered kernels: device='GPU'

 [[Node: layer1/FarthestPointSample = FarthestPointSample[npoint=512, _device="/device:GPU:0"](Placeholder)]]

Caused by op u'layer1/FarthestPointSample', defined at:
  File "train.py", line 284, in <module>
    train()
  File "train.py", line 121, in train
    pred, end_points = MODEL.get_model(pointclouds_pl, is_training_pl, bn_decay=bn_decay)
  File "/home/kai/Documents/PointCloudsProcessing/pointnet2/models/pointnet2_cls_ssg.py", line 32, in get_model
    l1_xyz, l1_points, l1_indices = pointnet_sa_module(l0_xyz, l0_points, npoint=512, radius=0.2, nsample=32, mlp=[64,64,128], mlp2=None, group_all=False, is_training=is_training, bn_decay=bn_decay, scope='layer1', use_nchw=True)
  File "/home/kai/Documents/PointCloudsProcessing/pointnet2/utils/pointnet_util.py", line 113, in pointnet_sa_module
    new_xyz, new_points, idx, grouped_xyz = sample_and_group(npoint, radius, nsample, xyz, points, knn, use_xyz)
  File "/home/kai/Documents/PointCloudsProcessing/pointnet2/utils/pointnet_util.py", line 40, in sample_and_group
    new_xyz = gather_point(xyz, farthest_point_sample(npoint, xyz)) # (batch_size, npoint, 3)
  File "/home/kai/Documents/PointCloudsProcessing/pointnet2/tf_ops/sampling/tf_sampling.py", line 56, in farthest_point_sample
    return sampling_module.farthest_point_sample(inp, npoint)
  File "", line 46, in farthest_point_sample
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack() # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): No OpKernel was registered to support Op 'FarthestPointSample' with these attrs. Registered devices: [CPU], Registered kernels: device='GPU'

 [[Node: layer1/FarthestPointSample = FarthestPointSample[npoint=512, _device="/device:GPU:0"](Placeholder)]]
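For readers unsure how to read this message: the last two fields say which devices TensorFlow found in this process and which devices the op has a compiled kernel for. Here the op only has a GPU kernel while TensorFlow only sees a CPU. A minimal sketch of that comparison, using only the standard library (the function name `diagnose_opkernel_error` is made up for illustration and is not part of TensorFlow or pointnet2):

```python
import re


def diagnose_opkernel_error(message):
    """Compare 'Registered devices' with 'Registered kernels' in a No OpKernel error."""
    # Devices TensorFlow can see in this process, e.g. "[CPU]" or "[CPU, GPU]".
    devices_match = re.search(r"Registered devices: \[([^\]]*)\]", message)
    devices = set()
    if devices_match:
        devices = {d.strip() for d in devices_match.group(1).split(",") if d.strip()}

    # Devices the op has a kernel for, e.g. "device='GPU'".
    kernels_part = message.split("Registered kernels:", 1)[-1]
    kernels = set(re.findall(r"device='(\w+)'", kernels_part))

    return {
        "devices": sorted(devices),
        "kernels": sorted(kernels),
        # The op can only run if some kernel targets a visible device.
        "runnable": bool(devices & kernels),
    }


msg = ("No OpKernel was registered to support Op 'FarthestPointSample' with these attrs. "
       "Registered devices: [CPU], Registered kernels: device='GPU'")
print(diagnose_opkernel_error(msg))
```

If `runnable` comes back False with `devices` equal to `['CPU']`, the custom op itself is probably fine and the thing to investigate is why TensorFlow does not see a GPU.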

KamalM8 commented 6 years ago

I think the main reason for this error is that you are using a CPU-only TensorFlow installation (note "Registered devices: [CPU]" in the error message). Reinstall using tensorflow-gpu instead.

kingsvalley commented 6 years ago

@Kaiwind88 Have you solved the problem? I have encountered the same problem.

merium commented 6 years ago

I'm using a tensorflow-gpu installation but I'm still getting the same error.

merium commented 6 years ago

This is actually a GPU problem. Run the TensorFlow GPU test and make sure it passes. https://www.tensorflow.org/programmers_guide/using_gpu

shadowind commented 6 years ago

@kingsvalley Yes, I just found the solution, as mentioned by @merium and @KamalM8. I had tensorflow-cpu installed by accident, so the code did not recognize the GPU. I installed the GPU version, and it works.

frostinassiky commented 4 years ago

conda install tensorflow-gpu

DineshChandra94 commented 4 years ago

Check whether TensorFlow is able to access the GPU by running these commands in a Python shell:

import tensorflow as tf
tf.test.is_gpu_available()

You should get True.

Then check

tf.test.is_built_with_cuda()

Here you should also get True.

If you get False, then you will have to install tensorflow-gpu:

pip install tensorflow-gpu==XXXX

where XXXX is the version, in case you need to install a specific version.
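The two checks can be bundled into one small script. This is only a sketch: the function name `gpu_report` and the dictionary keys are invented here, and the fallback between `tf.config.list_physical_devices` (newer TensorFlow releases) and `tf.test.is_gpu_available` (older ones) is an assumption about which release is installed.

```python
def gpu_report():
    """Return a small dict describing TensorFlow's GPU support in this environment."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed at all in this interpreter.
        return {"tensorflow_installed": False, "version": None,
                "built_with_cuda": None, "gpu_visible": None}

    report = {"tensorflow_installed": True, "version": tf.__version__}
    # False here means a CPU-only build; no driver fix will help.
    report["built_with_cuda"] = bool(tf.test.is_built_with_cuda())

    config = getattr(tf, "config", None)
    if config is not None and hasattr(config, "list_physical_devices"):
        # Newer releases: list the GPUs TensorFlow can actually see.
        report["gpu_visible"] = len(config.list_physical_devices("GPU")) > 0
    else:
        # Older releases: fall back to the check quoted above.
        report["gpu_visible"] = bool(tf.test.is_gpu_available())
    return report


if __name__ == "__main__":
    print(gpu_report())
```

Roughly: `built_with_cuda` False points to a CPU-only package, while `built_with_cuda` True with `gpu_visible` False points to a driver or CUDA version problem, or to a GPU hidden from the process.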

towardthesea commented 4 years ago

Hi, you should check the following:

1. The CUDA version that was used when compiling tensorflow-gpu. This link tells you the CUDA version of the pre-built tensorflow-gpu that you may get through pip.
2. The CUDA version you actually have on your system, which is normally under /usr/local/.

If the two CUDA versions are not the same, you are out of luck, and that is what causes the error you see: you installed the right TensorFlow with GPU support, but the actual driver is not the right one to run it.
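The second check can be scripted. The sketch below assumes the toolkit was installed into versioned directories such as /usr/local/cuda-9.0, which is common on Linux but not guaranteed; the function names are made up for illustration, and the expected version still has to be looked up by hand for your TensorFlow release.

```python
import os
import re


def installed_cuda_versions(root="/usr/local"):
    """List CUDA toolkit versions found as cuda-X.Y directories under root."""
    versions = []
    if not os.path.isdir(root):
        return versions
    for name in os.listdir(root):
        # Assumed layout: one directory per toolkit, named cuda-<major>.<minor>.
        match = re.match(r"^cuda-(\d+)\.(\d+)$", name)
        if match and os.path.isdir(os.path.join(root, name)):
            versions.append((int(match.group(1)), int(match.group(2))))
    return sorted(versions)


def matches_expected(expected, root="/usr/local"):
    """True if the CUDA version TensorFlow was built against (e.g. '9.0') is installed."""
    major, minor = (int(part) for part in expected.split(".")[:2])
    return (major, minor) in installed_cuda_versions(root)


if __name__ == "__main__":
    print(installed_cuda_versions())
```

For example, `matches_expected("9.0")` returns False on a machine that only has /usr/local/cuda-10.1, which is the mismatch described above.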

Cheers.

caiobarrosv commented 3 years ago

If you have a single GPU, just set os.environ['CUDA_VISIBLE_DEVICES'] to str(0) in the main.py file:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = str(0)

and make sure you have tensorflow-gpu installed.
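One detail worth noting: CUDA_VISIBLE_DEVICES is read when CUDA is initialized in the process, so the safest place to set it is before TensorFlow is imported. A minimal sketch (the helper name `select_gpu` is hypothetical, not something from pointnet2):

```python
import os


def select_gpu(index=0):
    """Expose a single GPU to this process; call before importing tensorflow."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return os.environ["CUDA_VISIBLE_DEVICES"]


select_gpu(0)
# import tensorflow as tf  # import only after the variable has been set
```

Setting the variable to an index that does not exist hides every GPU from the process, which could produce the same "Registered devices: [CPU]" symptom.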

charlesq34 / pointnet2 #33