Closed. andrevitorelli closed this issue 3 years ago
Example code 1:
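import numpy as np
import tensorflow as tf
from tensorflow_graphics.image.transformer import perspective_transform

# Random 64x64 single-channel image, batched as (batch, height, width, channels).
image = np.random.uniform(0, 1, size=64 * 64)
image.shape = (1, 64, 64, 1)
# Identity transform matrix, batched as (batch, 3, 3).
transfmatrix = np.identity(3)
transfmatrix.shape = (1, 3, 3)
imgtf = tf.convert_to_tensor(image)
tmtf = tf.convert_to_tensor(transfmatrix)
# This call is the one that fails on the affected GPU.
perspective_transform(imgtf, tmtf)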
Example code 2:
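import numpy as np
import tensorflow as tf
from tensorflow_addons.utils.resource_loader import LazySO

# Random 64x64 single-channel image, batched as (batch, height, width, channels).
image = np.random.uniform(0, 1, size=64 * 64)
image.shape = (1, 64, 64, 1)
# 2x2 identity used as the warp input, batched as (1, 2, 2).
transfmatrix = np.identity(2)
transfmatrix.shape = (1, 2, 2)
imgtf = tf.convert_to_tensor(image)
tmtf = tf.convert_to_tensor(transfmatrix)
# Load the compiled resampler op directly, bypassing tensorflow_graphics.
_resampler_so = LazySO("custom_ops/image/_resampler_ops.so")
_resampler_so.ops.addons_resampler(imgtf, tmtf)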
In more detail, example code 2 fails with:
[...]_resampler_ops.so: undefined symbol:
_ZN10tensorflow15shape_inference16InferenceContext8SubshapeENS0_11ShapeHandleExxPS2_
oh boy >.< I don't know what's going on... maybe try to uninstall and reinstall, to make sure TF and all the extension libraries are compiled against each other. You seem to have the right version of CUDA for your TF version.
I did that so many times I lost count. I've even tried installing older versions of tfa and tfg-nightly. It works in Colab, and it works here on the CPU. It's bizarre that it stopped working between last week and this one. In any case, I'll keep using the CPU to continue development. Thanks!
hummmmm.... I'm gonna try to update all the libraries on my machine and see, maybe they broke the nightlies
Hummm I have upgraded tfa and tfg, and I see no issues on my GPU, a Titan X.
It looks to me like there may be cross-talk between two different versions of TensorFlow on your system, which might explain why it doesn't find that symbol. But I've never seen that.
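One way to look for the suspected cross-talk is to list every installed distribution that ships its own copy of TensorFlow. This is only a sketch: the names in `TF_DISTRIBUTIONS` and `EXTENSIONS` are an illustrative, non-exhaustive list, and the helper functions are hypothetical names, not part of any library.

```python
from importlib import metadata

# Distributions that each ship their own `tensorflow` package
# (illustrative list, not exhaustive).
TF_DISTRIBUTIONS = {"tensorflow", "tensorflow-gpu", "tensorflow-cpu",
                    "tf-nightly", "tf-nightly-gpu", "tf-nightly-cpu"}
# Extension packages from this thread whose compiled ops must match TensorFlow.
EXTENSIONS = {"tensorflow-addons", "tfa-nightly",
              "tensorflow-graphics", "tfg-nightly"}


def normalize(name):
    """Compare distribution names case-insensitively, treating _ as -."""
    return name.lower().replace("_", "-")


def installed_versions(names, dists=None):
    """Map each installed distribution found in `names` to its version."""
    if dists is None:
        dists = [(d.metadata["Name"], d.version) for d in metadata.distributions()]
    found = {}
    for name, version in dists:
        if name and normalize(name) in names:
            found[normalize(name)] = version
    return found


def conflicting_tensorflows(dists=None):
    """Return the TensorFlow distributions if more than one is installed."""
    found = installed_versions(TF_DISTRIBUTIONS, dists)
    return found if len(found) > 1 else {}


if __name__ == "__main__":
    print("TensorFlow builds:", installed_versions(TF_DISTRIBUTIONS))
    print("Extensions:", installed_versions(EXTENSIONS))
    clash = conflicting_tensorflows()
    if clash:
        print("More than one TensorFlow distribution is installed:", clash)
```

An empty result from `conflicting_tensorflows()` only rules out duplicates in the current environment; a second copy reachable through another site-packages directory would not show up here.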
I have scrapped my entire Python installation, cleaned it up, and rebuilt it from scratch, but it still doesn't work. With tf-nightly-gpu it gives the symbol-not-found error; with tensorflow-gpu the IPython kernel dies with GpuLaunchKernel failing to find a CUDA kernel. I'll keep investigating.
oooh wow.... ok. so I wouldn't use tf-nightly-gpu, just tensorflow 2.4.1, installed with
$ pip install tensorflow==2.4.1
$ pip install tfg-nightly tfa-nightly
tensorflow-gpu is deprecated.
But actually, coming back to your initial error, I think it only meant that your GPU is too old for the available kernel. Which is ok, we'll get you access to some sweet Nvidia V100s, and for now you can use the CPU backend locally.
I don't know why your follow-up errors with missing symbols arose though....
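The "GPU is too old for the available kernel" diagnosis can be made concrete with a small check. A CUDA binary runs on a device only if it contains machine code (`sm_XY`) for the same major compute capability with an equal or lower minor version, or PTX (`compute_XY`) for an equal or lower capability that the driver can JIT-compile. The sketch below encodes that rule; `parse_arch`, `kernel_image_available` and `report_local_gpus` are hypothetical helper names, and the parser assumes plain two-digit architecture strings such as `sm_50`. Note that `report_local_gpus` reads TensorFlow's own build list, which is not necessarily the list that tensorflow_addons was compiled with.

```python
def parse_arch(arch):
    """Split 'sm_50' or 'compute_70' into (kind, (major, minor))."""
    kind, _, digits = arch.partition("_")
    return kind, (int(digits[:-1]), int(digits[-1]))


def kernel_image_available(device_cc, build_archs):
    """Return True if a binary built for `build_archs` can run on `device_cc`."""
    for arch in build_archs:
        kind, cc = parse_arch(arch)
        if kind == "sm" and cc[0] == device_cc[0] and cc[1] <= device_cc[1]:
            return True  # native machine code for this GPU generation
        if kind == "compute" and cc <= device_cc:
            return True  # PTX that the driver can JIT-compile
    return False


def report_local_gpus():
    """Print each local GPU and whether TensorFlow's own build covers it."""
    import tensorflow as tf  # imported lazily; needs a TensorFlow install

    # Assumption: GPU builds expose this key; fall back to an empty list.
    build_archs = tf.sysconfig.get_build_info().get("cuda_compute_capabilities", [])
    for gpu in tf.config.list_physical_devices("GPU"):
        details = tf.config.experimental.get_device_details(gpu)
        cc = details.get("compute_capability")
        if cc is not None:
            print(gpu.name, cc, kernel_image_available(tuple(cc), build_archs))
```

For the GM108 in this issue (compute capability 5.0), `kernel_image_available((5, 0), archs)` is false whenever `archs` contains neither `sm_50` nor a `compute_` entry at or below 5.0.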
@andrevitorelli did you resolve this in the end ^^'? Or is it really a problem specific to your local GPU?
No, I just left it as it is. For the time being, I'll test on the CPU, but I hope to be using a better computer soon.
Ok, I'm going to close this issue for now. Hopefully it won't be a problem once we can run on better GPUs.
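For testing on the CPU on a machine that still has the GPU installed, one common approach is to hide the CUDA devices before TensorFlow is imported. A minimal sketch, with `hide_gpus` as a hypothetical helper name:

```python
import os


def hide_gpus():
    """Hide all CUDA devices from libraries imported afterwards."""
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


hide_gpus()  # must run before `import tensorflow`

# After this point TensorFlow should fall back to the CPU, e.g.:
# import tensorflow as tf
# tf.config.list_physical_devices("GPU")  # expected to be an empty list
```

Setting the variable after TensorFlow has already initialised its devices has no effect, so the call belongs at the very top of the script or notebook.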
I know this is either a galflow-specific issue or one even further upstream, but I'll post it here for tracking. gf.shear fails with a dead kernel because perspective_transform from tfg-nightly fails. This, in turn, is because _resampler_ops.so fails with the message:
F tensorflow_addons/custom_ops/image/cc/kernels/resampler_ops_gpu.cu.cc:126] Non-OK-status: GpuLaunchKernel( Resampler2DKernel<T>, config.block_count, config.thread_per_block, 0, d.stream(), data, warp, output, batch_size, data_height, data_width, data_channels, num_sampling_points) status: Internal: no kernel image is available for execution on the device
This was not happening before, but seems to have started this week, after messing around with the tensorflow-datasets installation, although I cannot see why.
hardware: GM108
cc: 5.0
driver: 460.32.03
cuda version: 11.0
tf: 2.4.1