[CUDA] cannot find matching tex2D

[CUDA] cannot find matching tex2D #26774

Open llvmbot opened 8 years ago

llvmbot commented 8 years ago


Bugzilla Link	26400
Version	trunk
OS	All
Reporter	LLVM Bugzilla Contributor
CC	@DougGregor,@hfinkel,@jlebar,@kalvdans,@Artem-B

Extended Description

texture<float4, 2, cudaReadModeElementType> texA;

__global__ void readTexels(float *d_out) {
  float4 v = tex2D(texA, 0.0f, 0.0f);
  *d_out = v.x + v.y + v.z + v.w;
}

$ clang++ texture.cu -c --cuda-gpu-arch=sm_35

texture.cu:4:14: error: no matching function for call to 'tex2D'
  float4 v = tex2D(texA, 0.0f, 0.0f);
             ^~~~~
/usr/local/cuda/include/texture_indirect_functions.h:522:39: note: candidate template ignored: couldn't infer template argument 'T'
__TEXTURE_INDIRECT_FUNCTIONS_DECL__ T tex2D(cudaTextureObject_t texObject, float x, float y)
...

nvcc can compile this file. The declaration of the matching tex2D is in texture_fetch_functions.h, and its definition is in texture_fetch_functions.hpp.

Artem-B commented 3 years ago

We may finally get textures working. https://reviews.llvm.org/D110089

jlebar commented 8 years ago

> Do you know how __ftexfetch2D and the like are defined in nvcc? Are they defined as compiler builtins or in some libraries? Thanks.

This is basically the same question, afaiu. Absent documentation (which I can't seem to find) I don't know how to directly tell whether the fns are builtins in nvcc; I can only deduce such by looking for a library implementation and not finding it. Having not been able to find a library implementation, I conclude that either one exists that I can't find, or nvcc uses builtins to implement these.

tra just told me that he thinks nvcc does use builtins for texture support. He also said that there's some magic wrt making the host and device types for the textures match up; that may or may not work currently.

llvmbot commented 8 years ago

+Justin Holewinski

Hi Justin,

Do you know how __ftexfetch2D and the like are defined in nvcc? Are they defined as compiler builtins or in some libraries? Thanks.

jlebar commented 8 years ago

> I don't understand why __ftexfetch2D should be an intrinsic. It is declared as an external function. How can that be a compiler intrinsic?

Compiler magic? For example, sin() can be declared as an external function, but it gets special handling in the compiler, in its "intrinsic" path.

Maybe these functions are in fact defined somewhere; if so this could be a relatively simple fix. I just can't seem to find them in the headers or libdevice, and I don't know where else to look.

llvmbot commented 8 years ago

I encountered this when trying to build SHOC with clang++. SHOC is a benchmark suite we demonstrate in the gpucc paper. So it's not blocking me, but I would love to see this fixed before CGO in March. I am willing to fix it if you don't have cycles.

I may be missing something, but I don't understand why __ftexfetch2D should be an intrinsic. It is declared as an external function. How can that be a compiler intrinsic?

jlebar commented 8 years ago

I may be wrong, but it seems like we're missing some builtins. Bummer.

Is this blocking you? If not we probably have bigger fish to fry at the moment, at least until thrust and tensorflow compile properly and emit reasonably-performing code.

jlebar commented 8 years ago

OK, well, problem #1 is that we're just not including texture_fetch_functions.h in __clang_cuda_runtime_wrapper.h.

But even with that fixed, we're still in trouble because now we can't find the definition of

float4 __ftexfetch1D<texture<float4, 2, (cudaTextureReadMode)0> >(texture<float4, 2, (cudaTextureReadMode)0>, float4)

when assembling. I'm not sure yet where that's supposed to come from; it doesn't seem to be defined in libdevice or in the headers. I sure hope it's not a compiler builtin.

llvmbot commented 8 years ago

I believe

__TEXTURE_FUNCTIONS_DECL__ float4 tex2D(texture<float4, cudaTextureType2D, cudaReadModeElementType> t, float x, float y)
{
  float4 v = __ftexfetch(t, make_float4(x, y, 0, 0));

  return make_float4(v.x, v.y, v.z, v.w);
}

in texture_fetch_functions.hpp is the definition this example is trying to call.

jlebar commented 8 years ago

This is the definition of tex2D we're trying to call?

template <class T>
__TEXTURE_INDIRECT_FUNCTIONS_DECL__ T tex2D(cudaTextureObject_t texObject, float x, float y)
{
  T ret;
  tex2D(&ret, texObject, x, y);
  return ret;
}

If so, how is the compiler supposed to infer the type of T if you don't specify it when making the call?
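
To make the deduction problem concrete, here is a minimal sketch in plain C++ (no CUDA headers). Float4, TextureObject, TextureRef and fetch2D are hypothetical stand-ins for float4, cudaTextureObject_t, texture<...> and tex2D; they are assumptions for illustration, not the real CUDA definitions. The sketch only shows why an overload whose T appears solely in the return type cannot match a call without an explicit template argument, while an overload that takes the texture reference type can.

```cpp
// Hypothetical stand-ins for the CUDA types; not the real CUDA definitions.
struct Float4 { float x, y, z, w; };
using TextureObject = unsigned long long;  // plays the role of cudaTextureObject_t

template <class T>
struct TextureRef {                        // plays the role of texture<T, 2, ...>
  T texel;
};

// Object-style overload: T appears only in the return type, so it cannot be
// deduced from the arguments and must be spelled out by the caller.
template <class T>
T fetch2D(TextureObject /*obj*/, float /*x*/, float /*y*/) {
  return T{};  // value-initialized placeholder result
}

// Reference-style overload: T is deduced from the first argument.
template <class T>
T fetch2D(TextureRef<T> tex, float /*x*/, float /*y*/) {
  return tex.texel;
}

float sumViaReference() {
  TextureRef<Float4> texA{{1.0f, 2.0f, 3.0f, 4.0f}};
  Float4 v = fetch2D(texA, 0.0f, 0.0f);         // T deduced as Float4
  return v.x + v.y + v.z + v.w;
}

float sumViaObject() {
  TextureObject obj = 0;
  Float4 v = fetch2D<Float4>(obj, 0.0f, 0.0f);  // T given explicitly
  return v.x + v.y + v.z + v.w;
}
```

If the reference-style overload is removed, the call in sumViaReference fails to compile because the only remaining candidate cannot infer T, which mirrors the diagnostic in the report when only the declarations from texture_indirect_functions.h are visible.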

llvmbot commented 8 years ago

assigned to @Artem-B