Open llvmbot opened 8 years ago
We may finally get textures working. https://reviews.llvm.org/D110089
Do you know how __ftexfetch2D and the like are defined in nvcc? Are they defined as compiler builtins or in some libraries? Thanks.
This is basically the same question, afaiu. Absent documentation (which I can't seem to find) I don't know how to directly tell whether the fns are builtins in nvcc; I can only deduce such by looking for a library implementation and not finding it. Having not been able to find a library implementation, I conclude that either one exists that I can't find, or nvcc uses builtins to implement these.
tra just told me that he thinks nvcc does use builtins for texture support. He also said that there's some magic wrt making the host and device types for the textures match up; that may or may not work currently.
+Justin Holewinski
Hi Justin,
Do you know how __ftexfetch2D and the like are defined in nvcc? Are they defined as compiler builtins or in some libraries? Thanks.
I don't understand why __ftexfetch2D should be an intrinsic. It is declared as an external function. How can that be a compiler intrinsic?
Compiler magic? For example, sin() can be declared as an external function, but it gets special handling in the compiler, in its "intrinsic" path.
Maybe these functions are in fact defined somewhere; if so this could be a relatively simple fix. I just can't seem to find them in the headers or libdevice, and I don't know where else to look.
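To make the sin() comparison concrete, here is a minimal host-side sketch, assuming GCC or Clang (both provide `__builtin_sin`). The function names `via_library_name` and `via_builtin` are hypothetical and only for illustration. It shows that a function with an ordinary external declaration can also be known to the compiler as a builtin; it says nothing about how nvcc actually handles __ftexfetch2D.

```cpp
#include <cassert>
#include <cmath>

// Ordinary call: sin is declared in <cmath> like any external function,
// yet GCC and Clang recognize the name and may fold or inline the call.
double via_library_name(double x) { return std::sin(x); }

// Explicit builtin spelling (GCC/Clang extension, assumed available here).
// No library declaration is needed for this form to compile.
double via_builtin(double x) { return __builtin_sin(x); }
```

Both forms compute the same function; the second has no header declaration behind it, which is the situation the texture fetch functions might be in.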
I encountered this when trying to build SHOC with clang++. SHOC is a benchmark suite we demonstrate in the gpucc paper. So, it's not blocking me, but I would love to see this fixed before CGO in March. I am willing to fix it if you don't have cycles.
I may be missing something, but I don't understand why __ftexfetch2D should be an intrinsic. It is declared as an external function. How can that be a compiler intrinsic?
I may be wrong, but it seems like we're missing some builtins. Bummer.
Is this blocking you? If not we probably have bigger fish to fry at the moment, at least until thrust and tensorflow compile properly and emit reasonably-performing code.
OK, well, problem #1 is that we're just not including texture_fetch_functions.h in __clang_cuda_runtime_wrapper.h.
But even with that fixed, we're still in trouble because now we can't find the definition of
float4 __ftexfetch1D<texture<float4, 2, (cudaTextureReadMode)0> >(texture<float4, 2, (cudaTextureReadMode)0>, float4)
when assembling. I'm not sure yet where that's supposed to come from; it doesn't seem to be defined in libdevice or in the headers. I sure hope it's not a compiler builtin.
I believe
__TEXTURE_FUNCTIONS_DECL__ float4 tex2D(texture<float4, cudaTextureType2D, cudaReadModeElementType> t, float x, float y)
{
  float4 v = __ftexfetch(t, make_float4(x, y, 0, 0));
  return make_float4(v.x, v.y, v.z, v.w);
}
in texture_fetch_functions.hpp is the definition this example is trying to call.
This is the definition of tex2D we're trying to call?
template
If so, how is the compiler supposed to infer the type of T if you don't specify it when making the call?
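The deduction question can be reproduced without CUDA. The sketch below uses hypothetical stand-ins (`Texel`, `TextureHandle`, `TextureRef`, `fetch2D`), not the real CUDA types or headers. It assumes the situation is what the diagnostic suggests: one candidate is a template whose parameter T appears only in the return type, and the call is expected to resolve to a separate non-template overload that takes a typed texture reference.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these are the real CUDA declarations.
struct Texel { float x, y, z, w; };
using TextureHandle = unsigned long long;   // plays the role of an opaque texture object
template <typename T> struct TextureRef {}; // plays the role of a typed texture reference

// T appears only in the return type, so it cannot be deduced from the
// arguments. A call without an explicit <T> drops this candidate.
template <typename T>
T fetch2D(TextureHandle /*tex*/, float x, float y) {
    return T{x, y, 0.0f, 0.0f};
}

// Non-template overload: the result type is fixed by the parameter type,
// so a plain fetch2D(ref, x, y) call needs no deduction at all.
inline Texel fetch2D(TextureRef<Texel> /*tex*/, float x, float y) {
    return Texel{x, y, 0.0f, 0.0f};
}
```

With only the template visible, `fetch2D(handle, 1.0f, 2.0f)` fails with a "couldn't infer template argument" note, while `fetch2D<Texel>(handle, 1.0f, 2.0f)` compiles. Once the non-template overload is declared, `fetch2D(TextureRef<Texel>{}, 1.0f, 2.0f)` resolves to it and the template is simply ignored.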
assigned to @Artem-B
Extended Description
texture<float4, 2, cudaReadModeElementType> texA;

__global__ void readTexels(float* d_out) {
  float4 v = tex2D(texA, 0.0f, 0.0f);
  *d_out = v.x + v.y + v.z + v.w;
}
$ clang++ texture.cu -c --cuda-gpu-arch=sm_35
texture.cu:4:14: error: no matching function for call to 'tex2D'
  float4 v = tex2D(texA, 0.0f, 0.0f);
             ^~~~~
/usr/local/cuda/include/texture_indirect_functions.h:522:39: note: candidate template ignored: couldn't infer template argument 'T'
__TEXTURE_INDIRECT_FUNCTIONS_DECL__ T tex2D(cudaTextureObject_t texObject, float x, float y)
...
nvcc can compile this file. The declaration of the matching tex2D is in texture_fetch_functions.h, and its definition is in texture_fetch_functions.hpp.