bernhardmgruber opened this issue 3 years ago
While HIP does not mention texture support in its documentation, the functionality seems to be there: https://github.com/ROCm-Developer-Tools/HIP/blob/main/include/hip/hcc_detail/texture_functions.h
I agree with your assessment. I do not think textures are widely used in computational applications nowadays: GPUs have had caches for a long time now (the lack of caches was one of the reasons to use textures for computation in the early CUDA days), and texture operations such as interpolation have limited accuracy. Emulating textures just to make them work, without performance requirements, is probably not that difficult, but it would still require continuous maintenance.
We opened a GSoC position for this feature: https://www.casus.science/news-events/events/google-summer-of-code-2021/#anchor-6
Dear all, I firmly believe this is a side quest. I think there is more important stuff to do.
While this might be a less important task for the overall goal of alpaka, ISAAC would definitely benefit from that.
To give it a bit more context: in ISAAC we can have the case that we visualize multiple data sources with different resolutions within the same kernel. Accessing the data in a texture-like way, with normalized indices and automatic interpolation, simplifies the ray-casting kernel.
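To illustrate the point, here is a minimal, self-contained CUDA sketch (not ISAAC code; the kernel name and parameters are made up for the example). It assumes the texture objects were created on the host with `cudaFilterModeLinear` and `normalizedCoords = 1`, so one normalized coordinate can sample sources of different resolutions uniformly:

```cpp
// Illustrative only: two data sources of different resolutions sampled with the
// same normalized coordinate; the hardware performs the bilinear interpolation.
__global__ void sampleTwoSources(cudaTextureObject_t texA, // e.g. a 256x256 source
                                 cudaTextureObject_t texB, // e.g. a 1024x1024 source
                                 float2 pos,               // normalized [0,1]^2 sample position
                                 float* out)
{
    float const a = tex2D<float>(texA, pos.x, pos.y);
    float const b = tex2D<float>(texB, pos.x, pos.y);
    out[0] = a + b;
}
```

Without texture support, the kernel would have to carry the extents and pitch of every source and do the index scaling and interpolation by hand.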
Maybe we can propagate work at some point from ISAAC back into alpaka.
ISAAC would greatly benefit from textures. The addressing is not really a problem, as it can easily be emulated with minimal overhead. The bigger problems, which could be solved with proper native texture support, are:
How long would a texture implementation in alpaka take? Can we test the performance gain by trying it in a CUDA-only branch of ISAAC?
Right now I'm trying to integrate native CUDA textures into ISAAC, so that I can hopefully include some performance numbers in my master's thesis. And as @psychocoderHPC said, maybe we can propagate some of the work to alpaka, as I need to implement a software emulation for all non-CUDA-capable architectures anyway.
Here is how I envisioned the design of an image accessor:
using Image = cudaTextureObject_t; // we likely need an Image type

template<typename TElem, typename TBufferIdx, typename TAccessModes>
struct Accessor<Image, TElem, TBufferIdx, 2, TAccessModes> {
    // Vec subscript to be compatible with buffer accessor
    ALPAKA_FN_HOST_ACC auto operator[](Vec<DimInt<2>, TBufferIdx> i) const -> TElem {
        return (*this)(i[0], i[1]);
    }

    // integral call operator to be compatible with buffer accessor, does texel fetch
    ALPAKA_FN_HOST_ACC auto operator()(TBufferIdx y, TBufferIdx x) const -> TElem {
        return tex1Dfetch<TElem>(texObj, y * rowPitchInValues + x);
    }

    // floating-point call operator for interpolated access
    ALPAKA_FN_HOST_ACC auto operator()(float y, float x) const -> TElem {
        return tex2D<TElem>(texObj, x, y);
    }

    Image texObj;
    TBufferIdx rowPitchInValues;         // for texel fetch
    Vec<DimInt<2>, TBufferIdx> extents;  // compatibility with buffer accessor
};
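For illustration, a kernel using this accessor could look roughly like the following. This is a hypothetical example: the kernel, its signature, and the concrete template arguments are made up and not existing alpaka API; it only exercises the two call operators from the sketch above.

```cpp
// Hypothetical usage of the sketched accessor specialization.
template<typename TAcc>
ALPAKA_FN_ACC void sampleKernel(TAcc const& acc,
                                Accessor<Image, float, int, 2, alpaka::ReadOnly> const imgAcc,
                                float* out)
{
    // exact texel fetch via the integral overload
    float const texel = imgAcc(3, 5);          // row 3, column 5

    // interpolated access via the floating-point overload
    float const filtered = imgAcc(3.5f, 5.5f); // between texels

    out[0] = texel + filtered;
}
```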
TAccessModes probably just allows alpaka::ReadOnly.
alpaka currently lacks support for the texture/image capabilities of certain backends. This concerns the CUDA backend and the SYCL backend that is currently being developed. Texture/image support was also requested in https://github.com/alpaka-group/alpaka/issues/1065. The discussion also came up during the prototyping of kernel-side accessors to buffers: https://github.com/alpaka-group/alpaka/issues/38 and https://github.com/alpaka-group/alpaka/pull/1249.
Since backend support for this feature is scarce, we have two options to implement such a facility:

1. Emulate texture/image access on top of alpaka::Buf for backends without native support.
2. Expose the feature only on the backends that support it natively.
While option 1 is certainly doable, given that only CUDA supports this feature, we might run into a situation where the feature performs suboptimally on non-CUDA backends, because we might not pick the right emulation approach for everyone. E.g., is Z-order storage really the best memory layout? How about unusual texture formats (see: https://sycl.readthedocs.io/en/latest/iface/image.html#sycl-image-channel-order)? Bilinear/trilinear interpolation on access? Edge behavior? Normalized texture coordinates? There is a lot we could get wrong, or at least do badly.
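To make these questions concrete, here is a rough sketch of what one emulation path could look like if we committed to one particular set of answers (row-major pitched storage, clamp-to-edge, bilinear filtering, normalized coordinates). All names are illustrative, and each of these hard-coded choices is exactly what we might get wrong for some backend or user:

```cpp
#include <algorithm>
#include <cmath>

// One possible emulated 2D sample: normalized coordinates, clamp-to-edge,
// bilinear filtering over a row-major pitched buffer.
template<typename TElem, typename TIdx>
auto sampleBilinearClamped(TElem const* data, TIdx rowPitchInValues,
                           TIdx width, TIdx height, float u, float v) -> TElem
{
    // normalized [0,1] coordinates -> continuous texel space
    float const x = u * static_cast<float>(width) - 0.5f;
    float const y = v * static_cast<float>(height) - 0.5f;
    float const fx = std::floor(x);
    float const fy = std::floor(y);
    float const wx = x - fx; // interpolation weights
    float const wy = y - fy;

    // clamp-to-edge addressing
    auto const clampX = [&](float c) { return static_cast<TIdx>(std::clamp(c, 0.0f, static_cast<float>(width - 1))); };
    auto const clampY = [&](float c) { return static_cast<TIdx>(std::clamp(c, 0.0f, static_cast<float>(height - 1))); };
    TIdx const x0 = clampX(fx), x1 = clampX(fx + 1.0f);
    TIdx const y0 = clampY(fy), y1 = clampY(fy + 1.0f);

    // fetch the four neighboring texels and blend
    TElem const v00 = data[y0 * rowPitchInValues + x0];
    TElem const v01 = data[y0 * rowPitchInValues + x1];
    TElem const v10 = data[y1 * rowPitchInValues + x0];
    TElem const v11 = data[y1 * rowPitchInValues + x1];
    return static_cast<TElem>((1 - wy) * ((1 - wx) * v00 + wx * v01)
                            + wy       * ((1 - wx) * v10 + wx * v11));
}
```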
Option 2 is safe from our perspective, but it locks users into CUDA (and later SYCL) when they use the feature. So, as it stands, they could just use CUDA directly.
We could also mix the options and provide only very limited texture/image support that we are confident we can emulate.
What is the strategy going forward with regard to texture/image support?