I found this issue while experimenting with Halide on the AllWinner D1 RISC-V chip, which supports only RVV 0.7.1. See https://github.com/halide/Halide/discussions/7252 for details; the full steps to reproduce are:
- LLVM: https://github.com/dkurt/llvm-rvv-071/tree/rvv-071 (based on the `releases/16.x` branch)
- Halide: https://github.com/halide/Halide/commit/7963cd4e3c23856b82567c99e0a3d16035ffe895 with a patch to disable `vle64.v` and `vse64.v`:

patch
```patch
diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 4f4b8e532..8f401c442 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -540,7 +540,7 @@ endif ()
 
 if (BUILD_SHARED_LIBS)
     message(STATUS "Building autoschedulers enabled")
-    add_subdirectory(autoschedulers)
+    # add_subdirectory(autoschedulers)
 else ()
     message(STATUS "Building autoschedulers disabled (static Halide)")
 endif ()
diff --git a/src/CodeGen_RISCV.cpp b/src/CodeGen_RISCV.cpp
index ba9abe04d..454558d11 100644
--- a/src/CodeGen_RISCV.cpp
+++ b/src/CodeGen_RISCV.cpp
@@ -151,6 +151,7 @@ string CodeGen_RISCV::mattrs() const {
             arch_flags += ",+zvl" + std::to_string(target.vector_bits) + "b";
         }
 #endif
+        arch_flags += ",-zve64x";
     }
     return arch_flags;
 }
```

main.cpp
```cpp
#include
```

RVV code generation works fine for a planar input buffer, but there is a performance problem with interleaved input. I am not sure the problem is related to RDom, but I use it to avoid the whole-register load/store instructions from RVV 1.0. I am attaching the generated assembly in case it is useful (I don't understand it, which is why I am asking for help here).
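For readers unfamiliar with the two layouts, here is a minimal sketch of how the flat offset of element `(x, y, c)` differs between them. The helper names `planar_offset` and `interleaved_offset` are hypothetical and not part of Halide, and the sketch assumes dense row-major storage. It only illustrates the addressing difference that may be relevant here; it is not a diagnosis of the slowdown.

```cpp
#include <cstddef>

// Hypothetical helpers (not Halide API): flat offset of element (x, y, c)
// in a dense width x height x channels image.

// Planar: each channel is stored as its own plane, so x has stride 1.
std::size_t planar_offset(std::size_t x, std::size_t y, std::size_t c,
                          std::size_t width, std::size_t height) {
    return c * width * height + y * width + x;
}

// Interleaved: the channels of one pixel are adjacent, so x has stride
// `channels` and c has stride 1.
std::size_t interleaved_offset(std::size_t x, std::size_t y, std::size_t c,
                               std::size_t width, std::size_t channels) {
    return (y * width + x) * channels + c;
}
```

With a width of 16 and 3 channels, `x = 0, 1, 2` for channel 0 in row 0 maps to offsets 0, 1, 2 in the planar layout but 0, 3, 6 in the interleaved one. A vector of 8 consecutive `x` values therefore reads contiguous memory in the planar case and strided memory in the interleaved case.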
vectorize(x, 8)
vectorize(x, 8)
- RVV 0.7.1 spec
- RVV 1.0 spec