halide / Halide

a language for fast, portable data-parallel computation
https://halide-lang.org

Compilation failure with RISCV+RVV vectors #8455

Closed steven-johnson closed 2 weeks ago

steven-johnson commented 2 weeks ago

Still debugging this, but here is a simple repro case that fails at Halide compilation time:

#include "Halide.h"

#include <map>
#include <string>

using namespace Halide;

int main(int argc, char **argv) {
    Var x{"x"}, y{"y"}, c{"c"};

    ImageParam input(UInt(8), 3, "input");

    Func output("output");
    output(x, y, c) = input(x, y, c);

    input.dim(0).set_bounds(0, 48)
         .dim(1).set_bounds(0, 1)
         .dim(2).set_bounds(0, 3);

    output.output_buffer()
        .dim(0).set_stride(3).set_bounds(0, 48)
        .dim(1).set_bounds(0, 1)
        .dim(2).set_stride(1).set_bounds(0, 3);

    Target t("riscv-64-android-no_runtime-no_asserts-rvv-vector_bits_128");

    output.reorder(c, x, y)
        .vectorize(x, t.natural_vector_size<uint8_t>())
        .unroll(c);

    std::map<OutputFileType, std::string> outputs = {
        {OutputFileType::static_library, "/tmp/foo.a"},
        {OutputFileType::llvm_assembly, "/tmp/foo.ll"},
    };
    output.compile_to(outputs, output.infer_arguments(), "", t);

    return 0;
}

steven-johnson commented 2 weeks ago

Failure is in CodeGen_LLVM::shuffle_vectors(): the types don't match. But the real issue seems to be in slice_vector for the non-fixed case. More info to come soon.

steven-johnson commented 2 weeks ago

OK, at least one of the issues here is that slice_vector isn't honoring its contract for non-fixed vectors. If you call slice_vector(<vscale x 8 x i8>, 0, 48), the function docstring suggests you should get back a <vscale x 48 x i8>, padded with undefs; however, if effective_vscale is 2, we get back a <vscale x 24 x i8> which seems to violate the contract.
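One way to read the numbers above, as a minimal sketch rather than Halide's actual code: VecType, effective_lanes and scalable_type_for are hypothetical names invented for illustration, and only effective_vscale and the vector types come from the comment. Under the assumption that a scalable type holds min_lanes * effective_vscale elements, <vscale x 24 x i8> covers 48 lanes when effective_vscale is 2, whereas <vscale x 48 x i8> would cover 96.

```cpp
#include <cassert>

// Hypothetical model of an LLVM vector type: <N x i8> or <vscale x N x i8>.
struct VecType {
    int min_lanes;  // N as written in the type
    bool scalable;  // true for <vscale x N x ...>
};

// Number of lanes the type holds once vscale is known.
constexpr int effective_lanes(VecType t, int effective_vscale) {
    return t.scalable ? t.min_lanes * effective_vscale : t.min_lanes;
}

// Scalable type that holds exactly `lanes` elements at the given vscale.
// Assumption: `lanes` is a multiple of `effective_vscale`.
inline VecType scalable_type_for(int lanes, int effective_vscale) {
    assert(lanes % effective_vscale == 0);
    return VecType{lanes / effective_vscale, true};
}
```

Calling scalable_type_for(48, 2) in this model yields the <vscale x 24 x i8> shape reported above, which is why the result depends on whether the length argument is read as a lane count or as the N in the type.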

EDIT: I guess you could argue that the result is 'correct', but some downstream code doesn't know how to deal with this; case in point, if you are calling slice_vector from interleave_vectors() just to hand the result to shuffle_vectors() (as in this case), we check that the vector types are identical, which they won't be in this case (one will be vscale and the other won't).

EDIT #2: Moving the assert to the end of the function (after we've normalized both vectors to fixed) looks seductive, and indeed, it makes this specific crash go away... however, we fail later on in codegen (in LLVM 19) with LLVM ERROR: Don't know how to widen the operands for INSERT_SUBVECTOR
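The difference between checking before and after normalization can be sketched with a toy model. This is not the code in shuffle_vectors(): VecType, same_type and to_fixed are hypothetical names, and the only assumption carried over from the thread is that a scalable vector can be converted to a fixed one of equal lane count once effective_vscale is known.

```cpp
// Hypothetical model of an LLVM vector type: <N x i8> or <vscale x N x i8>.
struct VecType {
    int min_lanes;  // N as written in the type
    bool scalable;  // true for <vscale x N x ...>
};

// Strict identity check: a scalable type never equals a fixed one.
constexpr bool same_type(VecType a, VecType b) {
    return a.min_lanes == b.min_lanes && a.scalable == b.scalable;
}

// Convert a scalable type to the fixed type with the same lane count
// at a known vscale; fixed types pass through unchanged.
constexpr VecType to_fixed(VecType t, int effective_vscale) {
    return t.scalable ? VecType{t.min_lanes * effective_vscale, false} : t;
}
```

In this model, <vscale x 24 x i8> and <48 x i8> fail same_type() as given, but pass once both go through to_fixed() with effective_vscale of 2, which matches the observation that moving the check later avoids this particular crash.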

steven-johnson commented 2 weeks ago

foo.ll.zip

I think the remaining bug here is just an LLVM bug we need to report; I'll do that now: https://github.com/llvm/llvm-project/issues/114900