halide / Halide

a language for fast, portable data-parallel computation
https://halide-lang.org
Other
5.86k stars 1.07k forks source link

Autoschedulers should not be mutating the function dag by calling Internal::inline_function eagerly #8282

Closed abadams closed 3 months ago

abadams commented 3 months ago

In AutoScheduleUtils.cpp, there are two calls to Internal::inline_function that actually mutate the RHS of the caller function they are called on. Autoschedulers should not be doing this! They should just mark the callee as inlined.

abadams commented 3 months ago

Concrete failure caused by this (reduced from #8279):

#include "Halide.h"

using namespace Halide;

int main(int argc, char **argv) {
    ImageParam in1(Float(32), 1);

    Var x("x"), r("r"), u("u");
    RVar ro("ro"), ri("ri");
    Func f("f"), g("g");
    Func sum1("sum1");

    RDom rdom(0, 16);
    g(r, x) = in1(x);
    f(x) = sum(rdom, g(rdom, x), sum1);

    {
        using namespace Halide::Internal;
        std::vector<Function> outputs = {f.function()};
        auto env = build_environment(outputs);

        for (auto &iter : env) {
            iter.second.lock_loop_levels();
        }

        inline_function(sum1.function(), g.function());
    }

    sum1.compute_root()
        .update(0)
        .split(rdom, ro, ri, 8, TailStrategy::GuardWithIf)
        .rfactor({{ro, u}})
        .compute_root();

    f.compile_jit(Target{"host-no_asserts"});

    return 0;
}

It probably fails because the inlining removes all reference to the RDom in the summation, so the rfactor gets confused somehow.

abadams commented 3 months ago

While it's true that the autoschedulers shouldn't call inline_function, doing that should produce a valid Halide pipeline. I think something might be bad about the implementation of inline_function if it's producing something that causes havoc with rfactor.

abadams commented 3 months ago

Root cause: rfactor defines an intermediate function from scratch. If the original function had an RDom with no reference to that RDom on the LHS or RHS (which can't be done from the front-end, but which inline_function can do), the intermediate function doesn't get the RDom attached to it, but the rvars are still placed in the dims list manually by rfactor. This caused codegen issues where the rvar names are referred to without being defined in bounds inference. The solution I'll go for is to make it possible to pass an explicit RDom to Function::define_update