halide / Halide

a language for fast, portable data-parallel computation
https://halide-lang.org

Anderson2021 autoscheduler fails on inputs larger than 199993 elements #8252

Closed by jansel 4 months ago

jansel commented 4 months ago

The repro below is just a memcopy, but I see the same behavior (with a different error threshold) if I insert pointwise ops into the kernel as well.

repro.py

import halide as hl

N = 199993

@hl.generator(name="kernel")
class Kernel:
    in_ptr0 = hl.InputBuffer(hl.Float(32), 1)
    out_ptr0 = hl.OutputBuffer(hl.Float(32), 1)

    def generate(g):
        in_ptr0 = g.in_ptr0
        out_ptr0 = g.out_ptr0
        xindex = hl.Var("xindex")
        out_ptr0[xindex] = in_ptr0[xindex]

        assert g.using_autoscheduler()
        in_ptr0.set_estimates([hl.Range(0, N)])
        out_ptr0.set_estimates([hl.Range(0, N)])

if __name__ == "__main__":
    import sys, tempfile

    with tempfile.TemporaryDirectory() as out:
        sys.argv = [
            "repro.py",
            "-g",
            "kernel",
            "-o",
            out,
            "-f",
            "halide_kernel",
            "-e",
            "static_library,h,schedule",
            "-p",
            "/home/jansel/conda/envs/pytorch/lib/libautoschedule_anderson2021.so",
            "target=host-cuda-cuda_capability_86-user_context-strict_float-no_asserts",
            "autoscheduler=Anderson2021",
        ]
        hl.main()

Output:

Unhandled exception: Internal Error at /home/jansel/Halide/src/autoschedulers/anderson2021/LoopNest.cpp:1571 triggered by user code at : Condition failed: std::abs(bounds->region_required(i).min()) < 100000: region_required min = 100000; region_required max = 100007

Traceback (most recent call last):
  File "/home/jansel/pytorch/repro.py", line 40, in <module>
    hl.main()
RuntimeError: Generator failed: -1

If I reduce N to:

N = 199992

it works correctly.

Looking at the source code, I suspect this assert can just be deleted: https://github.com/halide/Halide/blob/711dc88a3c718033ff66f48584121f06536f63e7/src/autoschedulers/anderson2021/LoopNest.cpp#L1569-L1573
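
For reference, here is a minimal Python sketch of the condition quoted in the error message, evaluated on the reported value. It only restates `std::abs(min) < 100000` from the output above. `LIMIT` and `passes_bounds_check` are hypothetical names for illustration, not Halide identifiers, and the sketch makes no claim about how the autoscheduler arrives at `region_required`.

```python
# Sketch only: mirrors the condition printed in the error message.
# LIMIT and passes_bounds_check are illustrative names, not part of Halide.
LIMIT = 100000


def passes_bounds_check(region_min: int) -> bool:
    """Return True when abs(region_min) is strictly below LIMIT."""
    return abs(region_min) < LIMIT


# "region_required min = 100000" is the value reported in the failure.
reported_min = 100000
print(passes_bounds_check(reported_min))      # False: the comparison is strict
print(passes_bounds_check(reported_min - 1))  # True
```

Because the comparison is strict, a minimum of exactly 100000 is already enough to trip the check, which matches the value shown in the output.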

jansel commented 4 months ago

#8253 removes the failing assert.