halide / Halide

a language for fast, portable data-parallel computation
https://halide-lang.org

fuzz-cse failure #8267

Open steven-johnson opened 3 months ago

steven-johnson commented 3 months ago

From internal testing, the enclosed fuzzer testcase fails the fuzz_cse case with SIGABRT and the following traceback:

=================================================================
*** SIGABRT received by PID 2090517 (TID 2090517) on cpu 22 from PID 2090517; stack trace: ***
PC: @     0x7f3667d9d981  (unknown)  gsignal
    @     0x5609b165ecd3        288  base/process_state.cc:1237 FailureSignalHandler()
    @     0x5609b09f7b29        192  third_party/googlefuzztest/internal/runtime.cc:361 fuzztest::internal::HandleCrash()
    @     0x7f3667f10e80  1395162080  (unknown)
    @     0x5609b0a0c55b        176  third_party/googlefuzztest/internal/coverage.cc:170 fuzztest::internal::ExecutionCoverage::UpdateMaxStack()
    @     0x5609b0a0d8f9         48  third_party/googlefuzztest/internal/coverage.cc:389 __sanitizer_cov_trace_const_cmp4
    @     0x5609a98ff28b        128  third_party/halide/halide/src/Type.cpp:18 Halide::Type::can_represent()
    @     0x5609a98ff15f         32  third_party/halide/halide/src/Type.cpp:131 Halide::Type::can_represent()
    @     0x5609a8554d4e        160  third_party/halide/halide/src/ConstantInterval.cpp:207 Halide::Internal::ConstantInterval::cast_to()
    @     0x5609a8ed3081        160  third_party/halide/halide/src/Simplify_Internal.h:112 Halide::Internal::Simplify::ExprInfo::cast_to()
    @     0x5609a907dedd        160  third_party/halide/halide/src/Simplify_Exprs.cpp:14 Halide::Internal::Simplify::visit()
    @     0x5609a8e9632b        160  third_party/halide/halide/src/IRVisitor.h:170 Halide::Internal::VariadicVisitor<>::dispatch_expr<>()

The injection point is apparently a Halide merge that includes the following changes:

7ca95d865 Expose BFloat in Python bindings (#8255)
7cf2951b0 Remove max size assert from Anderson2021 (#8253)
a9b8fbf7c Rework the simplifier to use ConstantInterval for bounds (#8222)
35143d206 Mark host_dirty() and device_dirty() with no_discard. (#8248)
711dc88a3 Add HVX_v68 target to support Hexagon HVX v68. (#8232)
3ea47475e [xtensa] added support for sqrt_f16 (#8247)
33d5ba953 Fix saturating add matching in associativity checking (#8220)
b5f5065c8 Add some EVAL_IN_LAMBDAs to Simplify_Sub.cpp (#8230)
8a316d1df [xtensa] Added vector load for two vectors for f16 and f32 (#8226)

testcase-5210573843529728.zip

abadams commented 3 months ago

Does this one repro outside of Google? The last six months or so of fuzzer failures found inside Google don't repro upstream, so I'm hesitant to even try this one. I think I'll just run fuzz_cse overnight instead.

abadams commented 3 months ago

(The failing assert was added in #8222)

steven-johnson commented 3 months ago

> Does this one repro outside of Google?

Have not tried.

abadams commented 3 months ago

I have set up 8 processes to fuzz cse in the open source overnight. We'll see if they can find an equivalent failure.
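For readers who want to try a similar overnight run, here is a minimal sketch of launching several independent fuzzer processes and reporting which ones exited abnormally. It is not the setup actually used in this thread. It assumes a libFuzzer-style binary at the hypothetical path `./fuzz_cse`; `-seed` and `-max_total_time` are standard libFuzzer flags, but the path, the worker count, and the time budget are illustrative. The helper names `run_workers`, `failing_workers`, `fuzz_cse_cmd`, and `stand_in_cmd` are made up for this example.

```python
import subprocess
import sys


def run_workers(make_cmd, n_workers):
    """Start n_workers processes concurrently; return exit codes in worker order."""
    procs = [subprocess.Popen(make_cmd(i)) for i in range(n_workers)]
    return [p.wait() for p in procs]


def failing_workers(exit_codes):
    """Indices of workers that exited with a non-zero status.

    On POSIX, a process killed by a signal such as SIGABRT reports a
    negative return code, which also counts as a failure here.
    """
    return [i for i, code in enumerate(exit_codes) if code != 0]


def fuzz_cse_cmd(i):
    # Hypothetical command line: assumes a libFuzzer-style binary at ./fuzz_cse.
    # Each worker gets a distinct non-zero seed and an 8-hour time budget.
    return ["./fuzz_cse", f"-seed={i + 1}", "-max_total_time=28800"]


def stand_in_cmd(i):
    # Stand-in so the sketch runs without the fuzzer: worker 2 exits with 1.
    code = 1 if i == 2 else 0
    return [sys.executable, "-c", f"import sys; sys.exit({code})"]


if __name__ == "__main__":
    codes = run_workers(stand_in_cmd, 4)
    print("workers that failed:", failing_workers(codes))
```

To point this at a real fuzzer build, pass `fuzz_cse_cmd` and the desired worker count to `run_workers` instead of the stand-in command.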

steven-johnson commented 3 months ago

Any luck?

abadams commented 3 months ago

Yes, but the luck was amazingly bad. First there was a power spike + outage that killed the process, and now my work machine has a dead motherboard. When it boots (which is rare), the CPUs run at 250 MHz and dmesg spews errors.

It didn't find any failures before the outage either.

abadams commented 3 months ago

It doesn't repro with that .zip file on linux-bot-4. I'll leave fuzz_cse running on linux-bot-4 for a while just to see if it finds anything.

abadams commented 3 months ago

No failures found overnight with 24 threads. Unless the fuzzing inside Google dedicates a lot more cycles to this, I don't think this bug exists in main.