bytecodealliance / wasm-micro-runtime

WebAssembly Micro Runtime (WAMR)
Apache License 2.0

CI: "spec test on nuttx" riscv job is failing frequently #3776

Open yamt opened 1 month ago

yamt commented 1 month ago

it seems to be the "const" or "call_indirect" test that fails most often, but it isn't very consistent. the failure logs i looked at were all ILP32F.

https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10764321632/job/29847058408

============> run const failed with a non-zero return code 101

Running: /__w/wasm-micro-runtime/wasm-micro-runtime/apps/interpreters/wamr/wamr/tests/wamr-test-suites/workspace/../../../wamr-compiler/build/wamrc --target=riscv32 --target-abi=ilp32f --cpu=generic-rv32 --cpu-features=+m,+a,+c,+f --disable-simd --disable-llvm-lto --bounds-checks=1 -o /tmp/tmppauwm0qx.aot /tmp/tmpp6zgci1d.wasm
Started with:
Create AoT compiler with:
  target:        riscv32
  target cpu:    generic-rv32
  target triple: riscv32-pc-linux-ilp32f
  cpu features:  +m,+a,+c,+f
  opt level:     3
  size level:    3
  output format: AoT file

Starting interpreter for module '/tmp/tmppauwm0qx.aot'
Running: qemu-system-riscv32 -semihosting -M virt,aclint=on -cpu rv32 -smp 1 -nographic -bios none -kernel /__w/wasm-micro-runtime/wasm-micro-runtime/nuttx/nuttx
Started with:
iwasm --heap-size=0 --repl --stack-size=1 /tmp/tmppauwm0qx.aot

Testing(return) f  = -1.797693e+308:f64
THE FINAL EXCEPTION IS Failed:
 Result 0 incorrect:
 expected: '-1.797693e+308:f64'
  got: '-inf:f64'
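
For context on the mismatch above: the expected value prints like the most negative finite f64, and -inf is the very next bit pattern after it. The sketch below only illustrates how close the two encodings are; it does not identify where the CI failure comes from. The helper name `f64_bits` is made up for this example.

```python
import math
import struct
import sys


def f64_bits(x: float) -> str:
    """Return the IEEE 754 binary64 encoding of x as 16 hex digits."""
    return struct.pack(">d", x).hex()


# Most negative finite double, formatted in the same 6-digit style as the log.
expected = -sys.float_info.max
expected_text = f"{expected:e}"

expected_bits = f64_bits(expected)  # all-ones mantissa, exponent one below the maximum
got_bits = f64_bits(-math.inf)      # maximum exponent, zero mantissa

# Distance between the two encodings when read as unsigned integers.
gap = int(got_bits, 16) - int(expected_bits, 16)

print(expected_text, expected_bits)
print("-inf", got_bits, "gap:", gap)
```

Since the encodings are adjacent, any step that rounds this constant up in magnitude by a single unit in the last place would yield -inf. Which step does that here, if any, is not known from the logs.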

https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10755545345/job/29827427650

============> run call_indirect failed with a non-zero return code 101

https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10746972440/job/29808995646

============> run call_indirect failed with a non-zero return code 101

https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10730191223/job/29758313644

============> run call_indirect failed with a non-zero return code 101

https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10711568587/job/29710199664

============> run call_indirect failed with a non-zero return code 101

https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10692765583/job/29641885907

============> run call_indirect failed with a non-zero return code 101

yamt commented 1 month ago

@no1wudi any idea?

lum1n0us commented 1 month ago

I thought @wenyongh was working on it too: https://github.com/bytecodealliance/wasm-micro-runtime/pull/3771

wenyongh commented 1 month ago

I have no good idea yet. At first I thought the timeout for compiling wasm to AOT might be too short, but increasing it didn't help. The job may report any of several errors:

(1) THE FINAL EXCEPTION IS compile wasm to aot failed
(2) THE FINAL EXCEPTION IS argument of type 'NoneType' is not iterable
(3) Testing(return) f = -1.797693e+308:f64 THE FINAL EXCEPTION IS Failed: Result 0 incorrect: expected: '-1.797693e+308:f64' got: '-inf:f64'

It fails frequently, so I have to re-run it manually many times.
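
Error (2) has the shape of a Python `TypeError` raised by an `in` test against `None`. Below is a minimal sketch of how that can happen, assuming the harness does a membership test on output that can be `None` (for example after a timeout). `read_output`, `has_result` and `has_result_guarded` are hypothetical names, not functions from the WAMR test scripts, and the exact message wording can differ between Python versions.

```python
from typing import Optional


def read_output(timed_out: bool) -> Optional[str]:
    """Hypothetical stand-in for reading test output; returns None on timeout."""
    return None if timed_out else "1:i32\n"


def has_result(output: Optional[str], expected: str) -> bool:
    # Unguarded membership test: raises TypeError when output is None.
    return expected in output


def has_result_guarded(output: Optional[str], expected: str) -> bool:
    # Guarded variant: treat missing output as "no match" instead of crashing.
    return output is not None and expected in output


try:
    has_result(read_output(timed_out=True), "1:i32")
    error = None
except TypeError as exc:
    error = exc

print(type(error).__name__, error)
print(has_result_guarded(read_output(timed_out=True), "1:i32"))
print(has_result_guarded(read_output(timed_out=False), "1:i32"))
```

If this guess is right, error (2) would be a secondary symptom of missing output and not a separate bug, but that is an assumption to check against the actual script.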

lum1n0us commented 1 month ago

🤔 I am thinking about "sequential" instead of "parallel" by removing -P from https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10764321632/workflow#L333

no1wudi commented 1 month ago

> 🤔 I am thinking about "sequential" instead of "parallel" by removing -P from https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10764321632/workflow#L333

Turning off parallel testing may reduce or even eliminate the problem, but it may also make the tests take much longer; perhaps we can give it a try.

no1wudi commented 1 month ago

Speed of "sequential" test seems acceptable: #3780

Maybe we can try it first

@wenyongh @lum1n0us @yamt

yamt commented 1 month ago

> Speed of "sequential" test seems acceptable: #3780
>
> Maybe we can try it first
>
> @wenyongh @lum1n0us @yamt

it's still failing. https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10775274630/job/29879549182

wenyongh commented 1 month ago

It looks more like an issue related to riscv32_ilp32f. The CI failed most often on the configs "riscv32_ilp32f, fp, -t aot -X" and "riscv32_ilp32f, fp, -t aot":

https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10775274630/job/29906929175#step:25:7726
https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10775274630/job/29906623580
https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10783438239/job/29906934331
https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10783852668/job/29906610576
https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10784036686/job/29907065645

wenyongh commented 1 month ago

How about re-opening #3777 and merging it?

lum1n0us commented 1 month ago

I agree. It has blocked several PRs.

yamt commented 1 month ago

> How about re-opening #3777 and merging it?

re-opened and rebased.

yamt commented 1 month ago

> It looks more like an issue related to riscv32_ilp32f. The CI failed most often on the configs "riscv32_ilp32f, fp, -t aot -X" and "riscv32_ilp32f, fp, -t aot":
>
> https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10775274630/job/29906929175#step:25:7726
> https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10775274630/job/29906623580
> https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10783438239/job/29906934331
> https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10783852668/job/29906610576
> https://github.com/bytecodealliance/wasm-micro-runtime/actions/runs/10784036686/job/29907065645

at least riscv64 was failing with similar symptoms too. i agree riscv32_ilp32f was failing more frequently though.