Missing optimization in Wasm for `x < 0 || x > constant`

osa1 commented 2 weeks ago

The attached .wasm file has this code at offset 95044:

 95044|        block $label3
 95046|          block $label2
 95048|            local.get $var6
 95050|            i64.const 0
 95052|            i64.lt_s                     ;; $var6 < 0 (signed)
 95053|            br_if $label2                ;; Jump to error branch
 95055|            local.get $var6
 95057|            i64.const 255
 95060|            i64.gt_s                     ;; $var6 > 255 (signed)
 95061|            i32.eqz                      ;; Negate the condition
 95062|            br_if $label3                ;; Jump to OK branch
 95064|          end $label2                    ;; Error branch
 95065|          call $FormatException
 95068|          call $Error._throwWithCurrentStackTrace
 95071|          unreachable
 95072|        end $label3                      ;; OK branch

This is a direct translation of Dart code:

if (value < 0 || value > 255) throw ...

When optimized with wasm-opt 765c61445 (current main branch) using flags

--all-features
--closed-world
--traps-never-happen
--type-unfinalizing
-Os
--type-ssa
--gufa
-Os
--type-merging
-Os
--type-finalizing

wasm-opt generates this:

local.tee $var0
i64.const 0
i64.ge_s                        ;; var0 >= 0 (signed)
local.get $var0
i64.const 255
i64.le_s                        ;; var0 =< 255 (signed)
i32.and
if                              ;; OK branch
  ...
else                            ;; Error branch
  ...
end

Here var0 >= 0 && var0 <= 255 part could be optimized to a single unsigned comparison

local.get $var0
i64.const 255
i64.le_u
if
  ...                           ;; OK branch
else
  ...                           ;; Error branch
end

Interestingly on a different program with the exact same Dart expression, wasm-opt generates different instructions, but with the same missing optimization:

local.get $var9
i64.const 0
i64.lt_s                ;; $var9 < 0 (signed)
br_if $label4
local.get $var9
i64.const 255
i64.gt_s                ;; $var9 > 255 (signed)
i32.eqz
br_if $label5

For this second program, an older wasm-opt (d844d2e77) generates code with a more obvious missing optimization:

local.tee $var5
i64.const 0
i64.ge_u                   ;; var5 >= 0 (unsigned)
local.get $var5
i64.const 255
i64.le_u                   ;; var5 <= 255 (unsigned)
i32.and
i32.eqz

Here ge_u against 0 will always evaluate to 1, so the code can be simplified to the same optimized code as above.

test.wasm.zip

kripken commented 2 weeks ago

Makes sense, those two-part comparisons look like they should be optimized. However in general we know we are missing a great deal of such patterns, and our plan has been to call out to LLVM for this in the long term. If you think these particular patterns are common and urgent to get in then I can look into it, but if this is more of a general issue then maybe instead it would make sense for us to re-prioritize the LLVM work. What do you think?

OTOH the specific pattern of x >= 0 (unsigned) looks like it is already implemented. I don't see (i64.ge_u (X) (i64.const 0)) in the output of wasm-opt test.unopt.wasm -O3 -all -tnh --closed-world, and I also verified with

(module
  (func $test (export "test") (param $x i32) (result i32)
    (i32.ge_u
      (local.get $x)
      (i32.const 0)
    )
  )
)

$ bin/wasm-opt a.wat -O3 --print
(module
 (type $0 (func (param i32) (result i32)))
 (export "test" (func $test))
 (func $test (param $0 i32) (result i32)
  (i32.const 1)
 )
)

Is it possible you need to run another cycle of optimizations, if you still see that somehow?

mkustermann commented 1 week ago

If you think these particular patterns are common and urgent to get in then I can look into it, but if this is more of a general issue then maybe instead it would make sense for us to re-prioritize the LLVM work. What do you think?

Would the idea be binaryen calling out to LLVM? What would that do to compile-times of binaryen?

(Somewhat offtopic) where we could also benefit:

Dart has only one integer type, namely int. It's a 64-bit int wrap-around type. So the code we emit is all 64-bit integer arithmetic & bitwise operations. Though very often such 64-bit int code could be done in 32-bit arithmetic.

The conversions between 32-bit and 64-bit come up especially when e.g. indexing arrays, getting length of arrays, ... So for example a loop over a wasm array will be using a i64 loop variable that is incrementing until end of array (which we get via array.len; i64.extend_i32_u) and the body of the loop will use i32.wrap_i64; array.get to index - where the loop induction variable could just be i32 the entire time (avoiding all i64.extend_i32_u, i32.wrap_i64).

kripken commented 1 week ago

@mkustermann

Would the idea be binaryen calling out to LLVM? What would that do to compile-times of binaryen?

Yes, that's the idea. We'd be using LLVM as a library so I don't think our compile times would be affected much (but users that want this optional feature would need to have a build of LLVM, either download it or build it).

the code we emit is all 64-bit integer arithmetic & bitwise operations. Though very often such 64-bit int code could be done in 32-bit arithmetic.

Interesting, that does sound like an opportunity to optimize. Perhaps open a new issue specifically for that? If you can provide some complete code samples that show the situation that would be helpful too.

WebAssembly / binaryen

Missing optimization in Wasm for `x < 0 || x > constant` #6685