bytecodealliance / wasmtime

A fast and secure runtime for WebAssembly
https://wasmtime.dev/
Apache License 2.0
15.13k stars 1.27k forks source link

fuzz: lacking coverage of sinkable loads in ISLE-generated x64 lowerings #6293

Open abrown opened 1 year ago

abrown commented 1 year ago

@alexcrichton sent me a coverage report for the ISLE-generated lowerings for x64. I noticed by searching for "sinkable_load" that various operations (FP?) are not using the case where the load is sunk. In the Wasmtime meeting, @alexcrichton pointed out that the fact that we don't know whether the address is aligned could be preventing load-sinking here, but recall (here) that VEX- and EVEX-encoded instructions do not have the alignment restriction. Since @alexcrichton has been adding VEX-encoded versions of many instructions, I would think that we should be able to load-sink in that case (i.e., "can load sink MINUS alignment restriction" AND "have AVX").

I don't know what to do with this issue exactly, but @fitzgen thought I should record it as one. As mentioned in the Wasmtime meeting, it would be great to have better ways to find coverage gaps. This is at least one x64 lowering gap that I noticed (there are likely others) so here's an issue!

abrown commented 1 year ago

cc: @cfallin, @jameysharp, @elliottt

abrown commented 1 year ago

Here's another dimension to this issue: even when the sinkable loads are used, they are used very infrequently. Why is that?

E.g., here:

line    count
12895   7.98k                                   let v38 = C::unpack_value_array_2(ctx, v37);
12896   7.98k                                   let v58 = &C::sinkable_load(ctx, v38.0);
12897   7.98k                                   if let Some(v59) = v58 {
12898   2                                       let v1045 = constructor_put_in_xmm(ctx, v38.1);
12899   2                                       let v1056 = &constructor_sink_load_to_xmm_mem(ctx, v59);
12900   2                                       let v1081 = constructor_x64_mulss(ctx, v1045, v1056);
12901   2                                       let v1082 = constructor_output_xmm(ctx, v1081);
12902   2                                       // Rule at src/isa/x64/lower.isle line 2189.
12903   2                                       return Some(v1082);
12904   7.97k                                   }

The sinkable load is only used a very low percentage of times.

alexcrichton commented 1 year ago

As for the frequency issue, I think it may be because it's just so rare to generate the precise shape of load-then-op. That's a pretty wishy-washy excuse though and I wouldn't be confident in it.

As for lack of sinkable loads in some blocks this is one example which is f32x4.add doesn't do anything with sinkable loads. The conditions for this are:

Even with all that in place it looks like f32x4.add alone was only translated 101 times meaning that if we go from the previous 2/7980 hit rate then we basically need to drastically increase the f32x4.add frequence to get it to hit the sinkable load path.

One aspect here I think is that SIMD is conditionally enabled, whereas f32.add is always enabled, which may end up causing simd to be a bit rarer.