Closed ghost closed 5 years ago
Mh, interesting find, thanks!
There is also a return
hook, which should be inserted before every return instruction. Maybe that is a quick workaround for the missing end
hook for your use case? What is your use case?
The difference and the exact semantics for end
hook vs. return
hook is not documented and unfortunately not even 100% clear to me. I think they somewhat overlap in functionality (i.e. both should be called on return instructions), but one is rather for "block scoping" (end hook, e.g., "When does the execution leave blocks?") vs. "What values does the function return?" (return hook).
I do not really understand this part:
Looking at the ends of nested if statements, it seems as if there may be some more misplaced instructions.
What do you mean by misplaced instruction? Hooks by Wasabi? Could you point them out in the instrumented program (ideally ~5 lines)? As a first check: is the instrumented program still valid wasm (according to WABT wasm-vaidate
)?
If you want to dig deeper or help me with fixing this (I won't have much time in the next 2 weeks unfortunately), you could
end
instructions here: https://github.com/danleh/wasabi/blob/master/src/instrument/add_hooks/mod.rs#L247) and see if there is anything surprising (although I am not sure the code is easy enough to understand).Thanks again for reporting, sorry that I don't have an immediate answer :)
Just to clarify: You mean there is a call 551
missing in this piece of (yet to be instrumented) code, somewhere in the lower half of the code that you posted
if ;; label = @4
return
I agree, and I think the bug must be somewhere in the instrumentation for return
instructions: https://github.com/danleh/wasabi/blob/master/src/instrument/add_hooks/mod.rs#L389
Maybe this code is wrong, and forgets to return the function block?: https://github.com/danleh/wasabi/blob/master/src/instrument/add_hooks/mod.rs#L408
Thank's for the hint. I actually tried the return hook and it seems that one has the same problem. After a quick look in the code it might be that Wasabi incorrectly assumes that it is processing dead code there - just a guess. For our use case it is actually okay to miss a few function returns but it would be nicer to catch them eventually.
From your explanation I see how Wasm code is meant to be instructed. However, it seems that is not exactly what it currently does.
It seems, I simply misunderstood the end hook. I was wondering about the many "end block" hooks before some of the return statements. With your explanation this makes sense. However, now I think there are a few more hooks missing in the code. I will highlight some below. The instructed code is valid Wasm and runs smoothly. This issue is only about the missing hooks.
I will try to reduce the test case to a few smaller ones.
Yes, this is what I was mainly concerned with.
if ;; label = @4
return
Instructed code with comments. I don't think I understand the internals of Wasabi in detail yet so apologies if I draw the wrong conclusions here.
(func (;798;) (type 29) (param i32 i32)
(local i32 i32 i32 i32)
i32.const 787
i32.const -1
call 548
block ;; label = @1
i32.const 787
i32.const 0
call 549
block ;; label = @2
i32.const 787
i32.const 1
call 549
local.get 0
i32.load
local.tee 3
if ;; label = @3
i32.const 787
i32.const 5
call 550
local.get 1
i32.const 15
i32.gt_u
local.set 5
local.get 0
i32.const 4
i32.add
local.tee 2
i32.load
i32.eqz
if ;; label = @4
i32.const 787
i32.const 16
call 550
local.get 5
if ;; label = @5
i32.const 787
i32.const 18
call 550
local.get 0
local.get 1
i32.const 1
i32.add
i32.const 16
local.get 0
i32.load offset=24
i32.const 0
i32.const 1783927
i32.const 305
call 2740
local.tee 2
i32.store
i32.const 0
local.set 4
i32.const 787
i32.const 44
i32.const 18
call 552
i32.const 787
i32.const 45
i32.const 16
call 552
i32.const 787
i32.const 102
i32.const 5
call 552
i32.const 787
i32.const 103
i32.const 1
call 554
>>> At this point the function block, two normal blocks, and three if-blocks are open. I see 4 end-hooks. Are two missing?
br 3 (;@2;)
else
local.get 0
i32.const 0
i32.store
i32.const 0
local.set 4
i32.const 0
local.set 2
>>> At this point the function block, two normal blocks, two if-blocks, and one else-block are open. It seems, that some end-hooks are missing before the break.
br 3 (;@2;)
i32.const 787
i32.const 44
i32.const 18
call 552
>>> Is the hook after the break statement dead code?
end
i32.const 787
i32.const 45
i32.const 16
call 552
end
local.get 5
i32.eqz
if ;; label = @4
i32.const 787
i32.const 48
call 550
local.get 0
i32.const 0
i32.store
i32.const 1
local.set 4
i32.const 0
local.set 2
i32.const 787
i32.const 57
i32.const 48
call 552
i32.const 787
i32.const 102
i32.const 5
call 552
i32.const 787
i32.const 103
i32.const 1
call 554
br 2 (;@2;)
i32.const 787
i32.const 57
i32.const 48
call 552
end
local.get 0
local.get 3
local.get 1
i32.const 1
i32.add
i32.const 16
local.get 0
i32.load offset=24
i32.const 0
i32.const 1783927
i32.const 285
call 2843
i32.store
local.get 2
local.get 1
i32.store
i32.const 787
i32.const 102
i32.const 5
call 552
i32.const 787
i32.const 103
i32.const 1
call 554
i32.const 787
i32.const 137
i32.const 0
call 554
i32.const 787
i32.const 138
call 551
return
else
local.get 1
i32.const 16
i32.lt_u
if ;; label = @4
return
else
local.get 0
local.get 1
i32.const 1
i32.add
i32.const 16
local.get 0
i32.load offset=24
i32.const 0
i32.const 1783927
i32.const 276
call 2740
local.tee 2
i32.store
i32.const 0
local.set 4
local.get 0
i32.const 4
i32.add
local.set 3
end
i32.const 787
i32.const 102
i32.const 5
call 552
end
i32.const 787
i32.const 103
i32.const 1
call 554
end
local.get 0
i32.const 4
i32.add
local.set 5
local.get 2
if (result i32) ;; label = @2
i32.const 787
i32.const 109
call 550
local.get 2
i32.const 787
i32.const 111
i32.const 109
call 552
else
i32.const 787
i32.const 111
i32.const 109
call 555
local.get 5
i32.const 787
i32.const 113
i32.const 111
i32.const 109
call 556
end
local.get 3
local.get 0
i32.load offset=20
i32.const 1
i32.add
call 34070
drop
local.get 4
if ;; label = @2
i32.const 787
i32.const 122
call 550
local.get 3
local.get 0
i32.load offset=24
call 2698
i32.const 787
i32.const 127
i32.const 122
call 552
end
local.get 0
i32.load
i32.eqz
if ;; label = @2
i32.const 787
i32.const 131
call 550
i32.const 787
i32.const 133
i32.const 131
call 552
i32.const 787
i32.const 137
i32.const 0
call 554
i32.const 787
i32.const 138
call 551
return
i32.const 787
i32.const 133
i32.const 131
call 552
end
local.get 0
local.get 1
i32.store offset=4
i32.const 787
i32.const 137
i32.const 0
call 554
end
i32.const 787
i32.const 138
call 551)
Wow, that is some excellent investigation! I see what you mean by missing end hooks and I have to agree. I will have to defer this further to next week, but my next steps for debugging would be (more as a reminder for myself, but feel free to contribute if this blocks you):
It seems the problem is with the detection of unreachable code. I understand the idea with the variable unreachable
but it seems incorrect to me in some cases. The problem of this issue arises easily when an else
statement follows a return
statement. In this case unreachable
is set but it incorrectly remains after the else
statement. The following test case will reproduce the problem:
(module
(func (export "main")
(if
(i32.lt_s
(i32.const 42)
(i32.const 42)
)
(then
return
)
(else
return
)
)
)
)
The problem seems easy to fix. However, when trying to add special treatment for the else
case I suspected more general problems with the stack consistency. If unreachable
is set and therefore we skip an end
statement we might unintentionally leave elements on the stacks. It seems to me that they can easily become inconsistent that way. If I understand correctly, both stacks (block_stack
and type_stack
) should always have the same size and they should be empty when completing the Wasm module. Imo assertions wrt the stack size would be very valuable, especially on module completion. However, I don't understand the code well enough to do so and I could not get the cargo tests
running. This is the first time I am touching Rust btw, so forgive me if I just don't know how to do it.
The else
statement seems particularly tricky wrt unreachability: in some cases only the block-closing hooks are unreachable, in others also the block-opening hooks are.
I was also wondering if adding instructions for unreachable code something to be concerned about. It seems unlikely to have dead code of that sort in Wasm modules. I might be wrong though.
Here are some more test cases to download.
Thanks, that's an excellent example. You are correct, the code around the else is wrongly instrumented. For reference, Wasabi produces:
(module
(type (;0;) (func (param i32 i32)))
(type (;1;) (func (param i32 i32 i32)))
(type (;2;) (func (param i32 i32 i32 i32 i32)))
(type (;3;) (func))
(import "__wasabi_hooks" "begin_function" (func (;0;) (type 0)))
(import "__wasabi_hooks" "i32.const" (func (;1;) (type 1)))
(import "__wasabi_hooks" "i32.lt_s" (func (;2;) (type 2)))
(import "__wasabi_hooks" "if" (func (;3;) (type 1)))
(import "__wasabi_hooks" "begin_if" (func (;4;) (type 0)))
(import "__wasabi_hooks" "return" (func (;5;) (type 0)))
(import "__wasabi_hooks" "end_if" (func (;6;) (type 1)))
(import "__wasabi_hooks" "end_function" (func (;7;) (type 0)))
(func (;8;) (type 3)
(local i32 i32 i32 i32)
i32.const 0
i32.const -1
call 0
i32.const 42
i32.const 0
i32.const 0
i32.const 42
call 1
i32.const 42
i32.const 0
i32.const 1
i32.const 42
call 1
set_local 1
tee_local 0
get_local 1
i32.lt_s
tee_local 2
i32.const 0
i32.const 2
get_local 0
get_local 1
get_local 2
call 2
tee_local 3
i32.const 0
i32.const 3
get_local 3
call 3
if ;; label = @1
i32.const 0
i32.const 3
call 4
i32.const 0
i32.const 4
call 5
i32.const 0
i32.const 7
i32.const 3
call 6
i32.const 0
i32.const 8
call 7
return
else
return
i32.const 0
i32.const 7
i32.const 3
call 6
end
i32.const 0
i32.const -1
call 5
i32.const 0
i32.const 8
call 7)
(global (;0;) (mut i32) (i32.const 1))
(export "main" (func 8)))
I will look into the unreachable code detection. I assume you are right: That part of Wasabi was very quick&dirty to begin with, so entirely plausible that it has lingering bugs.
I would rather fix instead of fully remove the unreachable code detection though. The problem came up when I tested Wasabi against the official Wasm test suite. Those examples contain unreachable code and even real-world compilers (emscripten etc.) emit it every once in a while. Without the detection, Wasabi produces type incorrect code, which cannot be validated and thus not executed.
Out of interest though: without unreachable code handling (that is, in your own branch/the PR that you submitted), does it work correctly? If so, would that be a workaround for you, for now? (As long as you don't come across unreachable code in your examples.)
You're right, it does not solve the problem on the actual benchmark. I had tested it only with the small test cases. As we use Wasabi only to find similar functions, we can actually live with the bug for now.
@frgossen-g I think this should be fixed now, thanks again for reporting and the help!
return
inside the else
branch is correctly instrumented. (I.e., the issue should be fixed.)wasm-validate
. (I.e., the fix should not have regressed anybody.)else
instruction (which closes the current block and thus should "end" the unreachable region.)If you can test again, that would be great. Afterwards I would close this.
Thanks for the fix!
Closing, let me know if it's not fixed completely.
In the following case, Wasabi misses to place the {end_function} instruction call (
call 551
). I would expect that before every return statement, one is missing.Looking at the ends of nested if statements, it seems as if there may be some more misplaced instructions. However, I do not understand this in detail so it might as well be wrong about that.
Original code:
Instructed code (--hooks=begin,end, 11 additional imports):
Wasabi imports: