Open hondogo opened 5 days ago
Reduced testcase:
(module
  (rec
    (type $parent (sub (struct (field (ref func)))))
    (type $child1 (sub $parent (struct (field (ref func)))))
    (type $child2 (sub $parent (struct (field (ref func)))))
    (type $func (func (param $child1 (ref $child1)) (param $child2 (ref $child2))))
  )
  (func $func (export "func") (type $func) (param $child1 (ref $child1)) (param $child2 (ref $child2))
    (drop
      (struct.new $parent
        (ref.func $func)
      )
    )
    (drop
      (struct.new $child1
        (ref.func $func)
      )
    )
    (drop
      (struct.new $child2
        (ref.func $func)
      )
    )
    (drop (struct.get $child2 0 (local.get $child2)))
  )
)
What happens here is that the function is exported, which makes its type public, and that in turn makes the entire rec group public. GTO does not notice that the group is public and optimizes anyway, which leads to replacing all those struct.new instructions with struct.new_default, since it wants to remove the fields. But one field is non-nullable, which errors, as the field is not actually removed (TypeUpdater does notice the type is public).
We can make GTO avoid public types, and I'll do that as a bugfix. However, separately, on this testcase that would mean GTO will do nothing at all, as there is just one huge rec group, which is a big loss. I was hoping that --minimize-rec-groups would help here, but it doesn't: running it on that reduced testcase does nothing. I was hoping it would split out the type $func, which is not recursive with anything. @tlively do you know why it isn't splitting there?
@hondogo Separately from fixing these bugs, if you can avoid mixing the types of exported functions in the big rec group, then you will get much better results (as it will keep the big rec group private, and modifiable).
@kripken Thanks for the advice! This code was generated by the Kotlin compiler. I've forwarded your advice to the Kotlin team.
--minimize-rec-groups does not modify public types, so it can't do anything here.
@tlively Oh, right, thanks. That can't help here then.
I guess in the long term if we add a way to mark a rec group as private then we could optimize using that (so a private rec group might contain public types, which we would not modify, but other ones we could).
Maybe, but even then modifying the types would change the module's external types, so it still wouldn't be safe. The real solution is for the producer to avoid this situation when originally producing the module.
Note, however, that Kotlin is currently forced into this situation where all the types are in the same rec group as the public types because it is a hack to work around failing our closed world validation. The real real solution is for us to make the changes described in #6965.
Wait, how does putting all the types in a single rec group work as a hack for a closed world validation issue?
The validation allows all types in the same rec group as a public function type to be public without complaint. This might be a bug, but I would say the existence of the validation is a bigger bug. Is a bug in a feature that is a bug a feature?
Is a bug in a feature that is a bug a feature?
:rofl:
Yeah, I guess that validation is kind of weird. Removing it as part of #6965 sgtm. In fact, if it is blocking people right now, removing the validation by itself seems fine?
I'd be worried about getting an avalanche of fuzz bugs, but other than that, I agree that removing the validation as a first step would be nice.
Sounds good, see #7019 for removing that validation.
@kripken Could you please hint to me which declarations in the original input.wasm led to the bug?
Could you please make a release with a fix?
@bashor One fix landed so far, another is open in https://github.com/WebAssembly/binaryen/pull/7018, and there is at least one more error after that lands that I haven't diagnosed yet. I can make a release when they all land.
I don't think there is anything to do on your side here. This code is just hitting some Binaryen bugs because we didn't have much real-world content that mixed public and private types in closed-world mode (e.g. Java uses closed-world but doesn't mix, and some other users mix but don't use closed-world). We'll just fix those bugs in Binaryen.
However, if you can avoid mixing public and private types in a single rec group, that will help. If closed-world validation errors were what forced you to do that, then https://github.com/WebAssembly/binaryen/pull/7019 should help once it lands.
By "mixed public and private types in a single rec group" I mean things like the example in the second comment. There the function type could have been in its own singleton rec group outside, which would have left all the private types together, where they could be optimized.
With #7022, #7018, #7019 (all not yet landed), the testcase here finally passed in closed world (even including the extra validation checks of pass-debug mode).
To get an idea of the benefit of closed world here, I ran -O3 --gufa -O3 --gufa -O3 -tnh with and without it. Code size is 6% smaller with closed world. (There may also be speed benefits.)
I also did an experiment where I moved the types of _initialize and startUnitTests out of the main rec group, and removed the exports of main and __callFunction_(()->Unit) (just moving their types wouldn't be enough, as their signatures contain other types). As a result, there are practically no public types remaining.
When I then optimize with vs. without --closed-world, that flag makes the binary 48% smaller, almost half the size.
@bashor Based on that I think there can be large benefits to emitting as few public types as possible in the Kotlin compiler.
In particular, this can be accomplished by using anyref instead of concrete reference types in the signatures of exported functions.
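A rough sketch of what that could look like, adapted from the reduced testcase earlier in this thread. This is only an illustration under assumptions: the parameter names $a and $b are made up, the struct.new calls are dropped for brevity, and it has not been checked here how Binaryen classifies these types. The idea is that the exported signature mentions only anyref, so the function type can sit outside the rec group, and the function casts to the concrete type internally.

```wat
(module
  ;; The struct types stay together in one rec group, which no exported
  ;; signature refers to.
  (rec
    (type $parent (sub (struct (field (ref func)))))
    (type $child1 (sub $parent (struct (field (ref func)))))
    (type $child2 (sub $parent (struct (field (ref func)))))
  )
  ;; The exported function's type lives outside the rec group and
  ;; mentions only anyref, not $child1 or $child2.
  (type $func (func (param anyref) (param anyref)))
  (func $func (export "func") (type $func) (param $a anyref) (param $b anyref)
    (drop
      (struct.get $child2 0
        ;; Cast down to the concrete type inside the function body.
        ;; This traps if $b is null or not a $child2.
        (ref.cast (ref $child2) (local.get $b))
      )
    )
  )
)
```

The tradeoff is a runtime cast at the boundary in exchange for keeping the struct types out of the exported signature.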
Binaryen version: 119
When executing wasm-opt with the closed-world option enabled, I got:
Fatal: Internal GlobalTypeRewriter build error: Heap type has an invalid supertype at index 1
Command executed:
wasm-opt --enable-gc --enable-reference-types --enable-exception-handling --enable-bulk-memory --enable-nontrapping-float-to-int --closed-world --type-ssa input.wasm -O3 -o optimized.wasm
Execution succeeds when performed without the closed-world option.
Attached is input.zip (an archive containing input.wasm), for which this command was executed.