WebAssembly / binaryen

Optimizer and compiler/toolchain library for WebAssembly
Apache License 2.0
7.49k stars 741 forks source link

wasm-opt error when closed-world option enabled #7015

Open hondogo opened 5 days ago

hondogo commented 5 days ago

Binaryen version: 119

When executing wasm-opt with enabled option closed-world got: Fatal: Internal GlobalTypeRewriter build error: Heap type has an invalid supertype at index 1

Command that execute: wasm-opt --enable-gc --enable-reference-types --enable-exception-handling --enable-bulk-memory --enable-nontrapping-float-to-int --closed-world --type-ssa input.wasm -O3 -o optimized.wasm

Execution is successful when performed without option closed-world.

In attachment there is input.zip file (archive with input.wasm) for which this command was executed. input.zip

kripken commented 5 days ago

Reduced testcase:

(module
  (rec
    (type $parent (sub (struct (field (ref func)))))
    (type $child1 (sub $parent (struct (field (ref func)))))
    (type $child2 (sub $parent (struct (field (ref func)))))

    (type $func (func (param $child1 (ref $child1)) (param $child2 (ref $child2))))
  )

  (func $func (export "func") (type $func) (param $child1 (ref $child1)) (param $child2 (ref $child2))
    (drop
      (struct.new $parent
        (ref.func $func)
      )
    )
    (drop
      (struct.new $child1
        (ref.func $func)
      )
    )
    (drop
      (struct.new $child2
        (ref.func $func)
      )
    )
    (drop (struct.get $child2 0 (local.get $child2)))
  )
)

What happens here is that the entire rec group is public because of that function, which is exported, and which then makes the entire rec group public. GTO is not noticing that it is public and optimizing, which leads to making those struct.news all by new_defaults, since it wants to remove the fields. But one field is non-nullable, which errors, as the field is not actually removed - TypeUpdater does notice the type is public.

We can make GTO avoid public types, and I'll do that as a bugfix. However, separately, on this testcase that would mean GTO will do nothing at all, as there is just one huge rec group, which is a big loss. I was hoping that --minimize-rec-groups would help here, but it doesn't - running it on that reduced testcase does nothing. I was hoping it would split out the type $func which is not recursive with anything - @tlively do you know why it isn't splitting there?

kripken commented 5 days ago

7017 resolves one issue here, but I also see an error on SignaturePruning.

@hondogo Separately from fixing these bugs, if you can avoid mixing the types of exported functions in the big rec group, then you will get much better results (as it will keep the big rec group private, and modifiable).

hondogo commented 5 days ago

@kripken Thanks for advise! This code was generated by Kotlin compiler. I've forwarded your advise to Kotlin team.

tlively commented 5 days ago

--minimize-rec-groups does not modify public types, so it can't do anything here.

kripken commented 5 days ago

@tlively Oh, right, thanks. That can't help here then.

I guess in the long term if we add a way to mark a rec group as private then we could optimize using that (so a private rec group might contain public types, which we would not modify, but other ones we could).

tlively commented 5 days ago

Maybe, but even then modifying the types would change the module's external types, so it still wouldn't be safe. The real solution is for the producer to avoid this situation when originally producing the module.

Note, however, that Kotlin is currently forced into this situation where all the types are in the same rec group as the public types because it is a hack to work around failing our closed world validation. The real real solution is for us to make the changes described in #6965.

kripken commented 5 days ago

Wait, how does putting all the types in a single rec group work as a hack for a closed world validation issue?

tlively commented 5 days ago

The validation allows all types in the same rec group as a public function type to be public without complaint. This might be a bug, but I would say the existence of the validation is a bigger bug. Is a bug in a feature that is a bug a feature?

kripken commented 5 days ago

Is a bug in a feature that is a bug a feature?

:rofl:

Yeah, I guess that validation is kind of weird. Removing it as part of #6965 sgtm. In fact, if it is blocking people right now, removing the validation by itself seems fine?

tlively commented 5 days ago

I'd be worried about getting an avalanche of fuzz bugs, but other than that, I agree that removing the validation as a first step would be nice.

kripken commented 5 days ago

Sounds good, see #7019 for removing that validation.

bashor commented 5 days ago

@kripken Could you please hint to me which declarations in the original input.wasm led to the bug? Could you please make a release with a fix?

kripken commented 5 days ago

@bashor One fix landed so far, another is open in https://github.com/WebAssembly/binaryen/pull/7018, and there is at least one more error after that lands that I haven't diagnosed yet. I can make a release when they all land.

I don't think there is anything to do on your side here. This code is just hitting some Binaryen bugs because we didn't have much real-world content that mixed public and private types in closed-world mode (e.g. Java uses closed-world but doesn't mix, and some other users mix but don't use closed-world). We'll just fix those bugs in Binaryen.

However, if you can avoid mixing public and private types in a single rec group, that will help. If closed-world validation errors were what forced you to do that, then https://github.com/WebAssembly/binaryen/pull/7019 should help once it lands.

By "mixed public and private types in a single rec group" I mean things like the example in the second comment. There the function type could have been in its own singleton rec group outside, which would have left all the private types together, where they could be optimized.

kripken commented 4 days ago

With #7022, #7018, #7019 (all not yet landed), the testcase here finally passed in closed world (even including the extra validation checks of pass-debug mode).

To get an idea of the benefit of closed world here, I ran -O3 --gufa -O3 --gufa -O3 -tnh with and without it. Code size is 6% smaller with closed world. (There may also be speed benefits.)

kripken commented 4 days ago

I also did an experiment where I

As a result, there are practically no public types remaining.

When I then optimize with vs without --closed-world, that flag makes the binary 48% smaller - almost half the size.

@bashor Based on that I think there can be large benefits to emitting as few public types as possible in the Kotlin compiler.

tlively commented 4 days ago

@bashor Based on that I think there can be large benefits to emitting as few public types as possible in the Kotlin compiler.

In particular, this can be accomplished by using anyref instead of concrete reference types in the signatures of exported functions.