WebAssembly / design

WebAssembly Design Documents
http://webassembly.org
Apache License 2.0
11.4k stars 694 forks source link

Function body maximum size #1138

Open Cellule opened 6 years ago

Cellule commented 6 years ago

Among the browser vendors we agreed upon a limit of function body size to allow in webassembly module of 128KB.

However, the benchmark Godot has a function of 151975Bytes or 148KB (function 2187).

For some reason Chrome and Firefox allow it. Now we end up in Edge where we invalidate this module.

It makes me think that we didn't understand the limit correctly. According to v8/module-decoder.cc, it looks like it is checking the size in bytes.

Can someone from the other browser vendors help me understand this ?

lukewagner commented 6 years ago

D'oh, this particular limit check wasn't factored into the shared validation code and thus was only being checked by WebAssembly.validate in FF. That's an awfully big function; I'll see if perhaps Godot isn't enabling optimizations in their build. But if it is just some giant interpreter loop or something, perhaps this is evidence we've set the limit too low.

lukewagner commented 6 years ago

Also it appears that function index 21555 is even bigger, with 219504 bytes.

binji commented 6 years ago

It makes me think that we didn't understand the limit correctly. According to v8/module-decoder.cc, it looks like it is checking the size in bytes.

Unfortunately, it looks like DecodeWasmFunction is only used by SyncDecodeWasmFunction which is only used by a unittest. The limit is also used in the streaming decoder, but it's not clear why it's not erroring out there (likely because the streaming decoder is not used here). I filed a v8 bug: https://crbug.com/v8/6959.

jfbastien commented 6 years ago

FWIW we don't apply all the limits because some didn't make sense to our implementation. The ones we apply are here: https://github.com/WebKit/webkit/blob/master/Source/JavaScriptCore/wasm/WasmLimits.h

In particular, I dropped function size because it's a ridiculously low limit which basic wasm programs exceed (e.g. the benchmarks from the PLDI paper trips it).

Cellule commented 6 years ago

Do we want to revisit the limits that we use and make them part of the spec (either js or web spec)? I agree that 128kb is too low.

littledan commented 6 years ago

In general, if Wasm programs should work consistently across browsers, having any sort of restrictions be written in the specification and with conformance tests seems like a good idea to me. Otherwise, there's a natural "race to the bottom" (or top, here?) where browsers try to be compatible with each other as much as possible and permit what all the other ones permit. This sort of dynamic is responsible for a lot of the mess that we have in JS and the rest of the web platform.

I don't understand the details of this particular limit; is there any past discussion that was written down anywhere about it?

rossberg commented 6 years ago

@littledan, a race to the top is usually a good thing when it comes to arbitrary limits. Every language implementation has tons of implicit limitations, almost no language even documents their possibility (C is the only exception that comes to mind, and only for some hand-selected cases). Picking concrete numbers on the spec level makes no sense in most cases, because the limits are not static but depend on many complex platform-dependent and implementation-specific variables.

lukewagner commented 6 years ago

If it doesn't make sense in the spec, documenting the limits somewhere for toolchains to refer to makes sense. Maybe a new .md in webassembly/tool-conventions?

Anyhow yeah, sounds like 128kb is way too low. I think having a chosen limit is better than each engine having its own implicit limit unless we think all engines will be able to handle functions up to the maxModuleSize (1gb). I doubt that, especially given the subtle limits that show up like the 32mb local jump range on ARM. How about 1mb?

binji commented 6 years ago

It might be nice to have some more information about the largest functions we've seen before making this choice.

lukewagner commented 6 years ago

@kripken What are the biggest you've ever seen? Alon was telling me he was thinking of bringing the "outlining" feature (which splits up crazy-big functions) to wasm. With this, we wouldn't necessarily need to accommodate the biggest possible functions.

kripken commented 6 years ago

I've been thinking about this, but I don't yet have a good idea for something like outlining in wasm (the asm.js approach was quite hackish and we had a bunch of bugs there). I may end up just emitting a warning in the meantime.

Size-wise, I don't have good data on this, none of the biggest ones were recent, and I measured them as lines in the text format anyhow.

Could we do 64MB? I've never seen a wasm module that big, let alone a single function, so it seems like a safe bound. Or is there a benefit to a lower value? (I'm not sure why we need a bound at all so I'm missing some background I guess.)

lukewagner commented 6 years ago

Assuming there is a small factor expansion between wasm bytecode and machine code (I've seen 2-4x), and given ARM(32,64)'s 32mb local-jump limit, I'd be surprised if engines didn't fail in practice well before 32mb. It's possible of course to handle >32mb functions if you design for it, but I expect not all engines do. Happy to hear from other engines though.

RyanLamansky commented 6 years ago

One engine's data point: my .NET-based implementation inherits all of its limitations from the .NET framework itself, which doesn't seem to have any hard limit on function size (at least none that I've heard of or have been able to find after some quick searches).

titzer commented 6 years ago

I've fixed the V8 bug, and we now enforce the function size limit.

titzer commented 6 years ago

I still think it is a good idea that engines have a "gentleman's agreement" to enforce a static size limit for functions. I'm OK bumping up the 128k limit, but I think we should probably pick a reasonable one before the race to the top picks up :)

lukewagner commented 6 years ago

1mb?

kripken commented 6 years ago

@titzer Perhaps it's best not to apply the 128K limit right now? It will break Godot and likely other things. This will cause confusion and frustration for users.

flagxor commented 6 years ago

We've seen as high as 1.5MB, for instance this: https://bugs.chromium.org/p/chromium/issues/detail?id=769607

MikeHolman commented 6 years ago

If we're already seeing pathological cases of 1.5mb, then maybe we should bump all the way up to 10mb?

@kripken For now though, do you know how Godot is being built/is there any way to compile it smaller? Edge just shipped, and this isn't a reliability issue for existing production websites so we're going to be stuck with 128kb limit until next Windows release.

kripken commented 6 years ago

@MikeHolman: When is the next Windows release?

We can try to figure out a solution for Godot, but we may need to do something more radical if we can't expect a browser update soon. Because we've already seen other examples from other codebases like the one @flagxor mentioned, and also there may not be a practical solution in some cases, if the code is from a code generator, or something else that can't be worked around by preventing inlining.

In asm.js we experimented with outlining/function splitting. Overall it was messy and buggy, so I'd rather not repeat that, but maybe we have no choice.

MikeHolman commented 6 years ago

I expect it will be in about 6 months, so I think a short term targeted workaround will be sufficient.

lukewagner commented 6 years ago

@flagxor Oh hah, from your comments, looks like that 1.5mb was a data section expressed as one big function.

@MikeHolman I've emailed the Godot author (https://github.com/eska014/godot/tree/wasm-benchmark) asking if all optimizations are enabled and, if so, if the function can be artificially split somehow.

lukewagner commented 6 years ago

@MikeHolman That being said, I wouldn't be surprised if this broke for various uses in the wild over the next 6 months so, since it's such a trivial fix, it may be worth pushing to uplift the fix.

lukewagner commented 6 years ago

With a worst-case 4x bytecode-to-machine-code expansion, a 10mb function could hit ARM 32mb limits; so to be a bit conservative, could we do something smaller than 10mb?

MikeHolman commented 6 years ago

Windows is conservative about servicing. Even though it is a small fix, it needs to represent a significant reliability issue or a security issue. I'm not going to be able to convince them that this super new feature has enough web presence and hits the pathological case often enough to meet that bar.

I'm hesitant to go very much lower than 10mb, since we're still in early days and already have an example less than an order of magnitude away. Is 7mb a reasonable compromise?

lukewagner commented 6 years ago

Ah, makes sense. 7mb sgtm

littledan commented 6 years ago

If there's a new limit, should it be included in some specification and shared tests?

kripken commented 6 years ago

@MikeHolman 6 months sounds like a long time, so I guess we should look into function splitting on the toolchain side even if godot has a manual workaround - thoughts?

Regardless it's interesting to know if they have a workaround. @lukewagner btw, fwiw that wasm looks optimized to me.

lukewagner commented 6 years ago

@littledan I don't know about whether it belongs in the specifications... maybe so as an appendix. Above I suggested the tool-conventions repo.

littledan commented 6 years ago

@lukewagner Appendix seems fine to me. The important thing would be that implementations actually converge on something shared (if we want to have a shared limit like this; maybe @rossberg or @jfbastien disagree). This could even be something as rough as shared tests, where a comment in the tests points to this bug thread.

It just seems unfortunate that we're in this position of having had a miscommunication that might lead to incompatibility between browsers; I'm wondering if we can prevent more of these in the future.

jfbastien commented 6 years ago

I'm OK with an appendix, as long as limits are meaningful for browser and non-browser embeddings, and make sense for all implementations. When I imported the V8 limits not all of them were sensible for our implementation (here's our list).

titzer commented 6 years ago

I hereby propose we change the function limit size to 7654321 bytes.

lukewagner commented 6 years ago

@MikeHolman Good news; looks like both big functions could be fixed producing a build that validates with the 128kb limit. New builds up soon.

@titzer Sure; I guess that particular number highlights the arbitrariness of the limit.

lukewagner commented 6 years ago

Looks like @Cellule agrees :) I'll land a patch in FF too. @jfbastien y tu tambien?

@littledan was talking about adding an engine internal limits section to the wasm-web spec, so perhaps that could be the resolution of this issue.

Also: the Godot demo has been updated to validate with the original 128kb limit.

MikeHolman commented 6 years ago

Awesome, thanks Luke for following up on that! Glad we were able to get it to meet the original constraints. Unfortunately it looks like Godot requires WebGL 2 support, which Edge is still lacking :(

lukewagner commented 6 years ago

/o\ I'll ask about that too; maybe it's not actually necessary but just used in some incidental way.

jfbastien commented 6 years ago

@jfbastien y tu tambien?

Lo que sea.

WebGL 2 is a deal-breaker for us as well.

titzer commented 6 years ago

@dschuff Suggested on our internal bug to make this a limit of the web embedding. I'm fine with that, although for now we enforce the 7654321 byte limit in the engine itself.

jfbastien commented 6 years ago

@dschuff Suggested on our internal bug to make this a limit of the web embedding. I'm fine with that, although for now we enforce the 7654321 byte limit in the engine itself.

That's a good point, worth considering. @littledan if you add it to the spec I think that what @dschuff suggests is a good idea, so I'd hold off until we decide where it goes.

In any case, this is probably worth a poll at an upcoming meeting.

lukewagner commented 6 years ago

As an update, it looks like Godot simply uses OpenGL ES 3.0, so the dependency on WebGL2 isn't superficial. However, they do have plans to add OpenGL ES 2.0 in a few months which would give us a WebGL1 workload.

kripken commented 6 years ago

The Go language is planning to manage the call stack manually, i.e., without using call or separate wasm functions. Approaches like that might hit imposed function size limits (they might be able to split up the one big function into chunks, but it's not straightforward).

titzer commented 6 years ago

@kripken Seems like that is just a prototype. It would be truly unfortunate, and probably have extremely bad performance, to generate one large function. There was discussion earlier in that thread about ignoring the stack machine and emulating a register machine. Seems like that work is preliminary and we shouldn't base decisions on it here.

kripken commented 6 years ago

@titzer Agreed, yeah. I just mentioned it as it seemed an interesting case for non-standard code generation (given their requirement to support things like bidirectional longjmp, a single big function or an interpreter may be their only options, all of which are not that good obviously).

icefoxen commented 6 years ago

Maybe there should be a specified minimum size instead of a maximum size?

lars-t-hansen commented 6 years ago

The proposal is for a maximum number of functions and a maximum function body size, but the product of those two is a large number of bytecodes (7TB total code section size). Do we want to place an additional limit on the code section size?

(Not sure about this myself. We may end up with a hard limit in SpiderMonkey anyway to accomodate streaming compilation, but this is under discussion, and I'm not sure about the wisdom of legislating it.)

littledan commented 6 years ago

Have the limits been added to any sort of specification document? I'm having trouble finding them.

If they haven't been added yet: We've talked in the above thread about putting the limits in an appendix. Would the implementation limits appendix be an appropriate place, or should it be some other document? Or just in tests?

icefoxen commented 6 years ago

A way to query the limits imposed by the implementation would also be nice. I am reminded of https://www.khronos.org/registry/vulkan/specs/1.0/man/html/VkPhysicalDeviceLimits.html for example.

jfbastien commented 6 years ago

Not done, tracked here: https://github.com/WebAssembly/meetings/issues/100

rossberg commented 6 years ago

Would the implementation limits appendix be an appropriate place

No, since these are Web-specific limitations. The obvious place would be the Web API spec.

HarikrishnanBalagopal commented 8 months ago

Golang seems to have hardcoded a limit of 2^16 bytes = 64KB on function size (WASM page size). Have made an issue documenting the details https://github.com/golang/go/issues/64856

@binji I have gone through the https://github.com/binji/raw-wasm demos repo and implemented a runtime in JS that could run those demos. I haven't found a reason why the function size would have to be limited to just 1 WASM page.

It would be very helpful if someone knowledgeable on the topic could comment on why this limit exists and how to work around it. Thank you!