WebAssembly / design

WebAssembly Design Documents
http://webassembly.org
Apache License 2.0

Will there be a JS -> WASM compiler? #219

Closed bguiz closed 9 years ago

bguiz commented 9 years ago

After scrutinizing the design docs, I was able to find a mention of a polyfill that would transpile WASM -> JS. I was also able to find mention of a C++ -> WASM compiler.

However, I was unable to find any mention of a JS -> WASM compiler.

The majority of web developers are fluent in JavaScript, and thus a JS -> WASM compiler would be ideal. Web developers will want to continue writing their websites in JavaScript instead of C++. I am therefore not sure what to make of the fact that neither the MVP nor the post-MVP sections mention a JS -> WASM compiler. What is happening here?

duanyao commented 7 years ago

I guess it is unlikely that JS to WASM compiling would boost sustained performance; however, it may improve code size and parsing time due to the binary encoding, which is still useful.

I think we can just define a binary encoding for JS, and ignore linear memory etc for now. This is simple and polyfillable.

RyanLamansky commented 7 years ago

@kabirbaidhya The main issue with JS -> WASM right now is that you can't build an efficient garbage collector inside of it as there is no way to analyze the stack to see which objects are alive. This means you'd have to place a copy of all object references in linear memory (the heap) and keep it synchronized, seriously degrading performance. It also lacks shared memory multi-threading, so background garbage collection is impossible. Future versions of WASM will be able to tap into the host browser's garbage collection engine, eliminating this problem.
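
A rough sketch of the bookkeeping described above, written in TypeScript rather than Wasm for readability. Every name here (`ShadowStack`, `pushRoot`, `enterFrame`, `leaveFrame`, `markFromRoots`) is hypothetical, and a `Uint32Array` merely stands in for linear memory. It only illustrates that each live reference has to be mirrored into memory the collector can scan, and kept in sync on every call and return:

```typescript
// Hypothetical illustration: a Uint32Array stands in for Wasm linear memory,
// and object "references" are plain integer handles.
class ShadowStack {
  private memory = new Uint32Array(1024); // region reserved for GC roots
  private top = 0;                        // index of the next free slot

  // Remember where the current frame starts so it can be unwound later.
  enterFrame(): number {
    return this.top;
  }

  // Must be called whenever a function keeps a reference in a local.
  pushRoot(handle: number): void {
    if (this.top >= this.memory.length) throw new Error("shadow stack overflow");
    this.memory[this.top++] = handle;
  }

  // Must be called on every function exit to drop that frame's roots.
  leaveFrame(frameBase: number): void {
    this.top = frameBase;
  }

  // The collector can only discover live objects by scanning this region.
  roots(): number[] {
    return Array.from(this.memory.subarray(0, this.top));
  }
}

// objectGraph maps a handle to the handles it references.
function markFromRoots(stack: ShadowStack, objectGraph: Map<number, number[]>): Set<number> {
  const live = new Set<number>();
  const pending = stack.roots();
  while (pending.length > 0) {
    const handle = pending.pop()!;
    if (live.has(handle)) continue;
    live.add(handle);
    for (const child of objectGraph.get(handle) ?? []) pending.push(child);
  }
  return live;
}

// Usage: object 1 references object 2; object 3 is unreachable garbage.
const objectGraph = new Map<number, number[]>([[1, [2]], [2, []], [3, []]]);
const shadow = new ShadowStack();
const frameBase = shadow.enterFrame();
shadow.pushRoot(1);
const liveDuringCall = markFromRoots(shadow, objectGraph);
shadow.leaveFrame(frameBase);
const liveAfterCall = markFromRoots(shadow, objectGraph);
```

The extra `pushRoot`/`leaveFrame` traffic on every call is the synchronisation cost the comment refers to.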

The other major barrier to JS -> WASM is the fact that nearly all objects are fully dynamic. WASM intrinsically expects everything to be purely static, so complex mapping layers, emulation, and dynamic code generation would be needed to approach standard JS performance. Fortunately, TypeScript helps with this, so a strict subset of TypeScript may be able to target WASM to some degree. I know there's at least one person trying to build this.
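
A small illustration of the dynamism in question, using plain JavaScript semantics written as TypeScript. The names `sumFields` and `point` are made up for this example. The same function sees an object whose field types, layout, prototype and even arithmetic behaviour change at run time, so a static compiler cannot assume a fixed representation:

```typescript
// Hypothetical example: one call site, several run-time behaviours.
function sumFields(obj: any): any {
  return obj.x + obj.y; // "+" may be numeric add, string concat, or a valueOf() call
}

const point: any = { x: 1, y: 2 };
const asNumbers = sumFields(point);        // numeric addition

point.y = "2";                             // the field changes type
const asString = sumFields(point);         // string concatenation

delete point.y;                            // the field disappears from the object...
Object.setPrototypeOf(point, { y: 40 });   // ...and is now found on a new prototype
const viaPrototype = sumFields(point);     // lookup walks the prototype chain

point.x = { valueOf: () => 2 };            // arbitrary user code now runs inside "+"
const viaValueOf = sumFields(point);
```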

C/C++ works well with the first release of WASM due to the fact that WASM's limitations are closely aligned with native hardware limitations, which C/C++ are designed to target.

jayphelps commented 7 years ago

FWIW there's a great slide deck on how V8 handles JavaScript arithmetic: https://docs.google.com/presentation/d/1wZVIqJMODGFYggueQySdiA3tUYuHNMcyp_PndgXsO1Y/edit

tl;dr this is just one example where the reality is much harder than it might seem and in practice isn't very meaningful since the native VM can (and likely will) do a better, faster job since it's truly native and has access to resources and APIs wasm never will--and (probably) most importantly, years of iteration.

That's not to say a subset of JS/TypeScript couldn't proliferate successfully, like ThinScript, TurboScript, etc. They'll look very familiar to JS-programmers at first glance.

I still think these are good questions to ask, and continue asking. It's critical we all understand the use cases and future of WebAssembly--as well as non-goals.

rossberg commented 7 years ago

On 6 April 2017 at 00:36, Ryan Lamansky notifications@github.com wrote:

The other major barrier to JS -> WASM is the fact that nearly all objects are fully dynamic. WASM intrinsically expects everything to be purely static, so complex mapping layers, emulation, and dynamic code generation would be needed to approach standard JS performance. Fortunately, TypeScript helps with this, so a strict subset of TypeScript may be able to target WASM to some degree. I know there's at least one person trying to build this.

Unfortunately, I doubt that TypeScript helps in this regard. To encompass JS legacy, its type system is so deeply and fundamentally unsound that there is no interesting "strict" subset. For example, such a subset would need to exclude any of TS's notion of subtyping, which would make it pretty much useless in its domain.

There have been nice research papers, e.g. on SafeTypeScript, but not only are they more restricted, they also require substantial amounts of costly additional runtime bookkeeping and checks, defeating the purpose of the discussion (and effectively being a different language than JS/TS).

agnivade commented 7 years ago

Maybe I am not getting something, but one of the ideas of WebAssembly is to directly load the AST to avoid the parse time of JS, right?

So, if we have a tool that compiles JS to this AST format and passes that to the browser, won't it benefit from avoiding the time to parse?

rossberg commented 7 years ago

@agnivade, it's an AST for a completely different, much more low-level language.

If you were to compile JS to Wasm offline, then yes, you wouldn't need to parse on the client side (just decode). At the same time, because JS is so complicated, code size would drastically increase, probably by a factor of 5 or more, which is a much higher cost. (And that isn't even taking into account that you probably would also need to include an entire implementation of a JS VM runtime system in Wasm, which easily is megabytes of code.)

Moreover, without a representation of the sources you cannot implement most of the dynamic optimisations that are crucial for getting JS anywhere near fast. These optimisations rely on recompiling the original source code and specialising it based on profiling information. An already compiled Wasm AST doesn't enable that; you'd need an AST of the original source program.

agnivade commented 7 years ago

@rossberg-chromium - Thanks a lot. That clears up a lot! One doubt though:

And that isn't even taking into account that you probably would also need to include an entire implementation of a JS VM runtime system in Wasm, which easily is megabytes of code

Why would you need the VM runtime system? Isn't the browser itself the VM runtime? I just want the code to be in the AST format so that the browser can readily start executing it. I get that the net size will increase because the language itself is complex, and we cannot implement dynamic optimisations. But why do we need to bundle the VM runtime itself, when we have the browser for that?

rossberg commented 7 years ago

@agnivade, without dynamic optimisations JavaScript will be slow, and I mean really slow, like 100x slower, maybe worse.

By "runtime" I don't mean browser stuff like the DOM, but the bare JS language support, i.e., things like garbage collector, object representations, primitives and base libraries, etc. That is pretty huge for JavaScript, and you'd need a reimplementation of all of it inside Wasm.

And of course, you'd also need an interface layer to the DOM.

agnivade commented 7 years ago

Ok I think I understand things a bit better now. I thought that the

garbage collector, object representations, primitives and base libraries, etc.

can be used from the browser itself. And I can just let the browser load the AST and do its usual job. But now I realize that everything needs to be packaged inside WASM itself.

distransient commented 7 years ago

A universal-ish scripting language bytecode would be interesting though! A compile target designed around efficiently transporting and executing programs written in dynamically typed, garbage-collected languages, with all the bizarre edge cases of the popular ones (JavaScript, Ruby, Python, Lua) covered, in some cases, by special opcodes and structures.

rossberg commented 7 years ago

@distransient, so you want the combinatorial insanity of all the scripting languages? I'm optimistic that it would be possible for humanity to gather the engineering resources to specify and implement that efficiently by 2050. :)

nidin commented 7 years ago

Those who are interested in compiling TypeScript to WebAssembly using LLVM, check out this research project: https://github.com/MichaReiser/speedy.js Looks like this discussion is never ending...

distransient commented 7 years ago

@rossberg-chromium I said it would be "interesting", not easy or pretty 😉

carlsmith commented 7 years ago

A universal-ish scripting language bytecode would be interesting...

While WASM is incrementally evolving to eventually support stuff like Python, we could have first-class support for developing scripting languages for the Web much sooner than WASM can provide it, if we approached the problem from the opposite end at the same time.

It should be relatively simple for JavaScript engines to expose their ability to execute JavaScript ASTs, and the ASTs they accepted could be standardised (even if they're immediately converted to a non-standard, intermediate format internally).

We could simply combine an AST format (like estree) and a serialisation format (like JSON) to create a new file format with a new extension. If browsers supported the format in script tags and so on, then languages like TypeScript and CoffeeScript would just compile to parse trees, and the browser would take it from there. Transpiled languages wouldn't need to do code generation, and source maps would no longer be needed either, as the lexical information would be based on the actual source.
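
To make the proposal concrete, here is a minimal sketch of what such a file might contain and how a consumer could walk it. The node shapes (`Program`, `ExpressionStatement`, `BinaryExpression`, `Literal`) follow the ESTree convention. The evaluator `evaluateNode` is a hypothetical stand-in for the engine, and as far as I know no browser exposes an entry point like this today:

```typescript
// Node shapes follow ESTree; only a tiny subset is modelled here.
type Expr =
  | { type: "Literal"; value: number | string }
  | { type: "BinaryExpression"; operator: "+" | "-" | "*"; left: Expr; right: Expr };

interface Program {
  type: "Program";
  body: { type: "ExpressionStatement"; expression: Expr }[];
}

// What a TypeScript or CoffeeScript compiler would emit instead of JS text.
// This tree corresponds to the source expression `1 + 2 * 3`.
const serialised = JSON.stringify({
  type: "Program",
  body: [{
    type: "ExpressionStatement",
    expression: {
      type: "BinaryExpression",
      operator: "+",
      left: { type: "Literal", value: 1 },
      right: {
        type: "BinaryExpression",
        operator: "*",
        left: { type: "Literal", value: 2 },
        right: { type: "Literal", value: 3 },
      },
    },
  }],
});

// Hypothetical consumer: a toy tree-walking evaluator standing in for the engine.
function evaluateNode(node: Expr): number | string {
  if (node.type === "Literal") return node.value;
  const left: any = evaluateNode(node.left);
  const right: any = evaluateNode(node.right);
  if (node.operator === "+") return left + right;
  if (node.operator === "-") return left - right;
  return left * right;
}

const parsedProgram = JSON.parse(serialised) as Program;
const results = parsedProgram.body.map((stmt) => evaluateNode(stmt.expression));
```

Note that the consumer skips lexing and parsing of JS text but still has to validate and walk the tree, so this sketch says nothing about how much time would actually be saved.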

Once the basic support was established, the standard could incrementally evolve to meet WASM in the middle, by basically just adding new node types. There are simple things to start with, like explicit add and concat nodes, or maybe adding new data types, like DEC64.

As WASM builds up to supporting scripting languages, by adding things like GC, AST execution would move downwards, extending JavaScript semantics to include features from other high level languages, so a broader set of scripting languages could compile to a kind of abstract JavaScript.

rossberg commented 7 years ago

On 25 May 2017 at 02:57, Carl Smith notifications@github.com wrote:

There are some issues that would need addressing, but it would be relatively simple for JavaScript engines to expose their internal support for executing JavaScript ASTs, and the ASTs they accept should be standardised (even if the AST is immediately converted to non-standard, intermediate formats internally).

Only for a much broader definition of "relatively simple" than you probably have in mind... ;)

carlsmith commented 7 years ago

Relative to WASM, it's simple.

ivanherczeg commented 7 years ago

@bguiz For example:

Google's V8 engine already compiles JavaScript directly to native machine code, compiling the whole task before executing it.

So an alternative WASM pipeline on the client side would be totally unnecessary.

On the other hand, WASM was first presented with a Mandelbrot demo and then with the Unity "Tanks" demo, but I doubt very much that drawing pixels with ASM->CPU (even with SSE double precision) could ever be faster than WebGL->GPU, because, as this community says, the GPU is not in scope. So what?

SephReed commented 6 years ago

@ivanherczeg Woah! Where does this community say GPU is not in spec?

ivanherczeg commented 6 years ago

@SephReed

We already have tensions due to bikeshed differences between arm and x86. I think that adding another set of hardware targets would create more tension: more operations would either have to be slow due to emulation costs to get uniform semantics on all targets, or more operations would have to have undefined behavior to allow everyone to run fast. I think that makes it unprofitable to consider the GPU at this time (or ever).

-Fil

https://github.com/WebAssembly/design/issues/273#issuecomment-123094583

nirus commented 6 years ago

The C# runtime was ported to wasm as a fully functional prototype that replaces JS completely. This means that in the future you can expect runtimes to emerge that replace JS in browsers, letting you write client-side web apps in Java, C# or even C++ without the aid of JavaScript, accompanied by claims such as "Code will run faster, near native" or "Compiled code is faster than a VM".

Please watch this video to see what I am trying to say.

WebAssembly was introduced to supplement JS, not to take over completely and replace the first-class language.

In the near future you can expect webpages to be delivered from the server natively compiled.

Steakeye commented 6 years ago

https://github.com/ballercat/walt

BossLevel commented 6 years ago

@Steakeye Very nice :) I shall have a play - many thanks for highlighting :)

2beers commented 6 years ago

You can compile JS to WebAssembly using NectarJS. Demo: http://nectar-lang.com/ (choose WebAssembly from the dropdown).

kripken commented 6 years ago

Interesting, the NectarJS demo uses emscripten, you can see that in the asm.js output. It appears it statically compiles JS into something - likely C or LLVM IR - and then runs that through emscripten.

The wasm output also uses emscripten (can be seen from inspecting the binary), but it seems to use an old version as it emits 0xd wasm binaries, which don't run in modern VMs. It also just sends you the wasm, not the JS, so it's not runnable anyhow. In any case, it's very possible it's just doing the same as for asm.js, just running emscripten with the flag for wasm output flipped on.

The demo has a 300 byte limit on the input, so it's hard to feed it a real-world program to get a feel for how powerful their analysis is, which is the real question with a static approach like this. In general, academic research on this topic suggests skepticism.

Simran-B commented 6 years ago

Their compiled demos for Windows simply crash for me 🤕

alexp-sssup commented 6 years ago

I agree with @kripken's skepticism here. I believe arbitrary JS cannot be reasonably converted to WebAssembly. Any tool that claims to achieve this is probably working on some tractable subset of the JS language, or giving up execution performance.

JS is an extremely dynamic language. Unpredictable run-time operations can significantly and globally change the semantics of code. This means that an Ahead-Of-Time (or offline) compiler can only assume the worst and generate very inefficient generic code that can handle all the possible cases. For example, take the following JS code:

var a = {prop1: 1};
func(a);

could be converted (in pseudo-wasm) to this

i32.const 42
call $CreateJSValFromStrTable ;; Returns prop1
i32.const 1
call $CreateJSValFromInt
call $CreateJSObj1 ;; Consume a JS string and a JS value to make an object
call $_func

Now, this is a far cry from what we can reasonably call "compiling"; it is more like unrolling an interpreter. It is of course also possible to run a JS interpreter compiled to Wasm, but that would hardly be a performance win.

JS engines such as V8 and SpiderMonkey can run JS code as fast as they do by compiling it Just-In-Time. By doing JIT compilation they can observe the real intended semantics of a given piece of JS and generate fast code for that specific case, while of course being careful to detect any change in the environment that could invalidate the current assumptions.
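
A toy model of that idea, for illustration only. The names (`makeSpeculativeAdd`, `genericAdd`, `stats`) are hypothetical, and this is not how V8 or SpiderMonkey are actually implemented. It only shows the pattern: observe the types at a call site, install a specialised fast path behind a guard, and fall back to the generic path when the assumption breaks.

```typescript
// Slow path: handles every case that JS "+" allows.
function genericAdd(a: unknown, b: unknown): unknown {
  return (a as any) + (b as any);
}

function makeSpeculativeAdd(warmupCalls: number) {
  const stats = { calls: 0, onlyNumbersSeen: true, fastPathHits: 0, deopts: 0 };
  let specialised = false;

  function add(a: unknown, b: unknown): unknown {
    if (specialised) {
      // Guard: the fast path is only valid while the observed assumption holds.
      if (typeof a === "number" && typeof b === "number") {
        stats.fastPathHits++;
        return a + b; // stands in for a single machine add instruction
      }
      // Assumption invalidated: discard the specialised code.
      specialised = false;
      stats.onlyNumbersSeen = false;
      stats.deopts++;
      return genericAdd(a, b);
    }
    // Profiling phase: record what this call site actually sees.
    stats.calls++;
    if (typeof a !== "number" || typeof b !== "number") stats.onlyNumbersSeen = false;
    if (stats.calls >= warmupCalls && stats.onlyNumbersSeen) specialised = true;
    return genericAdd(a, b);
  }

  return { add, stats };
}

// Usage: three profiled calls, two fast-path calls, then a call that breaks the guard.
const callSite = makeSpeculativeAdd(3);
for (let i = 0; i < 5; i++) callSite.add(i, 1);
const mixedResult = callSite.add("a", 1);
```

An offline compiler has no equivalent of the profiling phase, so under this model it could only ever emit `genericAdd`.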

Simran-B commented 6 years ago

Agreed. I wouldn't mind using a JavaScript subset however. Or maybe a typed variant, which would probably reduce the dynamism and allow for more efficient code to be generated.

Is there any news on the "strong mode" front BTW?

rossberg commented 6 years ago

@Simran-B, we have long abandoned strong mode, for the reasons summarised here. The takeaway is that it is pretty much impossible to tighten JavaScript semantics without losing interop with existing code.

For the same reason I also don't have much hope for the idea of designing a "statically compilable" dialect of JavaScript or TypeScript -- it would be a different language that can't run existing code, so not much point.

kripken commented 6 years ago

@Simran-B: "I wouldn't mind using a JavaScript subset however. Or maybe a typed variant"

There is some very interesting work in that space, like AssemblyScript which is a strict subset of TypeScript that compiles to WebAssembly, https://github.com/AssemblyScript/assemblyscript

@rossberg : "I also don't have much hope for the idea of designing a "statically compilable" dialect of JavaScript or TypeScript -- it would be a different language that can't run existing code, so not much point."

I think the big potential with things like AssemblyScript is not about running existing code (I agree with you there, that won't be feasible in general), but that having a friendly and familiar language is a huge deal.

Right now if you are a TypeScript developer and you want to write WebAssembly then you need to use C++ or Rust. Both are good languages but also have downsides. For someone with that background, something like AssemblyScript could be the fastest path to productivity.

SephReed commented 6 years ago

If AssemblyScript can compile to both JavaScript and WebAssembly, that would be pretty ideal. Looking forward to these updates.

Also, in the future, unless someone does it first, I'll probably try writing a TypeScript -> AssemblyScript converter that goes through the files, asks the questions it needs to ask, and then makes the conversion. Hopefully it works out!

Pauan commented 6 years ago

@SephReed Yes it can compile to JavaScript, because there is a WebAssembly -> asm.js compiler, which should work with all WebAssembly code.

See also the "Can WebAssembly be polyfilled?" section of the FAQ.

If you instead meant "is it possible for AssemblyScript to compile to idiomatic JavaScript code?", then I have to ask, why would you want to do that when WebAssembly / asm.js are so much faster than idiomatic JavaScript code?

Though I suppose you should be able to run the AssemblyScript code through the TypeScript compiler. However you will need to keep certain things in mind.

See also this section of the AssemblyScript documentation.

qm3ster commented 6 years ago

Gentlemen, please consider WALT, the JavaScript-like WebAssembly language.

jerrygreen commented 4 years ago

UPD. Sorry for necroposting

I see a lot of people consider this "JS -> WASM" compiler a good idea.

For those who don't find it useful, like:

I'm not sure it'll be that useful from a developer's perspective, though. You may get some size reduction, but that's about it. From a browser's perspective it may be interesting to have the JS engine implemented in wasm from a pure security perspective.

Please, here's my concrete example of why it's important and useful, and not just a case where you "get some size reduction, but that's about it". One of the features that comes with WebAssembly is:

<=XXX «SaNdBoXeD EnViRoNmEnT» XXX=>

WebAssembly isn't just about performance. You may see a good article about plugins from Figma team.

Making a plugin system is quite challenging. You need some good way to run custom code. You need a separate environment, a safe one.

WebAssembly gives you that: a pure environment without mess like global variables. AssemblyScript makes it convenient in a way: you have almost the same TypeScript environment as your main app's environment, which is quite cool.

But here's the problem, "almost same":

Can I use JS packages from NPM within my safe environment?

No.

Well, this WALT project is a kind of AssemblyScript alternative. It's barely JS-like: it's typed JS, so it's more TS-like. You can't compile/transpile existing JS libraries with it.

Can I use TS packages from NPM within my safe environment?

No.

AssemblyScript is a TS-like language too. It may compile something written in TS if it's fully covered with types. No exceptions, and no any types. But often people have configs that are not strict enough, or they have a few @ts-ignore comments, or, even more often, they write the package in JS and provide separate types in .d.ts files. In all these cases you won't be able to compile such a package to WASM.

asilvas commented 4 years ago

@JerryGreen good points, but on the performance side of things, I actually believe it's a huge misconception that there aren't significant performance benefits beyond saving a few bytes. Folks, including benchmarks, are so obsessed with runtime performance. See how fast it runs 3D games?

Yet the real-world opportunity is actually in startup performance, which benefits virtually all websites. Few seem to talk about how WebAssembly is substantially faster in startup time (per byte), far beyond any runtime benefits. This is why for instance gzip on textual content, such as JavaScript, has little real-world impact on PLT -- it's the size of the compiled code that matters.

Ironically, the industry is obsessed with PLT (Page Load Time) and various visual-complete markers, yet no one sees the correlation between WebAssembly and these vectors? JavaScript is responsible for over 30% of the time spent prior to these critical events on most websites. In fact, page size and bandwidth have far less impact on PLT than linear performance factors, namely JavaScript startup times and latency.

With that said, it isn't clear to me the feasibility of JS -> WebAssembly.

MaxGraey commented 4 years ago

@JerryGreen Figma's approach is a very specific case, and I guess for most projects iframes or realms are quite enough for third-party JavaScript isolation. For special cases where isolation should be more controlled and performance, size and load time are not so important, you could always compile QuickJS or JavaScriptCore to WebAssembly.

j-f1 commented 4 years ago

You could also use Web Workers, and run code before your untrusted code that deletes any APIs you don’t want the untrusted code to have access to. No need for WASM in this case @JerryGreen!

metacritical commented 4 years ago

Frame rate drops in three.js are a real thing. I am not sure if wasm could help, but it sure seems so, at least on the surface.

chpio commented 4 years ago

There is no reason to compile JS to wasm because you would have to also include a whole javascript vm. The resulting code would be huge and slower than the JS VM natively provided.

Couldn't we do all the monomorphisation etc that are done by JS VMs through Profile-Guided Optimization? We would pretty much just do the same thing as the JS VMs do at runtime, but ahead-of-time.

A PGO build consists of two passes: a first pass to build instrumented binaries, then a second pass to re-build optimized binaries using profile information gleaned from running the instrumented binaries.

The first run would provide us with all the type info (which functions get called with which typed-arguments etc), then we build an optimized binary with all variants a function is called with (+ generic one with dynamic args for non profiled code). We wouldn't need the whole JS VM.
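
One way to picture the two passes, as a hedged sketch in TypeScript. `instrument`, `buildOptimised` and the profile format are invented for this example; a real pipeline would emit specialised Wasm functions rather than closures, and would need far richer type information than `typeof` provides.

```typescript
// Maps a type signature such as "number,number" to how often it was observed.
type Profile = Map<string, number>;

// Pass 1: the instrumented build records which argument types each call sees.
function instrument<A extends unknown[], R>(fn: (...args: A) => R, profile: Profile) {
  return (...args: A): R => {
    const signature = args.map((arg) => typeof arg).join(",");
    profile.set(signature, (profile.get(signature) ?? 0) + 1);
    return fn(...args);
  };
}

// Pass 2: keep one specialised variant per profiled signature, plus a generic fallback.
function buildOptimised(
  profile: Profile,
  variants: Record<string, (a: any, b: any) => unknown>,
  generic: (a: any, b: any) => unknown,
) {
  const compiled = new Map<string, (a: any, b: any) => unknown>();
  profile.forEach((_hits, signature) => {
    const variant = variants[signature];
    if (variant) compiled.set(signature, variant);
  });
  const dispatch = (a: unknown, b: unknown): unknown => {
    const variant = compiled.get(`${typeof a},${typeof b}`);
    return variant ? variant(a, b) : generic(a, b); // unprofiled types take the slow path
  };
  return { dispatch, compiledSignatures: Array.from(compiled.keys()) };
}

// Usage: the training run only ever adds numbers.
const typeProfile: Profile = new Map();
const trainingAdd = instrument((a: any, b: any) => a + b, typeProfile);
trainingAdd(1, 2);
trainingAdd(3, 4);

const optimised = buildOptimised(
  typeProfile,
  {
    "number,number": (a: number, b: number) => a + b,
    "string,string": (a: string, b: string) => a.concat(b),
  },
  (a, b) => a + b,
);
const profiledCall = optimised.dispatch(1, 2);       // takes the specialised variant
const unprofiledCall = optimised.dispatch("a", "b"); // never profiled, so generic fallback
```

In this sketch the string variant is never compiled in because the training run never exercised it, which is why the quality of the result depends so heavily on how representative the profiling run is.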

MaxGraey commented 4 years ago

PGO requires great test coverage of your program. That's not always possible. But you could trace some type information during execution in V8. See this doc: https://docs.google.com/document/d/1JY7pUCAk8gegyi6UkIdln6j_AeJqQucZg92advaMJY4/edit#heading=h.xgjl2srtytjt

Nashorn commented 4 years ago

We have spoken with the TypeScript team about this possibility and they have shown interest, but it seems like progress there is currently gated on adding typed objects into JS.

Don't need types

Zireael07 commented 4 years ago

Can QuickJS really be compiled to WASM?

MaxGraey commented 4 years ago

Yes, Figma uses QuickJS for their plugin system, for example.

binji commented 4 years ago

And it's used in http://numcalc.com/ too.

HarikrishnanBalagopal commented 1 year ago

Anyone familiar with this project that compiles SpiderMonkey to WASM? https://github.com/bytecodealliance/spidermonkey-wasm-rs https://wapm.io/mozilla/spidermonkey