BabylonJS / Babylon.js

Babylon.js is a powerful, beautiful, simple, and open game and rendering engine packed into a friendly JavaScript framework.
http://www.babylonjs.com
Apache License 2.0

Port to asm.js/WebAssembly: is it worth it? #3248

Closed innerground closed 2 years ago

innerground commented 6 years ago

Hello, I am currently doing some experiments with asm.js/WebAssembly and I have to say that the performance potential is great! As I love Babylon.js (thanks, guys!), I was thinking about porting it to asm.js/WebAssembly (Unity actually generates low-level asm.js code for optimization). I am working on a viewer to display and interact with quite big models (via glTF/GLB). Anyway, my question is: do you think it is worth the work? I am pretty sure that some of you have thought about the same thing, and I am curious to know what you think. Looking forward to reading your messages. Regards,

sebavan commented 6 years ago

What we had in mind was to convert only our hot path to wasm: basically the Matrix and Vector math, plus a resource store to ensure they are all stored in the same buffer.

innerground commented 6 years ago

Yes, math should be the first thing to convert. Anyway, it is also possible to convert the whole library to C++ and then convert it back to JS using Emscripten, but I am not sure about the performance increase...

innerground commented 6 years ago

The main problem I am facing is that the engine is very slow when you have thousands of nodes (and meshes).

sebavan commented 6 years ago

The thing is, with thousands of nodes and meshes wasm will still not be sufficient, because after that the bottleneck would be binding the info to WebGL.

innerground commented 6 years ago

I have already worked on some workarounds for that. If we can gain some performance, that is good anyway. I am just wondering if I should give it a go.

sebavan commented 6 years ago

I was planning to do it soon :-) but not before January, I guess.

innerground commented 6 years ago

If you need help, no probs, I am experienced with C/C++/JS and others.

sebavan commented 6 years ago

Give it a try if you wish; I would be able to focus on other issues instead :-)

innerground commented 6 years ago

@sebavan, as Babylon exists in TypeScript, I think it is going to be easy to port. There is a TS-to-Haxe converter, then Haxe to C++. Do you think we can give it a go?

sebavan commented 6 years ago

You can, for test purposes; this might help with bootstrapping. But if it proves to work, we'll need to carefully craft a blazing-fast math lib :-)

innerground commented 6 years ago

I know, I know. I will create a repo for that this weekend. I will keep you updated.

innerground commented 6 years ago

@sebavan 3MB of JS to convert by hand to C/C++ is a pain :D

sebavan commented 6 years ago

Could you not convert only the Math part? (It might be easier.)

vujadin commented 6 years ago

https://github.com/samdauwe/BabylonCpp

vujadin commented 6 years ago

https://github.com/vujadin/BabylonHx

nbouayad commented 6 years ago

@sebavan Which parts do you want converted first?

sebavanmicrosoft commented 6 years ago

I was thinking of Babylon.math.ts: the Matrix, Vector and Quaternion classes. This is where our CPU time goes on large scenes (deep and wide hierarchies of nodes).

Having one large, fixed-size native array and treating every Matrix/Vector... as a pointer into that array would avoid the expensive round trips to and from the wasm context, and still allow all contributions in TS/JavaScript.

We would like to collect data on this part to evaluate the gain before going further.

Does that sound reasonable?
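The "pointers into one fixed-size native array" idea from this comment can be sketched in plain JavaScript. This is only an illustration under stated assumptions: `VectorStore` and `Vector3View` are hypothetical names, not Babylon.js APIs, and in a wasm build the `Float32Array` would presumably be a view over the module's linear memory.

```javascript
// Hypothetical names: VectorStore and Vector3View are not Babylon.js classes.
class VectorStore {
  constructor(capacity) {
    // One fixed-size buffer holds every vector (3 floats each).
    this.data = new Float32Array(capacity * 3);
    this.count = 0;
  }

  // Reserve three floats and return a lightweight handle to them.
  allocate(x = 0, y = 0, z = 0) {
    if ((this.count + 1) * 3 > this.data.length) {
      throw new RangeError("VectorStore is full");
    }
    const view = new Vector3View(this, this.count * 3);
    this.count += 1;
    return view.set(x, y, z);
  }
}

class Vector3View {
  constructor(store, offset) {
    this.store = store;
    this.offset = offset; // index of the x component in store.data
  }

  get x() { return this.store.data[this.offset]; }
  get y() { return this.store.data[this.offset + 1]; }
  get z() { return this.store.data[this.offset + 2]; }

  set(x, y, z) {
    const d = this.store.data;
    d[this.offset] = x;
    d[this.offset + 1] = y;
    d[this.offset + 2] = z;
    return this;
  }

  // Same contract as an in-place add, but the numbers live in the shared buffer.
  addInPlace(other) {
    return this.set(this.x + other.x, this.y + other.y, this.z + other.z);
  }
}

const store = new VectorStore(1024);
const a = store.allocate(1, 2, 3);
const b = store.allocate(10, 20, 30);
a.addInPlace(b);
console.log(a.x, a.y, a.z);
```

The user-facing object stays a normal JS object, so contributions can remain in TS/JavaScript, while a compiled kernel would only need the buffer and the offsets.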

nbouayad commented 6 years ago

Right, let's try to put some benchmarks in place first.

jbousquie commented 6 years ago

May I leave my feedback here about asm/wasm regarding BJS, as I have done some studies and experiments?

First of all, is asm/wasm worth it? Always!

It's really much faster than JS, especially when you have to deal with huge amounts of data. You'll hardly see the difference between JS and asm/wasm on a single math computation, but when you get to tens of thousands per frame, it really makes a difference.

That said, we have to know how asm/wasm works in the browser. Asm/wasm has no direct access to the DOM. This means some JS code is still required to manage the UI events (user interactions) and access to the WebGL layer. This part of the JS code is mandatory. It immediately implies that BJS can't be translated wholesale to C/C++ and then to asm/wasm. Some parts of the code have to stay in JS to manage/orchestrate the I/O (user interaction, final rendering).

Another issue: BJS is a JS framework. This means that the end user will code his game/scene logic in JS. If the main part of BJS is translated to asm/wasm, then every method/object of the BJS API must be exposed/bound from asm/wasm, which is not a simple task. It is simple when you expose only dozens of asm/wasm functions, quite complex when it comes to thousands. Why? Because of the way data is passed from JS to asm/wasm and back: there's no object or type compatibility between these two contexts. Asm/wasm doesn't know any JS types or objects. Everything must be passed through a single shared memory heap, and the atomic shared element is ... the byte. For instance, if JS must pass integers (indices), floats (positions, quaternions, etc.), strings (names) or higher-level objects, then everything has to be converted to bytes and put in the same memory heap to be exchanged. Yes, arrays of integers, floats, UTF-8 encoded characters, etc. in the same buffer! I'll let you imagine how much fun it is to play with byte offsets to know where you are and what you are dealing with. Note also that this buffer is statically allocated when your asm/wasm code starts. So what size? 2 MB? 64 MB? No idea, unless you know what will be managed in your scene: 1 mesh, 2000 meshes? Physics or not?
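To make the byte-offset bookkeeping concrete, here is a minimal sketch that packs an integer, three floats and a UTF-8 string into one fixed-size buffer and reads them back. The record layout, the 64 KB size and the function names `writeMesh` / `readMesh` are assumptions made for illustration only; nothing here is taken from BJS.

```javascript
// Hypothetical record layout, chosen only for this illustration:
//   bytes  0..3  : uint32   mesh index
//   bytes  4..15 : 3 x f32  position
//   bytes 16..19 : uint32   byte length of the name
//   bytes 20..   : UTF-8 bytes of the name
const heap = new ArrayBuffer(64 * 1024); // size has to be decided up front
const view = new DataView(heap);
const bytes = new Uint8Array(heap);
const LITTLE_ENDIAN = true; // wasm linear memory is little-endian

function writeMesh(byteOffset, index, position, name) {
  view.setUint32(byteOffset, index, LITTLE_ENDIAN);
  for (let i = 0; i < 3; i++) {
    view.setFloat32(byteOffset + 4 + 4 * i, position[i], LITTLE_ENDIAN);
  }
  const encoded = new TextEncoder().encode(name);
  view.setUint32(byteOffset + 16, encoded.length, LITTLE_ENDIAN);
  bytes.set(encoded, byteOffset + 20);
  return byteOffset + 20 + encoded.length; // first free byte after the record
}

function readMesh(byteOffset) {
  const index = view.getUint32(byteOffset, LITTLE_ENDIAN);
  const position = [0, 1, 2].map((i) =>
    view.getFloat32(byteOffset + 4 + 4 * i, LITTLE_ENDIAN)
  );
  const nameLength = view.getUint32(byteOffset + 16, LITTLE_ENDIAN);
  const nameBytes = bytes.subarray(byteOffset + 20, byteOffset + 20 + nameLength);
  return { index, position, name: new TextDecoder().decode(nameBytes) };
}

// "cubé" is 4 characters but 5 bytes in UTF-8, so lengths must be tracked in bytes.
const end = writeMesh(0, 7, [1.5, -2, 0.25], "cubé");
const mesh = readMesh(0);
console.log(mesh, end);
```

Every additional type or object shape needs its own agreed layout on both sides of the boundary, which is the cost the comment describes.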

Now, let's imagine we have managed to implement such a generic way to run asm/wasm, with the right heap size, all the methods exposed to JS, and the JS orchestration part properly decoupled. OK? The user logic will still be written in JS, won't it? Yet this is often where the bottleneck is ... not in the framework calls, actually.

Imho, the best way to get a real gain with asm/wasm would be this:

So the user would code in, say, TS (when it becomes compilable to wasm), Haxe or whatever, in the same language as the framework. Both (user logic + framework) would be compiled into the same final bundle: no painful communication between the user logic and the framework; only the pre-set exchange channels would then be used between the compiled code and the JS orchestration code (browser events + WebGL rendering).

I guess that's the way Unity exports the code to wasm.

Another approach would be to port only some parts of the current BJS code, such as some math computations. Well, for a single computation (say a single quaternion computation, or a matrix transformation applied to a Vector3) the gain from JS to wasm is really poor. What is then worth it is to migrate the iterations (the loop) to the wasm side. Say you have 1000 quaternions and 1000 rotation matrices to compute each frame: there's only a small gain in calling the wasm version of this computation 1000 times (not negligible, but not that significant compared to the CPU bottleneck, which is usually in the user logic). There's a better gain in running both computations 1000 times directly inside the wasm code. The complexity then comes from migrating all the loops/iterations to the wasm side, knowing that we will still have to iterate at least once in JS to copy all the data into the memory heap and pass it to the wasm code.

Not that simple in every case ...
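The difference between calling a kernel once per item and moving the loop into the kernel can be sketched as follows. Plain JavaScript functions stand in for the wasm exports here, so this shows only the shape of the two APIs and the one-time copy into the heap, not any real speed-up; all names are hypothetical.

```javascript
// Stand-ins for compiled code: in a real setup normalizeOne and normalizeMany
// would be wasm exports working on the module's linear memory.
const COUNT = 1000;
const heap = new Float32Array(COUNT * 4); // x, y, z, w per quaternion

// Normalize the quaternion stored at float index `offset`.
function normalizeOne(offset) {
  const x = heap[offset];
  const y = heap[offset + 1];
  const z = heap[offset + 2];
  const w = heap[offset + 3];
  const length = Math.hypot(x, y, z, w);
  if (length === 0) return;
  heap[offset] = x / length;
  heap[offset + 1] = y / length;
  heap[offset + 2] = z / length;
  heap[offset + 3] = w / length;
}

// Normalize `count` quaternions in one call: the loop lives on the kernel side.
function normalizeMany(count) {
  for (let i = 0; i < count; i++) normalizeOne(i * 4);
}

// JS side: one unavoidable pass to copy the data into the heap.
for (let i = 0; i < COUNT; i++) heap.set([i + 1, 0, 0, i + 1], i * 4);

// Variant A would cross the JS/kernel boundary COUNT times per frame:
//   for (let i = 0; i < COUNT; i++) normalizeOne(i * 4);
// Variant B crosses it once:
normalizeMany(COUNT);
console.log(heap[0], heap[3]);
```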

So there's always this duality:

jbousquie commented 6 years ago

Anyway, I think any attempt in this direction is worth it.

Simply because the JS JIT compilers can't get much faster now, and neither can CPUs. So the only ways to get faster in the browser are now wasm compilation and concurrency (workers).

sebavanmicrosoft commented 6 years ago

Yup, no worries, I will create an ugly scene full of cubes, all parented in a deep hierarchy, with a cheap shader, to really isolate and measure the CPU impact.

I am really curious about the gain. My first experiments with vector math in wasm, sharing the buffer and only indexing into it, were actually not too bad.

jbousquie commented 6 years ago

I did an asm test (not published for now) on a turboSPS: 40K solid particles moving and rotating, meaning a quaternion and a rotation(-like) matrix being computed for each one. The asm code was written by hand. There's a substantial gain compared to the full JS version, but not as much as with the worker version of the same test: computation distributed among 4 workers in full JS, each dealing with only 10K particles, simultaneously.

What I would like to achieve is to move the 40K iterations into the asm code instead of calling it 40K times, because I did some very basic big-loop tests and they are always faster in asm: a 500K-iteration loop, just assigning the result of a simple float addition to each element of a 500K-element array. It's twice as fast in asm as in full JS.

Note: I used asm instead of wasm because it can easily be written by hand (no need for C, then an intermediate compilation) and performance in FF is for now quite comparable to wasm.
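A hand-written asm.js-style loop of the kind described above might look like the following sketch. The module name `FillModule` and its `fill` function are hypothetical, not the unpublished test code; engines that do not validate or optimize asm.js simply run it as ordinary JavaScript, so the result is the same either way.

```javascript
// Hypothetical module: stores a + b into the first n float32 slots of the heap.
function FillModule(stdlib, foreign, heap) {
  "use asm";
  var f32 = new stdlib.Float32Array(heap);
  var fround = stdlib.Math.fround;

  function fill(n, a, b) {
    n = n | 0;      // asm.js type annotation: int
    a = fround(a);  // asm.js type annotation: float
    b = fround(b);
    var i = 0;
    for (i = 0; (i | 0) < (n | 0); i = (i + 1) | 0) {
      // (i << 2) >> 2 is the asm.js idiom for indexing 4-byte elements.
      f32[(i << 2) >> 2] = fround(a + b);
    }
  }

  return { fill: fill };
}

const N = 500000;
// asm.js heaps need specific sizes; 2^21 bytes holds 524288 floats, enough for N.
const heap = new ArrayBuffer(1 << 21);
const asm = FillModule(globalThis, {}, heap);

// One call from JS; all 500K iterations run inside the module.
asm.fill(N, 1.5, 2.25);
const out = new Float32Array(heap, 0, N);
console.log(out[0], out[N - 1]);
```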

But wasm remains the way to go, imho.

jbousquie commented 6 years ago

BTW, the AssemblyScript / Next project looks promising, but I'm afraid that the lack of contributors will keep it from maturing: https://github.com/AssemblyScript/assemblyscript https://github.com/AssemblyScript/next

Current discussion: https://github.com/AssemblyScript/next/issues/1

tl;dr: a language as close as possible to TS, and a compiler emitting wasm bytecode directly.

innerground commented 6 years ago

Right, I've been playing around. The TS-to-Haxe option is not what we want at all; it is painful and not reliable. Now we have two options: a 1-to-1 translation of the current code, or a smarter port of the code (e.g. Color3, Color4 and Vector2, Vector3 could be templated). That really depends on what we can consider stable code vs. evolving code. Concerning numeric precision, I was thinking that float is enough, but maybe you can comment on that (number vs. native types). My proposal is therefore to differentiate what is "static/mature" from what can evolve, so we can have a good software life-cycle plan.

sebavan commented 6 years ago

The math tools have really low code churn in BJS and are pretty stable, so templating or not is up to you. Color is almost never used in CPU-side math operations.

float vs. double is an interesting one :-) It might be cool to at least alias the type and test the perf of both?

nbouayad commented 6 years ago

From my experience (as a GIS engineer), double is a pain. As an obvious example, what do you consider to be zero? Generally speaking, zero is 10e-6... but that is a point of view.
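Both points, aliasing the storage type and deciding what counts as "zero", can be illustrated with a short sketch. `RealArray`, `EPSILON` and `nearlyEqual` are hypothetical names introduced here, and the tolerance simply reuses the 10e-6 figure quoted above.

```javascript
// Hypothetical alias: swap Float32Array for Float64Array in one place
// to compare precision (and, in a benchmark, speed).
const RealArray = Float32Array;

const values = new RealArray(1);
values[0] = 0.1;

// With float32 storage, 0.1 is rounded and no longer equals the double 0.1.
const storedExactly = values[0] === 0.1;

// Hypothetical tolerance, taken from the figure quoted in the comment above.
const EPSILON = 10e-6;

function nearlyEqual(a, b, epsilon = EPSILON) {
  return Math.abs(a - b) <= epsilon;
}

console.log(storedExactly);                // false with Float32Array
console.log(nearlyEqual(values[0], 0.1));  // true within the tolerance
```

With `Float64Array` as the alias the exact comparison would succeed, but accumulated rounding (for example `0.1 + 0.2` versus `0.3`) still calls for a tolerance-based comparison.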

innerground commented 6 years ago

@sebavan In order to limit the problems with garbage collection, please tell me exactly which functions you want converted. Thanks.

sebavan commented 6 years ago

In the code itself we should not allocate, but reuse most of the matrices and vectors. Doing them all could be a good way to validate it :-) Also, as those classes would become offsets into the native array, it may be tough to do only a subset?

innerground commented 6 years ago

@sebavan I do not understand what you mean. Do you mean in-place transformations (so no allocation)? I do not know the math part of Babylon deeply, so maybe you can give me some hints.

sebavan commented 6 years ago

Yup, in Babylon all the render-loop transformations happen "in place" to prevent excessive garbage collection.
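The difference between allocating and "in place" operations can be sketched with a toy vector class. `Vec3` is a hypothetical stand-in written for this example, not the Babylon.js class, although the `addInPlace` / `addToRef` naming follows the methods mentioned elsewhere in this thread.

```javascript
// Hypothetical minimal vector class, for illustration only.
class Vec3 {
  constructor(x = 0, y = 0, z = 0) {
    this.x = x;
    this.y = y;
    this.z = z;
  }

  // Allocating version: creates a new object on every call.
  add(other) {
    return new Vec3(this.x + other.x, this.y + other.y, this.z + other.z);
  }

  // In-place version: mutates the receiver and allocates nothing.
  addInPlace(other) {
    this.x += other.x;
    this.y += other.y;
    this.z += other.z;
    return this;
  }

  // "ToRef" version: writes into a caller-supplied result object.
  addToRef(other, result) {
    result.x = this.x + other.x;
    result.y = this.y + other.y;
    result.z = this.z + other.z;
    return result;
  }
}

const position = new Vec3(0, 0, 0);
const velocity = new Vec3(1, 2, 3);
const scratch = new Vec3(); // allocated once, reused on every frame

for (let frame = 0; frame < 3; frame++) {
  position.addInPlace(velocity);        // updates position, no garbage
  position.addToRef(velocity, scratch); // predicted next position, no garbage
}
console.log(position, scratch);
```

Calling `add` inside the loop instead would create one short-lived object per call, which is the garbage-collection pressure the render loop tries to avoid.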

innerground commented 6 years ago

@sebavan Check: https://github.com/innerground/Babylon.js.asm I started to transpose some math calculations. Now, I am not sure whether we should pass a Vector3 as a float array or as 3 values... I am going to gradually add all the math processing there. Tell me what you think.

wpdildine commented 6 years ago

This is really cool. I'm going to do a pull and take a look when I get a moment.

veikkoeeva commented 6 years ago

A note to the discussion, https://github.com/aspnet/Blazor is coming along and has a VSIX template for VS (and dotnet new) currently. I do realize .NET Standard 2.0 isn't for everyone and that it's currently more for Windows people (due to VS VSIX template, I'm not sure about VS Code).

deltakosh commented 6 years ago

@jbousquie What about trying to compile the math to wasm with https://github.com/AssemblyScript/assemblyscript ? I would like to play with a real-world example :)

Kjue commented 6 years ago

Just noticed this. Any relation to participants?

GLMW – WebAssembly Powered Matrix and Vector Library - https://maierfelix.github.io/glmw/

Nodragem commented 6 years ago

So ... is it a thing? Will BabylonJS 3.3 have some parts using WebAssembly?

RaananW commented 6 years ago

Still being investigated. There are a lot of factors to consider and we need to see what the performance gain is.

pierreglibert commented 6 years ago

Hi community, I found a very good demo for webassembly : http://aws-website-webassemblyskeletalanimation-ffaza.s3-website-us-east-1.amazonaws.com/

Benchmark for 50 animations : js : 10 fps. wasm : 52 fps.

Benchmark for 100 animations : js : 6 fps. wasm : 26 fps.

I note it here for your documentation :)

fmmoret commented 6 years ago

So cool!

fmmoret commented 6 years ago

https://github.com/sessamekesh/wasm-3d-animation-demo if you're looking for it

vtange commented 6 years ago

@deltakosh I went ahead and took a stab at translating Babylon.Math to AssemblyScript. The hope was to get the whole Math class into wasm, load it and replace the otherwise JS-based Babylon.Math with the wasm version.

And then I hit some roadblocks.

  1. in the Math file there is an interface ISize. ISize is not only used here but is exported and referenced in other places in BJS. I don't think AssemblyScript supports interfaces yet.
  2. the Viewport class contains references to Engine. How should I tackle this since I don't see a way for AssemblyScript to reference Javascript and this is before it's compiled to wasm. Converting Engine to AssemblyScript doesn't sound like a great idea. This was imo a showstopper and I can't move forward with the original plan here.

All said, there were many methods in Math that are computationally simple and don't really seem to need a wasm translation. I think a better approach would be to build a helper class or a list of wasm functions that just help with the bigger, more expensive math functions seen in Math, provided they don't reference other Babylonjs components. I'll take a look at this approach if I find time.

Anyways, my commits are at https://github.com/vtange/Babylon.js/tree/math2wasm for those who want to look.

jbousquie commented 6 years ago

Really nice work.

What language did you use to emit the final WASM? C? C++? Rust?

I share your view that translating computationally simple functions from JS to WASM is probably not worth it, given the expected gain relative to the code complexity.

I'm currently starting some tests on parts that require dozens of calculations per iteration in huge loops (e.g. tens of thousands of solid particles to be moved and rotated). I'll emit WASM from AssemblyScript in order to lower the translation cost (TS to better-typed TS).

deltakosh commented 6 years ago

Can't agree more. We should define a list like collisions or solid particle system

fmmoret commented 5 years ago

https://hacks.mozilla.org/2018/10/calls-between-javascript-and-webassembly-are-finally-fast-🎉/

[X] spidermonkey

Just need chrome & edge to follow suit and we could swap out even kinda small parts

deltakosh commented 5 years ago

This starts to LOOK REALLY GOOD :)

jbousquie commented 5 years ago

yep... things are going the right way now :-D

pkieltyka commented 5 years ago

As well, it's a matter of time before AssemblyScript is mature enough to compile BabylonJS more easily. The more contributors and/or sponsors on that project, the better!

vtange commented 5 years ago

Long writeup coming through. Sorry if it seemed like I dropped off. I'm pretty busy with my own BJS project :)

@jbousquie I'm using assemblyscript itself to make the wasm. It's math only, and I'm focusing on the functions that I can figure out are called often/generate lotsa garbage.

So I dropped what I did last time and went with a WASM "scratchpad" approach. Basically the idea was to compile a list of methods in WASM that do all the math, and call those methods from JS. WASM will take in a batch of arguments, do the math and store the result in memory shared between WASM and JS. Since JS is basically single-threaded, it's technically possible to just have JS tell WASM to crunch all the numbers and then copy the answers off WASM's "scratchpad".

A lot of functions do relatively similar math. For example, I have a function that just adds 3 pairs of numbers to each other, like this:

export function add3Pairs(r: f64, r2: f64, g: f64, g2: f64, b: f64, b2: f64): void {
    store<f64>(0,r + r2);
    store<f64>(8,g + g2);
    store<f64>(16,b + b2);
}

which can be used to add stuff for Color3s and Vector3s, like so:

    Vector3.prototype.addInPlace = function (otherVector) {
        this.x += otherVector.x;
        this.y += otherVector.y;
        this.z += otherVector.z;
        return this;
    };

becomes

    Vector3.prototype.addInPlace = function (otherVector) {
        exports.add3Pairs(this.x, otherVector.x, this.y, otherVector.y, this.z, otherVector.z);
        // The sums are already in WASM memory, so assign them instead of adding again.
        this.x = readWasmMemAsF64[0];
        this.y = readWasmMemAsF64[1];
        this.z = readWasmMemAsF64[2];
        return this;
    };

Now, it's sorta too early to celebrate. I wrote some tests for this idea, running add using JS and add using WASM 10,000,000 times each, and the results are unsurprisingly disappointing 'cause we're using WASM to do simple stuff like a + b.

color3.addWasmStyle = function()
{
  exports.add3Pairs(this.r,0.0001,this.g,0.0001,this.b,0.0001);
  this.r = readWasmMemAsF64[0];
  this.g = readWasmMemAsF64[1];
  this.b = readWasmMemAsF64[2];
  return this;
}
color3.addJS = function()
{
  this.r += 0.0001;
  this.g += 0.0001;
  this.b += 0.0001;
  return this;
}

console.time("js add");
for(let i=0; i<10000000; i++)
{
  color3.addJS();
}
console.timeEnd("js add");
//-----
console.time("wa add");
for(let i=0; i<10000000; i++)
{
  color3.addWasmStyle();
}
console.timeEnd("wa add");

This is on Chrome:
js add: 62.114013671875ms
wa add: 800.100830078125ms

I don't know what the guys at Mozilla fed Firefox, but it must be really good. We need more of it! FF:
js add: 50ms
wa add: 247ms

And that's for SIMPLE a+b! From Clark's writeup we can infer that every time JS does something with numbers, it needs to wrap the answer in a "box". That means if you start doing even longer chains of math like:

export function superAdd(r: f64, r2: f64, g: f64, g2: f64, b: f64, b2: f64, a: f64, a2: f64): void {
    store<f64>(0,r + r2 + g + g2 + b + b2 + a + a2);
}

export function superMultiply(r: f64, r2: f64, g: f64, g2: f64, b: f64, b2: f64, a: f64, a2: f64): void {
    store<f64>(0,r * r2 * g * g2 * b * b2 * a * a2);
}

vs

color3.superAddJS = function()
{
  this.r = this.r + 0.0001 + 1.31140 + 0.0051 + 1.311210 + 0.0701 + 1.329 + 10.12144;
  return this;
}
color3.superMultiplyJS = function()
{
  this.r = this.r * 0.0001 * 1.31140 * 0.0051 * 1.311210 * 0.0701 * 1.329 * 10.12144;
  return this;
}

The numbers start getting closer. Chrome:
js supermultiply: 208.1201171875ms
wa supermultiply: 806.9228515625ms
js superadd: 140.32470703125ms
wa superadd: 822.147216796875ms

Now Firefox is just showing off. FF:
js supermultiply: 204ms
wa supermultiply: 265ms
js superadd: 133ms
wa superadd: 256ms
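For anyone who wants to look at the JS-to-WASM call overhead without an AssemblyScript toolchain, here is a minimal sketch. It is not the code used for the measurements above: the module is a hand-assembled binary exporting a single `add(f64, f64)` function, it returns the sum directly instead of writing to a scratchpad, and the `time` helper and iteration count are choices made for this example.

```javascript
// Hand-assembled WebAssembly module exporting add(a: f64, b: f64): f64.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,       // "\0asm", version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7c, 0x7c, 0x01, 0x7c, // type 0: (f64, f64) -> f64
  0x03, 0x02, 0x01, 0x00,                               // one function, of type 0
  0x07, 0x07, 0x01, 0x03, 0x61, 0x64, 0x64, 0x00, 0x00, // export "add" = function 0
  0x0a, 0x09, 0x01, 0x07, 0x00,                         // code section, one body, no locals
  0x20, 0x00, 0x20, 0x01, 0xa0, 0x0b,                   // local.get 0, local.get 1, f64.add, end
]);

const instance = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes));
const wasmAdd = instance.exports.add;
const jsAdd = (a, b) => a + b;

// Call fn repeatedly, feeding the result back in so the work cannot be skipped.
function time(label, fn, iterations) {
  let acc = 0;
  const start = performance.now();
  for (let i = 0; i < iterations; i++) acc = fn(acc, 0.0001);
  const elapsed = performance.now() - start;
  console.log(`${label}: ${elapsed.toFixed(1)} ms`);
  return acc;
}

const ITERATIONS = 1000000;
const jsResult = time("js add", jsAdd, ITERATIONS);
const wasmResult = time("wa add", wasmAdd, ITERATIONS);
```

Both loops perform the same double-precision additions, so the results match and any timing gap comes from crossing the JS/WASM boundary on every call. The absolute numbers will depend heavily on the engine and its version.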

vtange commented 5 years ago

I also wrote a function that should be able to scan all the methods of a given class in Math and test which ones are the slowest by running each of them 10,000,000 times. This is a rough sneak peek for Color3, covering only functions that don't return a new Color3(), since I'm pretty sure BJS doesn't do that; it generates new objects once and then adds/mults/maths InPlace, right?

toString: 2111.466064453125ms
getClassName: 1081.635009765625ms
getHashCode: 1195.595947265625ms
toArray: 3359.76513671875ms
toLuminance: 1930.287841796875ms
multiplyToRef: 3183.98388671875ms
equals: 2025.455810546875ms
equalsFloats: 3914.947265625ms
scaleToRef: 8653.037109375ms
scaleAndAddToRef: 9073.156982421875ms
clampToRef: 5326.75830078125ms
addToRef: 3619.27490234375ms
subtractToRef: 3544.868896484375ms
copyFrom: 2292.550048828125ms
copyFromFloats: 4110.578857421875ms
set: 4111.8271484375ms
toHexString: 17046.4169921875ms
toLinearSpaceToRef: 6487.251953125ms
toGammaSpaceToRef: 7825.93798828125ms

I'll need to write a custom one for each class so it'll take some time to do properly.
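A method scanner of this kind could look roughly like the sketch below. It is not the function used for the numbers above: `profileMethods`, the per-method argument table and the `ToyColor` class are all hypothetical, and the argument table is exactly the part that has to be customized for each class.

```javascript
// Hypothetical helper: time every method found on an object's prototype.
// argsFor(name) returns the argument list for a method, or undefined to skip it.
function profileMethods(instance, iterations, argsFor) {
  const proto = Object.getPrototypeOf(instance);
  const timings = {};
  for (const name of Object.getOwnPropertyNames(proto)) {
    const descriptor = Object.getOwnPropertyDescriptor(proto, name);
    if (name === "constructor" || typeof descriptor.value !== "function") continue;
    const args = argsFor(name);
    if (args === undefined) continue; // no arguments known for this method
    const start = performance.now();
    for (let i = 0; i < iterations; i++) instance[name](...args);
    timings[name] = performance.now() - start;
  }
  return timings;
}

// Toy stand-in for a color class; not the Babylon.js Color3.
class ToyColor {
  constructor(r, g, b) {
    this.r = r;
    this.g = g;
    this.b = b;
  }
  toLuminance() {
    return this.r * 0.3 + this.g * 0.59 + this.b * 0.11;
  }
  scaleToRef(scale, result) {
    result.r = this.r * scale;
    result.g = this.g * scale;
    result.b = this.b * scale;
    return result;
  }
  clone() {
    return new ToyColor(this.r, this.g, this.b); // allocating, left out of the table
  }
}

const scratch = new ToyColor(0, 0, 0);
const argTable = { toLuminance: [], scaleToRef: [0.5, scratch] };
const timings = profileMethods(new ToyColor(1, 0.5, 0.25), 100000, (name) => argTable[name]);

// Print the slowest methods first.
const ranked = Object.entries(timings).sort((a, b) => b[1] - a[1]);
for (const [name, ms] of ranked) console.log(`${name}: ${ms.toFixed(3)}ms`);
```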

I have a pretty major project using babylonjs so I'm pretty vested in it becoming as fast as possible. Once we get this and AmmoJS in then the real fun begins.

jbousquie commented 5 years ago

Here is my test feedback: https://github.com/AssemblyScript/assemblyscript/issues/263#issuecomment-422698225

http://www.html5gamedevs.com/topic/38859-web-assembly/?do=findComment&comment=228634

http://www.html5gamedevs.com/topic/32817-sps-experiments/?do=findComment&comment=228635