BabylonJS / Babylon.js

Babylon.js is a powerful, beautiful, simple, and open game and rendering engine packed into a friendly JavaScript framework.
http://www.babylonjs.com
Apache License 2.0

Web workers / Offscreen canvas / async APIs #5811

Closed fmmoret closed 4 years ago

fmmoret commented 5 years ago

I want to preface this with: this feature request is outrageous and probably not going to be done, BUT some ideas here might inspire others that are reasonable in size/scope. I just want to squeeze more operations out of my CPU for a Babylon engine instance.

Feature request

I'm curious what you guys think about considering supporting use of offscreen canvases.

I think the most annoying part of trying it out is just trying to duplicate / extend the existing APIs to control Babylon's state over in the webworker. There are some nice (and small / easy to replicate) libraries out there for swapping out class methods that "remove the mental barrier of thinking about postMessage and hide the fact that you are working with workers." like https://github.com/GoogleChromeLabs/comlink

To add to this, I'm wondering if there are ways we could take this to the extreme and try "pipelining" (https://en.wikipedia.org/wiki/Instruction_pipelining) the rendering process to some degree. I know that state will be hard to manage across stages, but it'd be interesting if something comparable to the following were possible:

main thread) start some typical render loop; some actions or time-based triggers cause state to change -- tell web worker 1 that these happened
web worker 1) do some amount of work, pass state forward to web worker 2
web worker 2) do some amount of work, pass state forward to web worker 3, etc.
web worker n) do all necessary draw calls & write to the offscreen canvas

The end result hopefully being that when we originally had X amount of work we could do in any 16 ms window, we now have (X - overhead) * n amount of work we can do in the same amount of time (albeit with 0-1 frames of latency added).

I would imagine that organizing the work to be done "could" be arranged into this pipeline shape. The benefit here being that synchronous version (if web workers are not allowed) could work in nearly the same fashion as the multi-processing version.
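
For what it's worth, the pipeline shape can be sketched single-threaded, with plain functions standing in for the workers (in the real version each hand-off would be a postMessage, ideally with transferables; the stage names here are made up for illustration):

```javascript
// Compose pipeline stages into one frame function. In the worker version,
// each stage would live in its own worker and forward state to the next.
function pipeline(stages) {
  return (state) => stages.reduce((s, stage) => stage(s), state);
}

// Hypothetical stages mirroring the ones discussed in this thread.
const updateAnimations = (s) => ({ ...s, animated: true });
const updatePhysics = (s) => ({ ...s, physics: true });
const draw = (s) => ({ ...s, drawn: true });

const renderFrame = pipeline([updateAnimations, updatePhysics, draw]);
const frame = renderFrame({ t: 0 });
// frame → { t: 0, animated: true, physics: true, drawn: true }
```

The throughput win only appears once the stages actually run concurrently; this sketch just fixes the shape of the hand-offs.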

deltakosh commented 5 years ago

This is something I would love to tackle soon (probably in February). Stay tuned!! :)

Edit: I meant for the offscreen canvas. For the pipeline I'm less sure there is an obvious use so far, but I'm open to discussion :D

fmmoret commented 5 years ago

Sweet. I'm super excited to see what gets cooked up for the offscreen canvas.

The pipeline concept is hard to get across accurately, but the first minute of this video (it's talking about CPUs here, but the concept can still be applied) illustrates it well: https://www.youtube.com/watch?v=ecCt6HPlPeA

The benefit here being that with N total worker processes you multiply your framerate by up to N, in the theoretical case where all stages are exactly the same size and passing has no overhead.

In reality, the fastest framerate you can get is 1 s / (duration of the longest stage + time to pass the message). Say our longest stage does all the lighting work, takes 8 ms, and passing its data takes 2 ms -- the total framerate would be 100 fps.

The only difference is that when we're normally running at 100 fps, our frame latency is 10 ms (10 ms to generate the frame from our initial state), whereas in a pipelined system our frame latency is the sum of the durations of all stages plus the total time spent in messaging.

So if we had a rendering pipeline with just a few stages -- [ update animations (4 ms) ] -> update state with message (2 ms) -> [ update physics (4 ms) ] -> update state with message (2 ms) -> [ draw all meshes (8 ms) ] -- this would normally yield ~60 fps (4 ms + 4 ms + 8 ms) with 16 ms of latency, but pipelined it yields ~100 fps at 20 ms of latency (and uses more memory, because more workers are working with their own state).
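
The arithmetic above can be checked directly (times in ms; the stage durations are the hypothetical ones from this comment):

```javascript
// Serial vs. pipelined frame time for the example stages above.
const stages = [4, 4, 8];   // update animations, update physics, draw
const messaging = 2;        // cost of each state hand-off between stages

// Serial: one thread runs every stage back to back, no messaging.
const serialFrameTime = stages.reduce((a, b) => a + b, 0);  // 16 ms
const serialFps = 1000 / serialFrameTime;                   // 62.5 fps (~60)

// Pipelined: throughput is set by the slowest stage plus one hand-off...
const pipelinedFrameTime = Math.max(...stages) + messaging; // 10 ms
const pipelinedFps = 1000 / pipelinedFrameTime;             // 100 fps

// ...while latency is every stage plus every hand-off, summed.
const pipelinedLatency =
  serialFrameTime + (stages.length - 1) * messaging;        // 20 ms
```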

deltakosh commented 5 years ago

This is true if we manage to find a way to quickly share state between workers, and so far this is where I'm still looking for a good idea.

SharedArrayBuffers are raw array buffers, not structured class data. So the question is: how do I share the scene state (meshes, nodes, etc.) using array buffers?

We really need webthreads :)
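
One direction (a sketch, not a Babylon.js API): flatten just the mutable scalars of the scene graph into typed-array views over a SharedArrayBuffer, so both threads read and write the same memory and only the layout has to be agreed on:

```javascript
// Hypothetical layout: 3 floats (x, y, z) of position per mesh, indexed
// by a mesh id both threads agree on. The structure lives in code; only
// the numbers live in shared memory.
const FLOATS_PER_MESH = 3;
const meshCount = 2;
const sab = new SharedArrayBuffer(
  meshCount * FLOATS_PER_MESH * Float32Array.BYTES_PER_ELEMENT);
const positions = new Float32Array(sab); // same view on both threads

function setPosition(meshIndex, x, y, z) {
  positions.set([x, y, z], meshIndex * FLOATS_PER_MESH);
}

setPosition(1, 0, 2.5, -1);
// A worker holding a view over the same sab sees positions[3..5]
// change without any postMessage copy.
```

This sidesteps structured class data entirely, which is also its limitation: anything that isn't a fixed-size number (materials, hierarchy changes) still needs messaging.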

benaadams commented 5 years ago

So the question is: how do I share the scene state (meshes nodes etc) using array buffers.

Question: could everything be moved to a worker thread, with Babylon initialized with an OffscreenCanvas?

How much is dependent on the DOM thread?

Input is one aspect (mouse, keyboard, controller etc); so that would need to be messaged to the worker.

deltakosh commented 5 years ago

Input is one, but the whole scene graph needs to be controllable from the user's standpoint, so even just changing mesh.position would require messaging.

benaadams commented 5 years ago

I'm thinking of the scenario where user code is also in the worker; though do you mean interacting with the scene graph via a dom representation?

deltakosh commented 5 years ago

Nope, I was thinking more about the common case where you take a dependency on babylon.js and have your own app / code running on the main thread. In that case it would be tough to communicate with the scene graph.

But if that works in your case then it should work. We may still depend on some window objects, but this would be easily fixable (and by the way, this is how the NullEngine operates, so it should be good).

I'm just wondering how to load images in that case, as we rely on the IMG tag to do so.

benaadams commented 5 years ago

I'm just wondering how to load images in that case as we rely on IMG tag to do so

fetch to blob, then use createImageBitmap with the blob?
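
Sketched out, that worker-side load path looks like this (fetch and createImageBitmap are browser/worker APIs, so this won't run outside one; the function name is made up):

```javascript
// Fetch the image bytes, decode them off the DOM thread, and return an
// ImageBitmap that can be uploaded to a texture or transferred onward.
async function loadTextureBitmap(url) {
  const response = await fetch(url);
  const blob = await response.blob();
  return createImageBitmap(blob); // decodes in the worker
}
```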

benaadams commented 5 years ago

https://developers.google.com/web/updates/2016/03/createimagebitmap-in-chrome-50

Using createImageBitmap() in web workers

One of the nicest features of createImageBitmap() is that it’s also available in workers, meaning that you can now decode images wherever you want to. If you have a lot of images to decode that you consider non-essential you would ship their URLs to a Web Worker, which would download and decode them as time allows. It would then transfer them back to the main thread for drawing into a canvas.

Though with OffscreenCanvas you'd use them in place rather than transferring them back (or transfer them from a second web worker to the one with the canvas).

deltakosh commented 5 years ago

excellent!!

deltakosh commented 5 years ago

Will work with browser vendors to get a more powerful way to deal with real threads: https://www.w3.org/2018/12/games-workshop/

hjlld commented 5 years ago

Only a little "polyfill" is needed to make the current BABYLON.Engine and BABYLON.Scene render the default PG scene (a sphere and a ground) successfully on an OffscreenCanvas inside a worker thread.

But so far the OffscreenCanvas cannot handle any input from the original canvas, so for now it's only a good option for non-interactive scenes.

    // main.js
    var canvas = document.getElementById("renderCanvas");

    // Hand control of the canvas to the worker as an OffscreenCanvas
    var offscreen = canvas.transferControlToOffscreen();
    var worker = new Worker('worker.js');

    worker.postMessage({
        canvas: offscreen,
        width: canvas.clientWidth,
        height: canvas.clientHeight,
    }, [offscreen]);
    // worker.js
    self.onmessage = function (e) {
        let offscreen = e.data.canvas;

        // tricky polyfill: stub out the DOM globals Babylon expects
        self.document = {
            fullscreen: false,
            mozPointerLockElement: false,
            addEventListener: (e, func) => {},
            createElement: (dom) => {
                return {
                    onwheel: () => {}
                };
            }
        };

        self.window = {
            AudioContext: undefined,
            addEventListener: (e, func) => {},
            setTimeout: (func, time) => {
                setTimeout(func, time);
            }
        };

        self.HTMLElement = () => {};

        offscreen.clientWidth = offscreen.width = e.data.width;
        offscreen.clientHeight = offscreen.height = e.data.height;

        var engine = new BABYLON.Engine(offscreen, true, {
            preserveDrawingBuffer: true,
            stencil: true,
            doNotHandleTouchAction: true,
            audioEngine: false
        });
    };

deltakosh commented 5 years ago

Love it! thanks a lot

DevelopDaily commented 5 years ago

How is the progress?

I am wondering why nobody here has mentioned the web worker capability of the sister project threejs. It seems very easy for them to implement. You simply send the off-screen canvas obtained from canvas.transferControlToOffscreen() to the web worker, and all the threejs code working on the canvas automatically works on the off-screen canvas. I hope Babylon.js will offer something like that.

Here is a demo.

https://threejs.org/examples/webgl_worker_offscreencanvas.html

benaadams commented 5 years ago

I had to use a polyfill similar to the one above, but more like:

import * as ddsTextureLoader from "@babylonjs/core/Materials/Textures/Loaders/ddsTextureLoader";
// Vector4 import added for the viewport field; the exact path may vary by version
import { Vector4 } from "@babylonjs/core/Maths/math";

export interface IWorkerPolyfill extends Worker {
    document: any;
    window: any;
    HTMLElement: any;
    base: string;
    viewport: Vector4;
    ddsTextureLoader: ddsTextureLoader._DDSTextureLoader;
    requestAnimationFrame(func: Function): void;
}

const ctx: IWorkerPolyfill = self as any;
ctx.ddsTextureLoader = <any>ddsTextureLoader._DDSTextureLoader;

interface OffscreenCanvas extends HTMLCanvasElement {
    width: number;
    height: number;
    clientWidth: number;
    clientHeight: number;
}

ctx.document = {
    fullscreen: false,
    mozPointerLockElement: false,
    addEventListener: (e, func) => { },
    createElement: (dom) => { return { onwheel: () => { } } },
    getElementById: (id) => null
}

ctx.window = {
    AudioContext: undefined,
    addEventListener: (e, func) => { },
    setTimeout: (func, time) => { setTimeout(func, time) },
    innerWidth: 0,
    innerHeight: 0
}

ctx.HTMLElement = () => { }

You can't use jpg/png textures, as it creates them with new Image, and while that can be polyfilled with createImageBitmap to create an ImageData, it uploads the texture as the Image type, which I couldn't work out how to shim (since the ImageData would be a property of the Image class rather than the class itself).

However dds textures work happily

benaadams commented 5 years ago

A side effect is that none of the WebGL tools or inspectors work, and Chrome thinks your framerate is 1 fps, unless you are doing things in the main thread 😂

hjlld commented 5 years ago

I thought Spector.js could capture an OffscreenCanvas, as it has an option labeled 'show offscreen canvas'.

And as far as I know, our big boss @deltakosh is going to make a new proposal to let workers share structured memory easily instead of using postMessage, and to create workers from functions with inherited scope.

And here's a proposal for handling user input inside a worker thread: https://github.com/NavidZ/input-for-workers

The info above is from a conference hosted by Microsoft recently.

deltakosh commented 5 years ago

@DevelopDaily : If you do not use images, as in the threejs demo, it works for Babylon.js as well with the polyfill presented by @benaadams. If you want to support textures you have to remove all Images (DOM elements) and use createImageBitmap, which is not supported on Edge, IE, and Safari.

We do not see it as high priority but maybe we are wrong :) Do you think we should prioritize it more? Why?

If we decide to go down that path, here is what we can do:

ixcviw7bw commented 5 years ago

Spector can capture OffscreenCanvases, but only those present in the main thread. It can't see anything happening in web workers. It can definitely be fixed. See issue BabylonJS/Spector.js#90

However it's still easy to log draw calls with this utility: https://github.com/vorg/webgl-debug

It wraps the GL context, just like Spector, and prints draw calls to the console. So you need to create a context manually, wrap it with webgl-debug, and pass it to Babylon.js or your other library of choice.

As far as I've tested, passing events to worker via postMessage() can produce input lag. I'm currently using SharedArrayBuffer to share pointer position etc. across all workers and main thread, and it works much better. Keyboard events could also be propagated to workers via SAB using an int queue. SAB read/write cycle should be synchronised using a mutex to avoid race conditions. How it should be done depends on whether you want to run a game loop on main thread too, or just keep everything in the worker and use the main thread to collect and send input events only.
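
A minimal sketch of that SharedArrayBuffer input channel (the slot layout and constants are made up, and a real version would queue events rather than keep only the latest position):

```javascript
// Slot 0 is the lock word for Atomics.wait/notify; slots 1-2 hold pointer x/y.
const LOCKED = 1, UNLOCKED = 0;
const input = new Int32Array(new SharedArrayBuffer(3 * Int32Array.BYTES_PER_ELEMENT));
Atomics.store(input, 0, LOCKED); // nothing published yet

// Main thread: write the latest pointer position, then release the lock.
function publishPointer(x, y) {
  Atomics.store(input, 1, x);
  Atomics.store(input, 2, y);
  Atomics.store(input, 0, UNLOCKED);
  Atomics.notify(input, 0); // wake a worker blocked in Atomics.wait
}

// Worker: block until something is published, then take the lock and read.
function readPointer() {
  Atomics.wait(input, 0, LOCKED); // returns immediately once UNLOCKED
  Atomics.store(input, 0, LOCKED);
  return [Atomics.load(input, 1), Atomics.load(input, 2)];
}

publishPointer(120, 80);
// readPointer() → [120, 80]
```

Note that browsers only allow Atomics.wait off the main thread, which fits this design: the main thread only publishes and notifies.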

In my scenario I'm not using OffscreenCanvas at all; I'm rendering on the main thread and running a worker that executes all the app logic (event handling, moving the camera, etc.). The ideal solution in my scenario is an infinite loop in the worker that executes logic and, at the end of every cycle, locks a mutex shared with the main thread. The main thread runs requestAnimationFrame, updates the SAB with the latest input data, reads computed data like the camera position, does an async unlock on the mutex, then renders a frame. This results in lower input latency and fewer frame drops compared to just running two separate requestAnimationFrame loops, one on the main thread and one in the worker. A major drawback is that the web worker can no longer use async APIs like WebSockets or IndexedDB, since it is running an infinite synchronous loop. To fix that, every few loops I exit the loop with something like this:

setTimeout(loop, 0);
return;

So the JS engine can execute queued tasks and microtasks and then resume the loop. The loop is still synchronised with the mutex most of the time, so it keeps the benefit of lower input lag.
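
The shape of that loop, sketched out (the yield interval is a made-up tuning knob, and the mutex work is elided):

```javascript
const YIELD_EVERY = 8;  // after this many synchronous iterations, yield once
let iteration = 0;
let running = false;    // set to true and call loop() to start

function loop() {
  if (!running) return;
  // ... lock the shared mutex, step app logic, unlock ...
  iteration++;
  if (iteration % YIELD_EVERY === 0) {
    setTimeout(loop, 0); // let queued tasks and microtasks run, then resume
    return;
  }
  loop(); // stay synchronous between yields
}
```

The recursion stays shallow because the stack unwinds at every yield; a `while` loop with the same exit condition works just as well.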

IMO rendering on the main thread + logic in the worker may be a better solution than the other way around.

sebavan commented 5 years ago

My current issue with OffscreenCanvas is that I don't see how the perf would be that different if we still run our entire code in the worker. This is still one thread vs. one thread plus communication overhead for inputs/textures/sound...

I am probably totally wrong here ;-) but I am just trying to understand exactly what the gain would be from integrating OffscreenCanvas support. I would probably find it more appealing if the context could be shared across workers, for instance.

@bilgorajskim I actually enjoy and totally understand the potential of your solution.

ixcviw7bw commented 5 years ago

I can't see any benefit of running everything in one worker either, @sebavan. It makes sense when using the main thread for logic, since it effectively gives 2x more room for computation, but as I said, this can be achieved the other way around by running logic in a worker and rendering on the main thread. It gives the same 2x performance, without the issues of Spector not working, etc.

I can imagine people wanting to have some background jobs in web workers using OffscreenCanvas. For example a background job for downloading terrain elevation data from some source, parsing it on the CPU and using OffscreenCanvas to compute shadows with GPU. Perhaps there might be performance gain from executing it on OffscreenCanvas, although I'm unsure about that. Typed arrays could be transferred to the main thread at no cost, and I don't really know if splitting the two GPU tasks of shadow computing and scene rendering into OffscreenCanvas and the main thread canvas would yield any better performance than executing them serially over the same GL context. It might even be less performant. The second issue is that these are highly specialised tasks, and it would be easier to just write raw WebGL code to run such shader than to load the whole Babylon.js and force it to execute a very custom shader. There is no benefit for using Babylon.js in such case.

@sebavan

I would probably find it more appealing if the context could be shared across workers for instance

I wonder if WebGPU could support sharing the same context across web workers?

sebavan commented 5 years ago

I know there is a lot of talk about multi-threading for WebGPU. I'm not sure of the outcome so far, but my guess is that it will be possible.

benaadams commented 5 years ago

but I am just trying to understand exactly what the gain could be by integrating offscreencanvas support.

Depends on whether everything is in WebGL or not. We have a lot of HTML UI above the WebGL canvas, so using an offscreen canvas means the DOM is not impacted by the WebGL and vice versa. (We then also have another web worker that the offscreen-canvas worker talks to, which does networking/physics etc. that the DOM thread doesn't touch.)

hjlld commented 5 years ago

The mainstream web is still dominated by the DOM, not WebGL.

In my experience, the main reason we don't often see WebGL content in production environments or business apps is that PMs worry WebGL content would drag down overall performance and make DOM updates lag.

So the offscreen feature is absolutely good for expanding WebGL's coverage and usage, not only for gaming but also for every common web site.

Also, frameworks like React, Angular, and Vue often impose their own life cycles or attach data observers to variables, which are annoying during WebGL development; offscreen rendering avoids them easily.

DevelopDaily commented 5 years ago

@deltakosh

We do not see it as high priority but maybe we are wrong :) Do you think we should prioritize it more? Why?

I think you should, because that will open the door to a whole bunch of use cases, especially in scientific and engineering computing. For example, in computational fluid dynamics, the solution of a partial differential equation usually produces very large data sets of temperature/pressure/Mach over an airfoil. Without web worker capabilities, processing the data sets will have to block the browser UI thread. Consequently, the browser will not respond to the keyboard and mouse events -- the scrollbars will be frozen, until the rendering is done.

Our long term goal is to use the web worker mechanism to do parallel computing on a browser farm. For example, a master browser communicates with x number of slave browsers. The slaves do heavy lifting computation and the master assembles the results from the slaves and presents the final results.

benaadams commented 5 years ago

@DevelopDaily note: you can use webworkers and construct Buffers/VertexBuffers from SharedArrayBuffers (to minimise data passing) or use transferable ArrayBuffers quite happily, currently; both of which would reduce the amount of heavy processing on the WebGL/DOM thread.
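
Concretely, a shared position buffer handed to Babylon.js might look like this (a sketch against the global BABYLON build; VertexBuffer and setVerticesBuffer are real Babylon.js APIs, but the helper functions and layout are made up):

```javascript
const FLOATS_PER_VERTEX = 3; // x, y, z

// Worker and main thread both hold a Float32Array view over this memory,
// so the worker can fill in vertices without a postMessage copy.
function makeSharedPositions(vertexCount) {
  const sab = new SharedArrayBuffer(
    vertexCount * FLOATS_PER_VERTEX * Float32Array.BYTES_PER_ELEMENT);
  return new Float32Array(sab);
}

// Main thread: wrap the shared view in a vertex buffer and attach it.
function applySharedPositions(engine, mesh, positions) {
  const buffer = new BABYLON.VertexBuffer(
    engine, positions, BABYLON.VertexBuffer.PositionKind, /* updatable */ true);
  mesh.setVerticesBuffer(buffer);
}
```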

What OffscreenCanvas allows is the WebGL render loop to move to a webworker, so it doesn't interact with the DOM thread.

Consequently, the browser will not respond to the keyboard and mouse events -- the scrollbars will be frozen, until the rendering is done.

Yeah, this is the problem. While you can move a lot of the CPU work to other web workers today, if your shaders are heavy (i.e. not hitting 60 fps / a sub-16 ms frame) they will currently jank the DOM; frequently uploading lots of data to the GPU (for updates) will as well.

Using the OffscreenCanvas means the jank of the DOM and the WebGL rendering don't affect each other.

deltakosh commented 5 years ago

Ok sounds good, moving it to 4.5. Depending on how long the Node Material takes me for 4.1, I may be able to work on this one for 4.1.

deltakosh commented 4 years ago

Fixed by #6924