Question: caching gpu commands

psulat commented 11 years ago

Caching of shaders is providing a significant benefit. Have you investigated the performance impact of caching all of the GL calls generated by GLOW? I often call the same sequence of shaders with just a parameter change, and I'm wondering how much the JavaScript overhead is stalling the GPU. We could cache all of the GL commands, similar to the way webgl-debug intercepts all of the GL calls. We would then just need to update a uniform variable and call the functions in the cached "command queue". I'm thinking of using GPUView to determine how often the GPU stalls because of JavaScript. Your thoughts?

empaempa commented 11 years ago

I think I'm 100% sure what you'd like to achieve (no unnecessary GL calls) but I don't follow how you'd like to achieve it - please elaborate :)

psulat commented 11 years ago

Mikael,

Sure, I’ll elaborate. My goal is not related to unnecessary GL calls (sorry, that was a different discussion). My goal is to reduce the amount of JavaScript that needs to be executed to generate the GL calls needed by the GPU. I suspect that the GPU is idle a significant amount of the time because JavaScript can’t keep the GPU's command queue full. GLOW has a certain amount of overhead to create all of the necessary GL calls. I have since added code on top of GLOW that abstracts further away from the GPU, which is definitely contributing to the problem. Our code, GLOW plus my abstraction code, is great for development, but I have a feeling that all that JavaScript is slowing down the GPU.

I often have use cases where the same series of shaders is called, just with different uniform values. I’m thinking that the first time we call this sequence, we cache all of the GL commands in an array. If the exact same sequence of shaders is called next with different uniform variable(s), we can just update the uniforms and then call everything in the array in sequence. Very little JavaScript will be needed for this.

If properly designed this would give us a high level abstraction for development, without losing high performance low level calls. Our code is similar to a high level language that is getting compiled for every execution. I would like to save the compilation for reuse.

You may not have a use case that really benefits from this. I’m mainly doing 2D processing and chaining many shaders together, where this will probably be much more helpful than in 3D.

Pete



empaempa commented 11 years ago

Ah, right, now I'm with you :)

I think it's a great idea, just trying to figure out the best way to go about it. As you say, this is something you'd like to have when you do the same thing over and over (which is pretty common even in 3D).

A side note: when I misunderstood what you wanted to do, I imagined emulating the GL in JS and only passing on the necessary instructions to the actual GL. Now, this wasn't really what you were fishing for, but it could sort of be a way forward, at least conceptually.

Imagine you start "recording", call your shaders, call stuff on the context (like setting the viewport) and make other direct GL calls, then you stop and save the recording and can play it back later, this time with minimal JS overhead. I guess one way would be to emulate the GL and save all incoming instructions during recording, to play back later.

Do you have any ideas?

psulat commented 11 years ago

I haven’t given the design too much thought, I just knew that I wanted to do this to improve performance. It should have a big impact – can’t get any faster. I’m glad you like the idea. I’m grateful for the code you provided and I’m happy to contribute.

I’m still new to Javascript (just a couple of months), so forgive me if I make incorrect assumptions.

I first thought of the idea when I saw webgl-debug.js capture all of the GL commands and display them on the console. I thought we could use the same method to capture which GL calls are made, or we could create our own wrapper. The wrapper is probably better since we have more control and it will be more efficient. Inside the wrapper we can make the normal GL calls and save the calls into an array or a JSON string. Now, how do we correctly update the uniforms in the cached command sequence that was captured/recorded? If we use an array, we should be able to index directly into it and replace the uniform call. If we use JSON, we will have to search the string. One additional benefit of recording the commands is that we can scan the array/string and remove any redundant calls. These redundant calls do take up a lot of time. It sounds like an array will work best.
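A minimal sketch of the wrapper-plus-array idea, under some loud assumptions: `wrapContext`, `removeRedundant` and `SINGLE_SLOT_STATE` are hypothetical names (not part of GLOW or webgl-debug), and the list of "redundant" candidates is deliberately tiny. It only shows the shape of the technique: forward each call to the real context, log it in an array, then drop state-setting calls that repeat the previous value.

```javascript
// Hypothetical helper: returns an object whose functions forward to `gl`
// and also append { name, args } to `log`. Non-function members (constants
// such as gl.TRIANGLES) are copied once as plain values.
function wrapContext(gl, log) {
  const wrapped = {};
  for (const name in gl) {
    const member = gl[name];
    if (typeof member === "function") {
      wrapped[name] = function (...args) {
        log.push({ name, args });
        return member.apply(gl, args);
      };
    } else {
      wrapped[name] = member;
    }
  }
  return wrapped;
}

// Assumption: these calls each set a single piece of state, so repeating
// the same arguments back to back (per function) changes nothing.
const SINGLE_SLOT_STATE = new Set(["useProgram", "viewport", "activeTexture"]);

// Returns a new log without state calls that repeat the last value set
// by the same function. All other calls are kept untouched.
function removeRedundant(log) {
  const last = new Map();
  return log.filter(({ name, args }) => {
    if (!SINGLE_SLOT_STATE.has(name)) return true;
    const prev = last.get(name);
    const same =
      prev !== undefined &&
      prev.length === args.length &&
      prev.every((value, i) => value === args[i]);
    last.set(name, args);
    return !same;
  });
}
```

Because the log is an array, a uniform call can later be found by index and its arguments replaced, which is the advantage over a JSON string mentioned above.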

Now we need to add some type of control logic that will automatically detect when a cached command sequence can be used and make the appropriate call. I need to give this some thought. I’m sure I’m missing a lot of details. Either way, I’m sure it can be done.



empaempa commented 11 years ago

It's no small feat, but very doable. I think the basic assumption should be to hijack the context (the global GL variable that GLOW creates) and store all function calls that are made. Let's call this hijacking object the Recorder. If we do the recorder "the right way", all references to the GLOW.Uniforms will be lost (as only the value of the uniform is passed), so a bit of work needs to go into GLOW.Uniform as well, somehow passing in itself as a reference. I'm not in a position to test right now, but maybe...

GL.uniform1iv( this.location, this.getNativeValue());

...in Uniform.js could be switched to...

GL.uniform1iv( this.location, this.getNativeValue(), this );

...and the Recorder would use the third parameter to store the reference. Not sure the real GL accepts having the extra parameter hanging there (need to try all browsers). If not, we could have all uniforms swap their .load function when recording starts and swap it back when recording stops.

Storing could be an array with objects. Each object simply contains the GL-function and the parameters stored in an array. The playback simply iterates over the array and calls the function with the right parameters...

for( var i = 0; i < instructions.length; i++ ) {
    instructions[ i ].function.apply( GL, instructions[ i ].parameters );
}

...or something. Obviously the .parameters-array in the loop above needs to be updated for all uniforms before going into the loop, but that shouldn't be too hard to figure out.
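To make the "update the parameters for all uniforms before playback" step concrete, here is a rough sketch. `record` and `play` are hypothetical names, and the uniform object is only assumed to expose `getNativeValue()` as in the Uniform.js line quoted above. It also assumes the value is the last argument of the recorded call, which holds for the `uniform*v` and `uniformMatrix*fv` families but not for the multi-scalar forms such as `uniform2f`.

```javascript
// Hypothetical: store one instruction. `uniform` is optional and is the
// object whose current value should be re-read at playback time.
function record(instructions, gl, name, parameters, uniform) {
  instructions.push({
    fn: gl[name],
    parameters: parameters,
    uniform: uniform || null,
  });
}

// Replays the instructions. For instructions that carry a uniform
// reference, the last parameter is refreshed from the uniform first.
function play(instructions, gl) {
  for (let i = 0; i < instructions.length; i++) {
    const instruction = instructions[i];
    if (instruction.uniform) {
      // Assumption: the value is the last argument of the GL call.
      const valueIndex = instruction.parameters.length - 1;
      instruction.parameters[valueIndex] = instruction.uniform.getNativeValue();
    }
    instruction.fn.apply(gl, instruction.parameters);
  }
}
```

Refreshing inside the loop rather than in a separate pass keeps playback to a single iteration; a separate pass over only the uniform instructions would work just as well if the list of uniforms is kept on the side.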

When you stop recording, as you say, an optimizer should be run to remove all unnecessary calls (but let's not think about that until the basics work). I think it's really important to keep a reference to the GLOW.Uniform so you modify the values like you do now, or there'll be another layer of complexity added.

In the best of worlds you simply do in your init...

Recorder.start( "someName" );
myShaderA.draw();
myShaderB.draw();
Recorder.stop();

...and in your loop...

myShaderAUniform.data.set( someValueA );
myShaderBUniform.data.set( someValueB );
Recorder.play( "someName" );

Actually, thinking of it, the recorder could be part of the Cache.js. Maybe :)
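For illustration only, a start/stop/play recorder along these lines might look like the sketch below. Everything except the `start`, `stop` and `play` names is hypothetical (including `createRecorder`), it is not tested against GLOW or a real browser context, and it freezes argument values at record time, so uniforms would still need the reference-keeping discussed above.

```javascript
// Hypothetical factory: hijacks the functions on `gl` in place while
// recording and restores them on stop().
function createRecorder(gl) {
  const recordings = {}; // name -> array of { fn, parameters }
  const originals = {};  // key -> { fn, own } for restoring on stop()
  let active = null;     // the recording currently being filled

  return {
    start(name) {
      if (active) throw new Error("already recording");
      active = recordings[name] = [];
      // Collect keys first so shadowing them does not disturb iteration.
      const keys = [];
      for (const key in gl) keys.push(key);
      keys.forEach(function (key) {
        const fn = gl[key];
        if (typeof fn !== "function") return;
        const own = Object.prototype.hasOwnProperty.call(gl, key);
        originals[key] = { fn, own };
        gl[key] = function (...parameters) {
          active.push({ fn, parameters });
          return fn.apply(gl, parameters);
        };
      });
    },

    stop() {
      Object.keys(originals).forEach(function (key) {
        // Restore own functions; remove shadows of inherited ones.
        if (originals[key].own) gl[key] = originals[key].fn;
        else delete gl[key];
        delete originals[key];
      });
      active = null;
    },

    play(name) {
      const instructions = recordings[name];
      if (!instructions) throw new Error("no recording named " + name);
      for (let i = 0; i < instructions.length; i++) {
        instructions[i].fn.apply(gl, instructions[i].parameters);
      }
    },
  };
}
```

Hijacking the context in place means shader code that already holds a reference to the global GL variable gets recorded without any changes; the cost is that every call pays the wrapper overhead while recording is active.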

empaempa / GLOW

Question: caching gpu commands #22