WebGLSamples / WebGL2Samples

Short and easy to understand samples demonstrating WebGL 2 features

Simplify transform feedback sample #147

Closed tsherif closed 7 years ago

tsherif commented 7 years ago

There were two things about the original sample that I thought were confusing:

  1. Splitting the rendering into separate "update" and "draw" passes is not necessary with transform feedback. You can draw and update in the same draw call.
  2. VAOs store pointer and "enabled" state of attributes, so it isn't necessary to set up the vertex attributes after binding the VAO each frame.
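A minimal sketch of what these two points amount to per frame, not the sample's actual code. The `state` object and its fields (`program`, `vaos`, `transformFeedbacks`, `particleCount`, `current`) are hypothetical names, and the sketch assumes two VAOs and two transform feedback objects were fully configured once at init time.

```javascript
// Hypothetical per-frame function: draws the particles and captures their
// updated state in a single draw call. All `state` fields are assumptions.
function drawAndUpdate(gl, state) {
  const read = state.current;      // buffer set read as vertex attributes
  const write = (read + 1) % 2;    // buffer set written by transform feedback

  gl.useProgram(state.program);

  // The VAO already stores attribute pointers and "enabled" state, so no
  // vertexAttribPointer / enableVertexAttribArray calls are needed here.
  gl.bindVertexArray(state.vaos[read]);

  // The transform feedback object is assumed to have its output buffers
  // bound (via bindBufferBase) at init time.
  gl.bindTransformFeedback(gl.TRANSFORM_FEEDBACK, state.transformFeedbacks[write]);

  // RASTERIZER_DISCARD is left disabled, so this one call both rasterizes
  // the points and records the new particle state.
  gl.beginTransformFeedback(gl.POINTS);
  gl.drawArrays(gl.POINTS, 0, state.particleCount);
  gl.endTransformFeedback();

  state.current = write;           // swap roles for the next frame
}
```

Calling `drawAndUpdate(gl, state)` once per animation frame would alternate between the two buffer sets without any per-frame attribute setup.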

Removing these redundancies reduces the size of the sample by about 20% and I think makes it clearer how things are working. It also highlights some advantages of WebGL 2, since neither simplification would be possible in WebGL 1.

kenrussell commented 7 years ago

Awesome work Tarek. It's a great observation that the simulation and draw steps can be combined. I personally would not have thought of that.

In thinking about how to apply this to https://github.com/toji/webgl2-particles-2 I realized the optimization is quite subtle. In order to display the current frame's simulation results, as opposed to the previous frame's, it's necessary to derive gl_Position in the vertex shader from the most recently computed value. In fact, I think in your update, this issue is present in the vs-draw vertex shader: gl_Position is derived from a_position rather than v_position. Is this correct?

I think it would be worth displaying both of these examples side-by-side, with a bunch of comments indicating the thought process that went into combining the update and draw phases. What do you think?

tsherif commented 7 years ago

You're absolutely right, Ken. Forgot to fix that when copy-pasting things around. Furthermore, you made me realize that the shader could be simplified even further since the lifetime check before setting gl_Position became redundant. Fixed both in the latest commit.
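For illustration, a sketch of a combined update-and-draw vertex shader in which gl_Position is derived from the freshly computed v_position. This is not the shader from the commit: the velocity attribute, the `u_deltaTime` uniform, the integration step, and the helper `declareVaryings` are all assumptions, and lifetime handling is left out entirely.

```javascript
// Hypothetical GLSL ES 3.00 vertex shader, kept as a JavaScript string.
// Attribute, varying, and uniform names are illustrative assumptions.
const UPDATE_AND_DRAW_VS = `#version 300 es
precision highp float;

uniform float u_deltaTime;

in vec2 a_position;   // state from the previous frame
in vec2 a_velocity;

out vec2 v_position;  // captured by transform feedback
out vec2 v_velocity;

void main() {
    v_position = a_position + a_velocity * u_deltaTime;
    v_velocity = a_velocity;

    // Use the value computed this frame (v_position), not the previous
    // frame's input (a_position), so the display is not one frame behind.
    gl_Position = vec4(v_position, 0.0, 1.0);
    gl_PointSize = 2.0;
}
`;

// Varyings to capture; the order must match the output buffer layout.
const TF_VARYINGS = ['v_position', 'v_velocity'];

// Must be called before gl.linkProgram(program). SEPARATE_ATTRIBS is an
// assumption here; INTERLEAVED_ATTRIBS would work with a single buffer.
function declareVaryings(gl, program) {
  gl.transformFeedbackVaryings(program, TF_VARYINGS, gl.SEPARATE_ATTRIBS);
}
```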

Are there cases when you'd want to separate the two passes? That's what I found confusing about the original sample; it seemed to indicate that transform feedback had to be done in that way. Can you think of a simple example where the separate passes would be beneficial? I'd be happy to implement it, and we could compare/contrast with this one to show the pros and cons of both techniques.

shrekshao commented 7 years ago

Thanks @tsherif and @kenrussell for your review. I think one case where you still have to keep two steps (simulation and draw) might be when you try to mimic the behavior of a geometry shader together with transform feedback. Say you are writing a flock simulation, and after the transform feedback pass you want to use a_position as an input attribute to help build a translation matrix in the vertex shader and draw some actual geometry (e.g. a bird) for each point, probably together with instanced drawing. FYI, the three.js gpgpu bird [Code] example might help illustrate this idea. (It's not a perfect example, though, because it uses a fragment shader to fake the geometry.) Let me know what you think. I also left some comments on trivial changes in the code.

tsherif commented 7 years ago

@shrekshao Good call. I removed all the unbinding calls and pulled a few other things out of the loop.

shrekshao commented 7 years ago

I did some simple profiling using stats.js with 500,000 particles and saw no obvious FPS difference (~47 fps). I'm good with merging this. It's neat and clean, and it removes the unnecessary vertex attribute setup. We also have another transform feedback sample that uses two passes (the one using transform feedback output as color).

kenrussell commented 7 years ago

@shrekshao good analysis.

@tsherif Upon further thought, any time the results of the transform feedback will be instanced, this single-pass technique probably won't work. See https://github.com/toji/webgl2-crowd for an example.
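A rough sketch of why the instanced case tends to need two passes, under the assumption that transform feedback records one output per vertex of every instance, whereas the simulation needs exactly one output per particle. All `state` fields are hypothetical, and the draw VAOs are assumed to have been set up at init time with the per-particle attributes marked per-instance via `gl.vertexAttribDivisor(location, 1)`.

```javascript
// Hypothetical two-pass frame: simulate once per particle, then draw one
// instance of some geometry per particle. All `state` fields are assumptions.
function updateThenDrawInstanced(gl, state) {
  const read = state.current;
  const write = (read + 1) % 2;

  // Pass 1: simulation only. One vertex per particle, nothing rasterized.
  gl.useProgram(state.updateProgram);
  gl.bindVertexArray(state.updateVaos[read]);
  gl.bindTransformFeedback(gl.TRANSFORM_FEEDBACK, state.transformFeedbacks[write]);
  gl.enable(gl.RASTERIZER_DISCARD);
  gl.beginTransformFeedback(gl.POINTS);
  gl.drawArrays(gl.POINTS, 0, state.particleCount);
  gl.endTransformFeedback();
  gl.disable(gl.RASTERIZER_DISCARD);

  // Unbind so the buffers just written are not still attached to the
  // current transform feedback object when pass 2 reads them as attributes.
  gl.bindTransformFeedback(gl.TRANSFORM_FEEDBACK, null);

  // Pass 2: draw the per-instance geometry, reading the freshly written
  // particle state as per-instance attributes (divisor 1).
  gl.useProgram(state.drawProgram);
  gl.bindVertexArray(state.drawVaos[write]);
  gl.drawArraysInstanced(gl.TRIANGLES, 0, state.verticesPerInstance, state.particleCount);

  state.current = write;
}
```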

Sounds good to me to merge this, however. @shrekshao since you did the most thorough review, please do the honors. :)

tsherif commented 7 years ago

Thanks, guys.

@shrekshao Yeah, I'm getting similar results on both my Quadro M1000 and integrated Intel GPUs. Kind of surprised me, but I guess with the rasterization discarded and simple draw step in the original, the processing wouldn't be too different.

@kenrussell Right, that makes sense. I could imagine a simple example showing that: the same basic particle system, but with a small set of triangles being rotated by transform feedback in the first pass and then instanced and translated in a second pass. Let me see if I can put something together...