cocos2d / cocos2d-x

Cocos2d-x is a suite of open-source, cross-platform, game-development tools utilized by millions of developers across the globe. Its core has evolved to serve as the foundation for Cocos Creator 1.x & 2.x.
https://www.cocos.com/en/cocos2d-x

CCClippingNode causes bad performance on windows phone 8.1/ windows 10 mobile device #16685

Open flowerfx opened 8 years ago

flowerfx commented 8 years ago

Steps to Reproduce:

  1. Run the hello world sample and add a ClippingNode to the Scene.
  2. Look at the FPS counter at the bottom left of the screen. With one ClippingNode the frame rate is around 20 fps; two or more ClippingNodes can bring it under 10 fps.

DavidDeSimone commented 8 years ago

I've done clipping node performance testing in the past, and I believe the performance issues you are seeing are due to the fact we are reading values from the GPU during clipping node's draw functions.

If you examine CCStencilStateManager.cpp::140, you will see:

glGetIntegerv(GL_STENCIL_WRITEMASK, (GLint *)&_currentStencilWriteMask);
glGetIntegerv(GL_STENCIL_FUNC, (GLint *)&_currentStencilFunc);
glGetIntegerv(GL_STENCIL_REF, &_currentStencilRef);
glGetIntegerv(GL_STENCIL_VALUE_MASK, (GLint *)&_currentStencilValueMask);
glGetIntegerv(GL_STENCIL_FAIL, (GLint *)&_currentStencilFail);
glGetIntegerv(GL_STENCIL_PASS_DEPTH_FAIL, (GLint *)&_currentStencilPassDepthFail);
glGetIntegerv(GL_STENCIL_PASS_DEPTH_PASS, (GLint *)&_currentStencilPassDepthPass);

glGetIntegerv has extremely negative performance implications. It is listed as a common mistake in OpenGL programming on https://www.opengl.org/wiki/Common_Mistakes, which says:

You find that these functions are slow.

That's normal. Any function of the glGet form will likely be slow. nVidia and ATI/AMD recommend that you avoid them. The GL driver (and also the GPU) prefer to receive information in the up direction. You can avoid all glGet calls if you track the information yourself.

What I don't understand is why we are fetching these values from GPU memory when they are driven and set from CPU computation. I don't see why, when we set these values in GPU memory, we can't also record them in the StencilManager, rather than setting them on the GPU and then having the StencilManager read them back out of the GPU.

I wrote a version of clipping node in cocos2d-x v2 that avoided this performance penalty by taking an approach similar to what I've described above. It won't work in cocos2d-x v3's new rendering pipeline, but I am more than willing to rewrite it for v3 if the engine maintainers see benefit (unless there is something that I am fundamentally misunderstanding).
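
A minimal sketch of the CPU-side tracking described above, assuming every stencil change goes through one object. The names StencilState and StencilStateCache are hypothetical and are not the cocos2d-x API. The applier callback stands in for the real glStencilMask / glStencilFunc / glStencilOp calls so the sketch builds without GL headers.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical CPU-side mirror of the seven values the issue reads with glGetIntegerv.
// Defaults match the initial OpenGL stencil state (GL_ALWAYS = 0x0207, GL_KEEP = 0x1E00).
struct StencilState {
    std::uint32_t writeMask = 0xFFFFFFFFu;
    std::uint32_t func = 0x0207;
    std::int32_t ref = 0;
    std::uint32_t valueMask = 0xFFFFFFFFu;
    std::uint32_t fail = 0x1E00;
    std::uint32_t passDepthFail = 0x1E00;
    std::uint32_t passDepthPass = 0x1E00;
};

// Hypothetical cache: remembers what was last sent to the driver so that
// saving and restoring state never needs a glGet* round trip.
class StencilStateCache {
public:
    // In a real renderer the applier would call glStencilMask,
    // glStencilFunc and glStencilOp with the fields of the state.
    using Applier = std::function<void(const StencilState&)>;

    explicit StencilStateCache(Applier apply) : _apply(std::move(apply)) {}

    // Record the new state on the CPU, then push it to the driver.
    void set(const StencilState& state) {
        _current = state;
        _apply(state);
    }

    // Save the state a clipping node is about to overwrite.
    void push() { _stack.push_back(_current); }

    // Restore the most recently saved state and re-apply it.
    void pop() {
        assert(!_stack.empty());
        StencilState saved = _stack.back();
        _stack.pop_back();
        set(saved);
    }

    const StencilState& current() const { return _current; }

private:
    Applier _apply;
    StencilState _current;
    std::vector<StencilState> _stack;
};
```

A clipping node would call push() before drawing its stencil, set() for its own state, and pop() afterwards. This only stays correct if nothing else changes stencil state behind the cache's back.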

flowerfx commented 8 years ago

Hello, I think I have found the cause of this issue, and of the general slowdown on the wp8.1/wp10 platform when stencil is used.

I noticed that this call in CCRenderer.cpp runs slowly: glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(_indices[0]) * _filledIndex, _indices, GL_STATIC_DRAW); I tracked the total time spent in this function during one loop. It took about 0.05 to 0.1 sec when the Scene has two or more nodes that use clipping (stencil), which means the game drops to ~10 fps.
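
One way to collect this kind of per-loop total is a small scoped timer. The sketch below is an assumption about how such a measurement could be taken, not the code used in this report; ScopedTimer and averageSeconds are hypothetical names. It measures only the CPU time spent inside the wrapped call.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: adds the lifetime of each instance, in seconds,
// to a running total owned by the caller.
class ScopedTimer {
public:
    explicit ScopedTimer(double& totalSeconds)
        : _total(totalSeconds), _start(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - _start;
        _total += elapsed.count();
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    double& _total;
    std::chrono::steady_clock::time_point _start;
};

// Average cost per timed call; the caller counts the calls.
inline double averageSeconds(double totalSeconds, int calls) {
    assert(calls > 0);
    return totalSeconds / calls;
}
```

To use it, reset a double to 0.0 at the start of the frame, wrap the suspect call in a block that declares a ScopedTimer bound to that double, and read the total at the end of the frame.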

Then I changed the code to glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(_indices[0]) * _filledIndex, _indices, GL_DYNAMIC_DRAW); and the fps rose to ~30 fps. Maybe the cocos team should check this problem in the render system on wp8.1/wp10.
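
For context, the last argument of glBufferData is a usage hint: GL_STATIC_DRAW describes data that is written once and drawn many times, while GL_DYNAMIC_DRAW describes data that is rewritten repeatedly. The sketch below shows that choice as a tiny helper. The name usageHintFor is hypothetical, and the constants repeat the numeric values of the GL enums so the sketch builds without GL headers. How much a driver acts on the hint varies by platform.

```cpp
#include <cstdint>

// Numeric values of the OpenGL ES 2.0 usage enums.
constexpr std::uint32_t kStaticDraw = 0x88E4;   // GL_STATIC_DRAW
constexpr std::uint32_t kDynamicDraw = 0x88E8;  // GL_DYNAMIC_DRAW

// Hypothetical helper: pick the hint that matches how the buffer is used.
// A buffer that is refilled with glBufferData repeatedly is "dynamic".
constexpr std::uint32_t usageHintFor(bool rewrittenRepeatedly) {
    return rewrittenRepeatedly ? kDynamicDraw : kStaticDraw;
}
```

The change reported above corresponds to passing usageHintFor(true) for the index buffer.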

Thanks