hushaojie04 / libgdx

Automatically exported from code.google.com/p/libgdx

GL 2.0 vs GL 1.0 TextButton rendering performance #654

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
I attached a patch for project gdx-tests with TextButtonTest.java, which 
creates 100 text buttons using skin from tests.

What steps will reproduce the problem?
1. Run TextButtonTest
2. When needsGL20 == true, FPS drops
3. When needsGL20 == false, FPS rises

Both GL 1.0 and GL 2.0 rendering should have approximately the same FPS output.

When i run this test on my Samsung Galaxy S with GL 2.0 i get ~7FPS, with GL 
1.0 i get ~40FPS. When i run it on my desktop with 1000 buttons, i get 63FPS 
with GL 1.0, ~40FPS with GL 2.0

Original issue reported on code.google.com by ludovit....@gmail.com on 9 Jan 2012 at 7:44

GoogleCodeExporter commented 9 years ago
Erm, why do you send a patch? TextButtonTest is not in our source repository. 
I'll look into the performance issue, not sure i can reproduce it.

Original comment by badlogicgames on 9 Jan 2012 at 7:58

GoogleCodeExporter commented 9 years ago
Sorry, i shouldn't drink and work on issues on the tracker. Patch is super 
awesome, entering testing phase. Will report back.

Original comment by badlogicgames on 9 Jan 2012 at 8:03

GoogleCodeExporter commented 9 years ago
I sent the patch for your convenience. You can just apply it on your gdx-tests 
project and run the test, so you don't have to write any code ;) 

Original comment by ludovit....@gmail.com on 9 Jan 2012 at 8:04

GoogleCodeExporter commented 9 years ago
Results on Galaxy Nexus

GLES 2.0 enabled
01-09 21:05:12.217: I/X(19900): FPS: 11
01-09 21:05:12.217: I/X(19900): GL20: true
01-09 21:05:12.310: I/X(19900): FPS: 11
01-09 21:05:12.310: I/X(19900): GL20: true
01-09 21:05:12.404: I/X(19900): FPS: 11

GLES 1.x enabled

01-09 21:07:03.763: I/X(20003): FPS: 59
01-09 21:07:03.763: I/X(20003): GL20: false
01-09 21:07:03.802: I/X(20003): FPS: 59
01-09 21:07:03.802: I/X(20003): GL20: false
01-09 21:07:03.873: I/X(20003): FPS: 59
01-09 21:07:03.873: I/X(20003): GL20: false

There's definitely something up. I'll try to figure this out asap. Thanks for 
reporting.

Original comment by badlogicgames on 9 Jan 2012 at 8:07

GoogleCodeExporter commented 9 years ago
The problem seems to be limited to TextButton. Neither Label nor Button exhibits 
this performance issue. I'll try to narrow the problem down in TextButton asap. 
Thanks again for reporting.

Original comment by badlogicgames on 9 Jan 2012 at 8:31

GoogleCodeExporter commented 9 years ago
Yes, i have come to the same conclusion. I have tested it in many variations, but 
didn't spot the problem. When i remove rendering of TextButton's 9patch 
background, it runs ok; when i enable drawing of just one part of the 9patch, 
performance goes down.

Original comment by ludovit....@gmail.com on 9 Jan 2012 at 8:35

GoogleCodeExporter commented 9 years ago
To kill performance it is enough to add, for example, 100x 2 images from 2 
different textures to a stage and render the stage. So when SpriteBatch swaps 
textures in GL 2.0, it is slower than in GL 1.0.

And since the label's bitmap font is in a different texture than the UI, it is 
the same case.
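
The effect described here can be illustrated with a toy model. The sketch below is not libgdx code: `FlushCountSketch`, `countFlushes` and `buttons` are hypothetical names, and it assumes a batcher that flushes (issues one draw call) whenever the next sprite uses a different texture than the previous one. It also treats each button as just two sprites, background then label, whereas a real nine-patch consists of several regions.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a sprite batcher: it only counts flushes, it draws nothing. */
final class FlushCountSketch {

    /**
     * Counts the flushes of a batcher that flushes whenever the next sprite
     * uses a different texture than the previous one, plus one final flush
     * when a non-empty batch ends. (Assumed behaviour, for illustration.)
     */
    static int countFlushes(List<String> textureOfEachSprite) {
        int flushes = 0;
        String bound = null; // texture currently "bound" in this model
        for (String texture : textureOfEachSprite) {
            if (bound != null && !bound.equals(texture)) {
                flushes++; // a texture switch forces the pending sprites out
            }
            bound = texture;
        }
        if (bound != null) {
            flushes++; // final flush at the end of the batch
        }
        return flushes;
    }

    /** Builds the draw order for n text buttons: background, then label. */
    static List<String> buttons(int n, String backgroundTexture, String fontTexture) {
        List<String> order = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            order.add(backgroundTexture);
            order.add(fontTexture);
        }
        return order;
    }

    public static void main(String[] args) {
        // Skin and font in separate textures: every sprite switches texture.
        System.out.println(countFlushes(buttons(100, "ui.png", "font.png")));
        // Skin and font packed into one atlas: no switches at all.
        System.out.println(countFlushes(buttons(100, "atlas.png", "atlas.png")));
    }
}
```

In this model, 100 buttons with separate skin and font textures cost 200 flushes per frame, while the same buttons drawn from a single shared texture cost one.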

Original comment by ludovit....@gmail.com on 9 Jan 2012 at 11:42

GoogleCodeExporter commented 9 years ago
Yeah, i just came to the same conclusion, should have seen it from the start. 
The Label and nine-patch don't share the same texture, hence there's 2 texture 
binds per actor. 

The question is why binding textures is so much slower in GLES 2.0. I'm 
currently walking through all the involved code and couldn't find an answer 
yet. I guess i'll create a new test example that uses SpriteBatch directly.

In any case, reducing texture binds by using atlases is a must regardless. I 
still want to find out if it's me doing something silly so that texture binds 
are slower in GLES 2.0 or if that's just the way it is. 

Original comment by badlogicgames on 9 Jan 2012 at 11:47

GoogleCodeExporter commented 9 years ago
Well, i wrote a little test, independent of any scene2d stuff, just using 
textures, a mesh and a shader, see 
http://code.google.com/p/libgdx/source/browse/trunk/tests/gdx-tests/src/com/badlogic/gdx/tests/TextureBindTest.java.

You can toggle whether to use GLES 2 in line 97. The GLES 2 version performs 
considerably worse than the GLES 1 version. If you comment out lines 79 and 80, 
which disables texture binds altogether, you'll see that the thing has the 
exact same FPS no matter the GLES version used (30fps on my Nexus). 
Texture#bind() directly calls glBindTexture, so i'm 99% sure it's not a bug on 
our end. I'll ask around on the imgtec forums. This smells like a driver issue 
to me.

We can't really do anything about this at the moment. I optimized ShaderProgram 
and SpriteBatch a tiny little bit while trying to figure out the issue. Apart 
from that, there's nothing i can do really. If you don't mind, i'll set this to 
WontFix, if i get new info i'll update the issue.

Texture atlases seem to be the only workaround for now.

Original comment by badlogicgames on 10 Jan 2012 at 1:26

GoogleCodeExporter commented 9 years ago
Reopened. It behaves similarly on the desktop. I'd assume it's the shader, 
other than that i'd have no idea what could cause this. More research needed.

Original comment by badlogicgames on 10 Jan 2012 at 2:24

GoogleCodeExporter commented 9 years ago
I checked how AndEngine implements SpriteBatch. It forces one to use the same 
Texture as the one bound in SpriteBatch. When one tries to render a 
TextureRegion with a different Texture, it throws an Exception. It seems that 
GL 2.0 binding is indeed heavier than in GL 1.0. In this case one has to 
optimise the drawing code so that it uses just one Texture per SpriteBatch.

One enhancement for libGDX would be to support a Skin with its bitmap font in 
the same Texture as the rest of the UI.

Original comment by ludovit....@gmail.com on 10 Jan 2012 at 9:03

GoogleCodeExporter commented 9 years ago
We support BitmapFonts in the same texture atlas as the skin. The assets we use 
for the test don't show that though. I'll ask Nate to update that. It's also a 
reason why we never caught that issue as benchmarking was performed with such 
an optimal texture atlas.

Original comment by badlogicgames on 10 Jan 2012 at 4:31

GoogleCodeExporter commented 9 years ago
I tested this case some more, and i came to the conclusion that binding is not 
the problem. I added 10x bind and observed no significant performance penalty. 

I eventually removed binding of my textures altogether. Still had 30 FPS with 
my simple stage with GL 2.0. 56 FPS with GL 1.0.

Original comment by ludovit....@gmail.com on 10 Jan 2012 at 7:39

GoogleCodeExporter commented 9 years ago
Did you use my TextureBindTest or something else? It's easier to narrow down 
with the simple test i wrote. Removing the lines 79 and 80 brings the GLES 2 
version to the same performance as the GLES 1 version.

Original comment by badlogicgames on 10 Jan 2012 at 7:42

GoogleCodeExporter commented 9 years ago
I used my stage and modified GDX so it's hard to show you. 
Actually, TextureBindTest proves that there is no 'significant' impact when 
binding with GL 2.0. 

I have 7FPS with GL20, 8 FPS with GL10 and binding enabled.
I have 13 FPS with GL20 and the same with GL10 and binding disabled.

If binding were the cause, TextureBindTest should show a significant FPS drop 
when switching from GL10 to GL20.

You can create a test case by removing all glBind calls from Texture and 
running TextButtonTest. Even if no texture binds occur when rendering, there is 
a difference between GL10 and GL20.

I think that this performance problem might be caused by SpriteBatch and 
renderMesh.
Is calling batch.end() and batch.begin() considered to be a heavy operation? It 
causes flushing and rendering of the mesh. In my simple scene it gets called 
about 12 times and i observe an FPS drop to 30 on GL 2.0, 56 on GL 1.0
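
To put these FPS figures in terms of time per frame, here is a small back-of-the-envelope sketch. `FrameTimeSketch` and its methods are hypothetical names. The per-flush figure assumes that the entire gap between the two frame times comes from the roughly 12 flushes, which nothing in this thread has measured, so treat it as an upper bound for illustration only.

```java
/** Converts the FPS figures quoted above into per-frame times. */
final class FrameTimeSketch {

    /** Milliseconds spent on one frame at the given frame rate. */
    static double frameTimeMs(double fps) {
        return 1000.0 / fps;
    }

    /**
     * Extra milliseconds per flush, ASSUMING the whole frame-time gap between
     * the slow and the fast path is caused by the flushes. This assumption is
     * for illustration only; it was not measured.
     */
    static double extraMsPerFlush(double slowFps, double fastFps, int flushesPerFrame) {
        return (frameTimeMs(slowFps) - frameTimeMs(fastFps)) / flushesPerFrame;
    }

    public static void main(String[] args) {
        System.out.printf("GL 2.0 frame: %.1f ms%n", frameTimeMs(30));
        System.out.printf("GL 1.0 frame: %.1f ms%n", frameTimeMs(56));
        System.out.printf("extra per flush: %.2f ms%n", extraMsPerFlush(30, 56, 12));
    }
}
```

Under that assumption, 30 FPS versus 56 FPS is about 33.3 ms versus 17.9 ms per frame, a gap of roughly 15.5 ms, or about 1.3 ms per flush.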

Original comment by ludovit....@gmail.com on 10 Jan 2012 at 8:33

GoogleCodeExporter commented 9 years ago
It's strange that your findings differ from mine. As i outlined above, the bind 
actually kills the performance in TextureBindTest. 

SpriteBatch#begin()/end() is a somewhat heavy operation, but we keep state 
changes to an absolute minimum in both GLES 1 and GLES 2.

Original comment by badlogicgames on 10 Jan 2012 at 8:36

GoogleCodeExporter commented 9 years ago
I'm not saying that binding doesn't kill performance. I am saying that there is no 
difference when changing from GL10 to GL20 in that test. Tried it one more time 
with results:
2500 textures and binding enabled:
GL20 25 FPS
GL10 25 FPS

Original comment by ludovit....@gmail.com on 10 Jan 2012 at 8:45

GoogleCodeExporter commented 9 years ago
My results for 5000 rectangles (switching between 2 textures, LWJGL backend):

GL20 17fps
GL10 26fps

This is indeed very strange.

Original comment by badlogicgames on 10 Jan 2012 at 8:48

GoogleCodeExporter commented 9 years ago
Welp, we boiled the issue down (thanks Ludevik!).

The difference arises from the use of VBOs in GLES 2.0. Updating a VBO while it 
is still being used for the last rendering batch decreases performance as it 
apparently stalls the GPU until the last rendering call is complete. The 
situation can be helped a little by using multiple VBOs. The performance is 
still not on par with GLES 1.0, but better.
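
The idea of rotating through several VBOs can be sketched without any GL calls. The class below is a toy model, not the actual SpriteBatch change: `BufferRingSketch`, `acquire` and `countImmediateReuses` are hypothetical names, and the integer ids merely stand in for VBO handles. It assumes that a stall can occur when a flush writes into the buffer that the immediately preceding flush drew from.

```java
/** Toy model: hands out buffer slots round-robin, one per flush. */
final class BufferRingSketch {
    private final int[] bufferIds; // stand-ins for VBO handles
    private int next = 0;

    BufferRingSketch(int count) {
        if (count < 1) {
            throw new IllegalArgumentException("count must be >= 1");
        }
        bufferIds = new int[count];
        for (int i = 0; i < count; i++) {
            bufferIds[i] = i + 1;
        }
    }

    /** Returns the buffer to fill for the next flush and advances the ring. */
    int acquire() {
        int id = bufferIds[next];
        next = (next + 1) % bufferIds.length;
        return id;
    }

    /**
     * Counts flushes that write into the same buffer as the flush right
     * before them, i.e. the case assumed here to risk a stall.
     */
    static int countImmediateReuses(int bufferCount, int flushes) {
        BufferRingSketch ring = new BufferRingSketch(bufferCount);
        int reuses = 0;
        int previous = -1; // no buffer used yet
        for (int i = 0; i < flushes; i++) {
            int current = ring.acquire();
            if (current == previous) {
                reuses++;
            }
            previous = current;
        }
        return reuses;
    }

    public static void main(String[] args) {
        System.out.println(countImmediateReuses(1, 200)); // single buffer
        System.out.println(countImmediateReuses(3, 200)); // ring of three
    }
}
```

With a single buffer every flush after the first reuses the buffer that was just drawn; with two or more buffers in the ring, consecutive flushes never touch the same one. How much this helps on real hardware depends on the driver.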

Note that this issue is actually about a corner case where you change the 
texture for each sprite! This is a degenerate case and you should definitely 
avoid changing textures that often. Under normal conditions (when the batcher 
can actually batch your sprites) the GLES 2 paths of SpriteBatch will perform 
as well as the GLES 1 paths.

Moral of the story: use texture atlases.

Original comment by badlogicgames on 10 Jan 2012 at 11:55

GoogleCodeExporter commented 9 years ago
I have looked further into the performance in GL 2.0. The problem was correctly 
identified as the VBO. The more glDrawElements/glDrawArrays calls the mesh 
makes, the lower the FPS.

It seems that i am able to make at most 30 glDrawElements calls on the Samsung 
Galaxy S; at 40 i am observing FPS drops, and that is with 3 vertices and 3 
indices, and with a very primitive shader (the same as used in HelloTriangle). 
I think that is too few. With a simple scene, with only 16 renderMesh calls, i 
have 30FPS on the phone. GL10 handles that number of renderMesh calls easily. 
It will be hard to optimise drawing to minimise the number of mesh draws, 
especially with UI. A simple UI like the one in UITest runs at 30 FPS on my 
phone. 30 FPS might look like enough, but the animations are not very smooth at 
that FPS count.

I checked how AndEngine renders the Sprite class. It also calls 
glDrawArrays. So i edited the SpriteExample to add 100 sprites. Each drawing of 
a sprite calls glDrawArrays. 100 sprites run at 56 FPS, 500 sprites run at 35 FPS.

What can be causing this difference?

Original comment by ludovit....@gmail.com on 12 Jan 2012 at 12:04

GoogleCodeExporter commented 9 years ago
That's because AndEngine doesn't alter the mesh but reuses it and instead sets 
a uniform. Please use an atlas, it will solve our problem.

Original comment by badlogicgames on 12 Jan 2012 at 12:09

GoogleCodeExporter commented 9 years ago
Sorry to interrupt, but is there a test/demo/example somewhere that shows how 
to use fonts and skins in the same texture atlas? Currently, when using 
scene2d.ui, fonts are loaded automatically from the skin style.

Original comment by me.thc...@gmail.com on 12 Jan 2012 at 1:21

GoogleCodeExporter commented 9 years ago
I have found just one skin: uiskin.json. I have not found another one, but 
maybe i'm just blind ;)

Original comment by ludovit....@gmail.com on 12 Jan 2012 at 7:52

GoogleCodeExporter commented 9 years ago
I have boiled this GL 2.0 performance problem down to the fragment shader. If i 
modify MeshShaderTest to render the Mesh 100x, and i modify its fragment shader 
so it is:
gl_FragColor = vec4(1,1,1,1);

it runs fast; when i change it to anything that is evaluated at runtime, like:
gl_FragColor = v_color;

then my FPS drops significantly. So every glDrawArrays with that shader has 
a huge performance impact.
Is there anything that i can do, besides limiting the 
glDrawArrays/glDrawElements calls?

Original comment by ludovit....@gmail.com on 16 Jan 2012 at 1:35

GoogleCodeExporter commented 9 years ago
I have found this:
http://stackoverflow.com/questions/5508527/why-does-a-single-vec4-multiplication-slow-down-my-ogl-es-2-fragment-shader-so-m

Is it somehow possible that, in the fragment shader in SpriteBatch:
#ifdef GL_ES
#define LOWP lowp
precision mediump float;
#else
#define LOWP
#endif
varying LOWP vec4 v_color;

- v_color uses a precision other than lowp?
- does using a varying instead of a uniform have any impact on performance?

Original comment by ludovit....@gmail.com on 16 Jan 2012 at 2:27

GoogleCodeExporter commented 9 years ago
I made some tests with different shaders, no issues found. Please ignore 
previous comments ;)

Original comment by ludovit....@gmail.com on 16 Jan 2012 at 6:21