Closed misl6 closed 1 month ago
Let n be the triangle count. For 360 triangles:
GL_TRIANGLE_FAN = n + 2 = 360 + 2 = 362 vertices
GL_TRIANGLES = 3 × n = 3 × 360 = 1080 vertices
It is interesting that, even though it requires more vertices to compute, you have found a way to optimize the code by using indexed GL_TRIANGLES.
Do you think that the extra amount of memory allocated could be causing a small drop in performance on some platforms, or could it have to do with something else?
With indexed triangles, vertices are re-used to build the triangles, so:
GL_TRIANGLE_FAN = n + 2 = 360 + 2 = 362 vertices
GL_TRIANGLES with the same start and end point = n + 1 = 361 vertices
GL_TRIANGLES with a different start and end point = n + 2 = 362 vertices
Only the indices for GL_TRIANGLES number 1080.
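The counts above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical and are not part of Kivy.

```python
def fan_vertex_count(n):
    """Vertices needed by GL_TRIANGLE_FAN for n triangles: centre + n + 1 rim points."""
    return n + 2


def indexed_triangles_counts(n, closed):
    """Return (vertices, indices) for indexed GL_TRIANGLES with n triangles.

    closed=True means the start and end points coincide (full ellipse),
    so the last triangle re-uses the first rim point.
    """
    rim_points = n if closed else n + 1
    vertices = 1 + rim_points  # centre + rim points
    indices = 3 * n            # three indices per triangle
    return vertices, indices


n = 360
print(fan_vertex_count(n))                 # 362
print(indexed_triangles_counts(n, True))   # (361, 1080)
print(indexed_triangles_counts(n, False))  # (362, 1080)
```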
We can't be sure, and it should be measured, but allocating more memory certainly does not help. (I will move this discussion to #8664, as we may try to make some performance tweaks while fixing the issue.)
Merging, as this should not break anything, but let's keep monitoring performance (after the #8664 changes).
Maintainer merge checklist
Component: xxx label.
api-deprecation or api-break label.
release-highlight label to be highlighted in release notes.
versionadded, versionchanged as needed.
After adding ANGLE support for iOS, I noticed a small performance boost in some areas, but I also started to be affected by a huge performance drop when drawing complex UIs (so I noticed it only after the actual merge).
It wasn't clear initially what the actual issue was, as a lot of things happen in a complex UI, and none of the simple reproducible examples I created showed such a huge performance drop.
The nice thing is that the whole process made me aware of the potential improvements we can make in the future (see #8664).
But, finally, we found the root cause of this performance drop, and it's related to the usage of GL_TRIANGLE_FAN.
The GL_TRIANGLE_FAN primitive is used in Kivy graphics by the Ellipse and RoundedRectangle objects.
Unfortunately, triangle fans are not natively supported by Metal or DirectX >= 10, so ANGLE emulates the missing primitive, which makes it incredibly slow (at least on iOS). (See: https://bugs.webkit.org/show_bug.cgi?id=237533)
This PR switches from GL_TRIANGLE_FAN to indexed GL_TRIANGLES for drawing ellipses. The changes are kept intentionally minimal to follow an incremental path, but the whole code can be improved to increase efficiency. (Again, as an example, see #8664.)
An additional PR will take care of RoundedRectangle, even though the performance drop seems to be less visible there.
On the OpenGL side, indexed GL_TRIANGLES are as fast as a GL_TRIANGLE_FAN, but the need to allocate more memory for the indices can make them slightly slower on certain platforms. (This can be improved, as the index only needs to be allocated when segments or angle_start / angle_end change.)
The following example has been used to stress the Ellipse drawing:
[Results table: GL_TRIANGLE_FAN vs. GL_TRIANGLES (indexed)]
As we can see, the change is beneficial not only to platforms backed by ANGLE, but also to Android and Windows, which still rely on OpenGL ES / OpenGL.
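To illustrate the switch from a fan to indexed triangles, here is a minimal pure-Python sketch of how an ellipse can be expressed as a shared vertex list plus an index list. It is not the code from this PR: the function name `ellipse_indexed_mesh`, its parameters, and the angle convention (degrees, x from sin, y from cos) are assumptions made for illustration only.

```python
import math


def ellipse_indexed_mesh(cx, cy, rx, ry, segments, angle_start=0.0, angle_end=360.0):
    """Build (vertices, indices) for an ellipse drawn with indexed GL_TRIANGLES.

    Hypothetical helper, not Kivy API. vertices[0] is the centre and the
    remaining entries are rim points. Each triangle is
    (centre, rim[i], rim[i + 1]), so rim points are shared through the indices.
    """
    # A full turn means the start and end points coincide (simplified check).
    closed = abs(angle_end - angle_start) == 360.0
    rim_count = segments if closed else segments + 1
    step = math.radians(angle_end - angle_start) / segments
    start = math.radians(angle_start)

    vertices = [(cx, cy)]
    for i in range(rim_count):
        angle = start + i * step
        vertices.append((cx + rx * math.sin(angle), cy + ry * math.cos(angle)))

    indices = []
    for i in range(segments):
        first = 1 + i
        # On a closed ellipse the last triangle wraps back to the first rim point.
        second = 1 + (i + 1) % rim_count
        indices.extend((0, first, second))
    return vertices, indices


vertices, indices = ellipse_indexed_mesh(0, 0, 100, 50, segments=360)
print(len(vertices), len(indices))  # 361 1080
```

In this sketch the index list depends only on the segment count and on whether the arc is closed, which is why it would only need to be rebuilt when segments or angle_start / angle_end change.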
On Ubuntu, at least on my configuration, GL_TRIANGLE_FAN is slightly faster, but I'm quite sure that with additional optimizations (see above) we can reach even better fps.
Some screenshots and videos:
iOS before:
https://github.com/kivy/kivy/assets/8177736/175ea4f7-b0ec-4fd6-814f-31ad77ba726b
iOS after:
https://github.com/kivy/kivy/assets/8177736/29306cb6-c146-4f83-ab02-b7188278eb73