hex539 / scoreboard

Online judge scoreboard parser
GNU General Public License v3.0

Optimise use of OpenGL #15

Open hex539 opened 4 years ago

hex539 commented 4 years ago

Performance is OK when running as a standalone desktop app connected to HDMI. Still, things get a little bit choppy if casting the screen to another device at the same time, and this is one of the main use cases for the program.

[Image: flame graph]

On an i7-6770HQ with software rendering, the desktop app can do 4K at 60fps, but the flame graph above shows that it spends far more time than it should rendering fonts and calling OpenGL 1.1 immediate-mode APIs.

Next steps are to:

  1. Add a benchmark target for rendering N frames and then exiting.
  2. Switch all of the glBegin/glEnd usage into VBOs.
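As a rough illustration of step 1, a benchmark target could be little more than a loop that renders a fixed number of frames, reports timings, and returns. The sketch below is written in C++ as an assumption, since this thread does not show the project's code. `BenchmarkResult`, `run_benchmark`, and the `render_frame` callback are hypothetical names, not existing functions in the project.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical result type; not part of the scoreboard project.
struct BenchmarkResult {
  int frames;                // number of frames rendered
  double total_ms;           // wall-clock time for the whole run
  double mean_ms_per_frame;  // total_ms / frames
};

// Calls render_frame exactly `frames` times and measures wall-clock time.
// In a real benchmark target, render_frame would draw one frame and swap
// buffers; here it is an arbitrary callback so the harness stays GL-free.
BenchmarkResult run_benchmark(int frames,
                              const std::function<void()>& render_frame) {
  assert(frames > 0);
  using Clock = std::chrono::steady_clock;
  const auto start = Clock::now();
  for (int i = 0; i < frames; ++i) {
    render_frame();
  }
  const std::chrono::duration<double, std::milli> elapsed =
      Clock::now() - start;
  return {frames, elapsed.count(), elapsed.count() / frames};
}
```

Running the same fixed frame count before and after each change would give comparable mean frame times, which is probably easier to track than reading flame graphs by eye.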

hex539 commented 4 years ago

After optimising out the glEnable/glDisable calls, the GPU trace tool shows us the following time breakdown per 18ms frame, with and without particles enabled, respectively:

[Image: GPU trace time breakdown per frame, with and without particles]

Calls highlighted in blue are glVertex2d (average time per call 147ns). If we go to the worst case for particles there are typically about 30k extra vertices to draw, which take about 9ms out of the 16ms budget (4.5ms from glVertex, 4.5ms from glColor).
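The glVertex figure can be checked with a couple of lines of arithmetic: 30k calls at 147ns each is roughly 4.4ms, in line with the ~4.5ms quoted above. The helpers below are a hypothetical sketch of that calculation in C++. The function names are illustrative, and the 60fps target used in the comments is an assumption based on the 16ms budget.

```cpp
#include <cassert>

// Estimated CPU time in milliseconds for `calls` API calls that each cost
// `ns_per_call` nanoseconds on average.
inline double call_cost_ms(long calls, double ns_per_call) {
  assert(calls >= 0 && ns_per_call >= 0.0);
  return calls * ns_per_call / 1e6;
}

// Per-frame time budget in milliseconds for a target frame rate.
// 60fps is assumed here as the target implied by the 16ms budget.
inline double frame_budget_ms(double fps) {
  assert(fps > 0.0);
  return 1000.0 / fps;
}
```

With these, `call_cost_ms(30000, 147.0)` comes to about 4.41ms against a `frame_budget_ms(60.0)` of about 16.67ms, so the particle glVertex calls alone use roughly a quarter of each frame.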

Then we have about 5k other vertices drawn with glVertex2d/glVertex2f, plus fewer glColor calls, taking 6ms in total, so we typically only just meet the budget.

So it makes sense to start by putting the particles in a vertex buffer, then tracing again to see whether that reduces the impact enough.
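A minimal sketch of the CPU side of that change, assuming a hypothetical `Particle` struct with a 2D position and an RGBA colour (the project's real particle type is not shown in this thread). It packs all particles into one contiguous, interleaved float array, which is the shape of data a vertex buffer upload expects. The example is in C++ purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle layout; the real type in the project may differ.
struct Particle {
  float x, y;        // position
  float r, g, b, a;  // colour
};

// Floats stored per vertex: x, y, r, g, b, a.
constexpr std::size_t kFloatsPerVertex = 6;

// Packs one vertex per particle into a contiguous, interleaved array:
// [x0, y0, r0, g0, b0, a0, x1, y1, ...].
std::vector<float> pack_particles(const std::vector<Particle>& particles) {
  std::vector<float> out;
  out.reserve(particles.size() * kFloatsPerVertex);
  for (const Particle& p : particles) {
    out.insert(out.end(), {p.x, p.y, p.r, p.g, p.b, p.a});
  }
  assert(out.size() == particles.size() * kFloatsPerVertex);
  return out;
}
```

The packed array could then be uploaded once per frame with `glBufferData` and drawn with a single `glDrawArrays` call, in place of one glVertex and one glColor call per particle vertex. Whether that is enough to bring the frame back under budget is exactly what the follow-up trace would need to show.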