mohammedari / opengl_ros

A simple implementation to utilize OpenGL shader (GLSL) code with a GPU-powered machine under ROS environment
MIT License

Speeding up rendering #4

Open DanielArnett opened 4 years ago

DanielArnett commented 4 years ago

I've been profiling the renderer and trying to figure out how to speed it up. I'm working on 4k images so I'm trying to reduce rendering time by any means. Currently the rendering is spending essentially the entire time on the three operations below.

- 30% Writing the texture to the GPU
- 30% Drawing elements (I assume this is the shader running?)
- 40% Reading the texture from the GPU

I can figure out threading and the ROS part but I don't know much about OpenGL. Is it possible to do any of these 3 steps in parallel? Can OpenGL read and write textures at the same time? Or maybe while one texture is being rendered can I buffer the next texture? Or maybe I could use multiple framebuffers? I'm not sure about how OpenGL handles this.

Right now I'm getting 10 frames per second (fps). If I could just do 2 of these steps in parallel I could get to 15fps. If I could get all 3 in parallel that will push me to 25 fps.
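As a rough check of that estimate, here is a minimal sketch of the arithmetic, assuming the 30/30/40 split of a 100 ms frame (10 fps) and an ideal pipeline with no overlap overhead. The function names `serial_fps` and `pipelined_fps` are made up for illustration and are not part of opengl_ros.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Frames per second when the stages run back to back for every frame:
// one frame costs the sum of all stage times.
double serial_fps(const std::vector<double>& stage_ms) {
    assert(!stage_ms.empty());
    const double total_ms = std::accumulate(stage_ms.begin(), stage_ms.end(), 0.0);
    return 1000.0 / total_ms;
}

// Frames per second for an idealized pipeline in which all stages overlap:
// a finished frame comes out each time the slowest stage completes.
double pipelined_fps(const std::vector<double>& stage_ms) {
    assert(!stage_ms.empty());
    const double slowest_ms = *std::max_element(stage_ms.begin(), stage_ms.end());
    return 1000.0 / slowest_ms;
}
```

With `{30.0, 30.0, 40.0}` (upload, draw, readback in milliseconds), `serial_fps` gives 10 and `pipelined_fps` gives 25. Overlapping only the two 30 ms stages leaves 30 + 40 = 70 ms per frame, which `serial_fps({30.0, 40.0})` reports as roughly 14.3 fps, close to the 15 fps guessed above. Real numbers will be lower, since transfers and rendering compete for the same bus and GPU.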

Thank you!

mohammedari commented 4 years ago

Hello!

Yes, the measured time of glDrawElements is the time waiting for the completion of the rendering of the current frame, and all these three steps can be shortened if CPU and GPU can run asynchronously. If you need to write/read textures asynchronously, OpenGL provides an asynchronous data transfer method called Pixel Buffer Object, but unfortunately, I am not familiar with that.

http://www.songho.ca/opengl/gl_pbo.html

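You can prepare multiple PBOs and route the transfers through them instead of passing client memory to glTextureSubImage2D/glGetTextureImage: with a PBO bound to GL_PIXEL_UNPACK_BUFFER, the texture upload reads from the PBO, and with a PBO bound to GL_PIXEL_PACK_BUFFER, glReadPixels (or glGetTextureImage) writes into the PBO and returns without waiting for the copy to finish. You then also have to sync CPU and GPU yourself using a fence object instead of the glFinish call the code currently uses, so that the CPU can publish the previous frame while the GPU renders the current frame.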
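The buffer rotation behind that idea can be sketched without any OpenGL calls. This is only a model of the bookkeeping, assuming two or more readback PBOs used in round-robin order; `ReadbackSlots` and `slots_for_frame` are hypothetical names, and the comments mark where the real GL calls would go in an actual implementation.

```cpp
#include <cassert>
#include <cstddef>

// Which pixel buffer each side touches while processing frame n.
struct ReadbackSlots {
    std::size_t gpu_writes;   // PBO the GPU fills with frame n
    bool has_previous;        // false on the very first frame
    std::size_t cpu_reads;    // PBO holding frame n-1 (valid if has_previous)
};

// Round-robin assignment: the GPU writes frame n into one buffer while the
// CPU maps the buffer that received frame n-1, so the two never collide.
ReadbackSlots slots_for_frame(std::size_t frame, std::size_t buffer_count = 2) {
    assert(buffer_count >= 2);
    ReadbackSlots s;
    // Real code: glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo[s.gpu_writes]),
    // glReadPixels(..., nullptr), then glFenceSync(...) to mark completion.
    s.gpu_writes = frame % buffer_count;
    s.has_previous = frame > 0;
    // Real code: glClientWaitSync(...) on the fence of frame n-1, then
    // glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY) and publish the image.
    s.cpu_reads = s.has_previous ? (frame - 1) % buffer_count : 0;
    return s;
}
```

With the default of two buffers, frame 0 writes buffer 0 and has nothing to publish, frame 1 writes buffer 1 while buffer 0 is published, frame 2 writes buffer 0 while buffer 1 is published, and so on. The cost of this scheme is one frame of added latency, since each image is published one iteration after it was rendered.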

DanielArnett commented 4 years ago

Wonderful, I'll look into that, thank you!
