kdudka / rrv

Radiosity Renderer and Visualizer
GNU General Public License v3.0

porting to modern OpenGL #2

Closed: claudeha closed this issue 8 years ago

claudeha commented 9 years ago

rrv-compute feels pretty slow - for the room2.xml example with default options it's using 100% of one CPU core in step 1 of 32, averaging around 5.7 patches per second. I suspect throughput could be dramatically improved by using modern OpenGL features:

I'll have a go at implementing these soon, time permitting.

kdudka commented 9 years ago

Really? This sounds pretty bad. The room4 scene, which is even more complex, was computed in two hours or so on 10-year-old hardware. Any performance improvements would be really appreciated!

claudeha commented 9 years ago

I added vertex buffers to FormFactorEngine in a rather hacky fashion (patch data is re-uploaded only when the patch count changes; I haven't read enough of the code base to know whether this is appropriate). This improved performance from 5.7 to 8.2 patches per second. But while adding GLEW support I noticed a debug variable defaulting to ON in the build system... See #3. With debugging off, the speedup from my changes was even more pronounced (from 32.3 patches per second for the original code to 74.5 patches per second with vertex buffers).
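Roughly, the vertex buffer change looks like this (just a sketch with made-up names, not the actual FormFactorEngine code): upload the patch geometry into a buffer object once, then draw it from GPU memory instead of resubmitting every vertex through immediate mode each frame.

#include <GL/glew.h>
#include <vector>

// Illustrative per-vertex layout: position plus colour.
struct PatchVertex {
    float x, y, z;
    float r, g, b;
};

// Upload all patch vertices into a vertex buffer object once;
// GL_STATIC_DRAW matches "re-upload only when the patch count changes".
GLuint uploadPatches(const std::vector<PatchVertex> &vertices)
{
    GLuint vbo = 0;
    glGenBuffers(1, &vbo);
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glBufferData(GL_ARRAY_BUFFER,
                 vertices.size() * sizeof(PatchVertex),
                 vertices.data(), GL_STATIC_DRAW);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
    return vbo;
}

// Draw the uploaded patches with a single call instead of glBegin()/glVertex().
void drawPatches(GLuint vbo, GLsizei vertexCount)
{
    glBindBuffer(GL_ARRAY_BUFFER, vbo);
    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_COLOR_ARRAY);
    glVertexPointer(3, GL_FLOAT, sizeof(PatchVertex), (void *) 0);
    glColorPointer(3, GL_FLOAT, sizeof(PatchVertex), (void *) (3 * sizeof(float)));
    glDrawArrays(GL_TRIANGLES, 0, vertexCount);
    glDisableClientState(GL_COLOR_ARRAY);
    glDisableClientState(GL_VERTEX_ARRAY);
    glBindBuffer(GL_ARRAY_BUFFER, 0);
}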

claudeha commented 9 years ago

Here's my current set of optimisations:

Perhaps it would make sense to keep compatibility with ancient GL versions in a legacy GL1 branch, with the master branch carrying two programs: one for GL2 (yours with my patches) and one for GL3 (my rewrite)? I'll try to prepare some pull requests soon; my local code is in a bit of a mess at the moment...

kdudka commented 9 years ago

Thank you for working on this! To be honest, I am totally lost in the new versions of OpenGL. As long as the required libraries are available on modern Linux (and optionally Windows) distributions and run on commonly available hardware, I am fine with supporting just OpenGL 3.3+.

The screenshot looks pretty good. Given that it was computed much faster, I really like the direction your development is going!

claudeha commented 9 years ago

https://www.opengl.org/wiki/History_of_OpenGL is my go-to page for figuring out which GL version to target. The general trend is making the separation between client (CPU) and server (GPU) more explicit, which makes it harder to write poorly performing code like glBegin() etc. My 2005 laptop goes up to GL 2.1 (plus some extensions that were promoted to core in GL 3), my 2009 laptop does GL 3.3, and my 2012 desktop (which I'm testing with) does GL 4.5. I'm using the proprietary NVIDIA driver, though the Free drivers are improving all the time (my 2005 laptop works better with the radeon driver than ATI fglrx ever did...).

You can determine the OpenGL version on X11 systems with:

$ glxinfo | grep OpenGL\ version

With modern GL you need a loader library to get function pointers from the implementation - GLEW is fairly common but has some issues, so I might switch to GLAD and add the generated code for just the functions needed to the repo. I'm using cross-platform GLFW to create a window with a GL context, and I've even had some success cross-compiling to Windows (which I don't have myself, so I tested with WINE). I still need to add proper detection to the cmake system; for now I just hardcode a couple of -l flags in the new rrv-compute-gl3 executable stanza.
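The context setup ends up looking roughly like this with GLFW plus a GLAD-generated loader (only a sketch: the version hints, window size and title are placeholders, and switching from GLEW to GLAD is still just a possibility):

#include <glad/glad.h>
#include <GLFW/glfw3.h>
#include <cstdio>

int main()
{
    if (!glfwInit())
        return 1;
    // Request a specific core profile; the final version requirement is open.
    glfwWindowHint(GLFW_CONTEXT_VERSION_MAJOR, 3);
    glfwWindowHint(GLFW_CONTEXT_VERSION_MINOR, 3);
    glfwWindowHint(GLFW_OPENGL_PROFILE, GLFW_OPENGL_CORE_PROFILE);
    // An invisible window is enough for off-screen form factor computation.
    glfwWindowHint(GLFW_VISIBLE, GLFW_FALSE);
    GLFWwindow *window = glfwCreateWindow(256, 256, "rrv-compute", NULL, NULL);
    if (!window) {
        glfwTerminate();
        return 1;
    }
    glfwMakeContextCurrent(window);
    // The loader resolves GL function pointers through GLFW.
    if (!gladLoadGLLoader((GLADloadproc) glfwGetProcAddress)) {
        glfwTerminate();
        return 1;
    }
    std::printf("GL version: %s\n", (const char *) glGetString(GL_VERSION));
    glfwDestroyWindow(window);
    glfwTerminate();
    return 0;
}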

Anyway, I put my work in progress here: http://code.mathr.co.uk/rrv/shortlog/refs/heads/claude (if you prefer, I can do a github PR).

kdudka commented 9 years ago

Sorry for the delay! I tried to update my software to get OpenGL 3.3 working, but according to the glxinfo command above, my system only supports OpenGL 3.0:

$ glxinfo | grep OpenGL\ version
OpenGL version string: 3.0 Mesa 10.6.2

At this point, I am not sure whether it is a HW or SW limitation. I have an NVIDIA Corporation G86 [GeForce 8500 GT] (rev a1) graphics card (according to lspci) and Gentoo Linux using the nouveau driver. I will try the proprietary driver next week to see if it works any better. With my current configuration, rrv-compute-gl3 seems to work fine but the resulting scene is black (as if it was not computed at all).

While comparing the performance with the old version, I realized that current master is badly broken by an experiment with adaptive division. So I have reverted the incompetently applied patch (see fb6725bd) and rebased your work on top of it (now in the tmp branch -- kdudka/rrv@3b45a25a). Thanks to the revert, the old version performs much better if it is given enough RAM. It caches already computed form factors so that they do not need to be computed again for the subsequent steps. This can bring us an additional overall speed-up when computing e.g. 32 steps. Do you think we could use such an optimization with rrv-compute-gl3, too?
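The caching idea is roughly this (hypothetical names, not the actual rrv classes): the form factors from a given source patch to all destination patches do not change between steps, so each row is computed once and reused, trading memory for time.

#include <cstddef>
#include <vector>

class FormFactorCache {
public:
    explicit FormFactorCache(std::size_t patchCount)
        : rows_(patchCount) {}

    // Return the cached row for 'src', computing it on first use only.
    const std::vector<double> &row(std::size_t src)
    {
        std::vector<double> &r = rows_[src];
        if (r.empty())
            r = computeRow(src);
        return r;
    }

private:
    // Placeholder for the expensive per-patch rendering/projection pass.
    std::vector<double> computeRow(std::size_t src)
    {
        (void) src;
        return std::vector<double>(rows_.size(), 0.0);
    }

    std::vector<std::vector<double> > rows_;
};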

claudeha commented 9 years ago

Wow, thanks for the fix! I did wonder a bit why it kept computing form factors and caching them only to use them once... It's not really applicable to the rrv-compute-gl3 version, which just pushes pixels around instead of computing a cache/matrix - brute-force stupidity, really...
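Roughly, what the gl3 version does instead is along these lines (a sketch with made-up names; the real code may differ): render the scene from a patch's point of view with every patch drawn in a unique ID colour, read the pixels back, and count how many land on each destination patch, recomputing this every step rather than storing a matrix.

#include <GL/glew.h>
#include <vector>

// After rendering the scene with per-patch ID colours, read the framebuffer
// back and accumulate how many pixels each destination patch covers.
void accumulateVisiblePatches(int width, int height,
                              std::vector<unsigned long> &hitsPerPatch)
{
    std::vector<unsigned char> pixels(width * height * 3);
    glReadPixels(0, 0, width, height, GL_RGB, GL_UNSIGNED_BYTE, pixels.data());
    for (int i = 0; i < width * height; ++i) {
        // Decode the patch ID that was encoded into the RGB channels.
        unsigned long id = pixels[3*i]
                         | (static_cast<unsigned long>(pixels[3*i + 1]) << 8)
                         | (static_cast<unsigned long>(pixels[3*i + 2]) << 16);
        if (id < hitsPerPatch.size())
            ++hitsPerPatch[id];
    }
}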

I added many more optimisations and things to my branch (merging your tmp in the process). rrv-compute now requires OpenGL 2.1 (2006); moreover, room4.xml, which takes rrv-compute-gl3 12 minutes, is now done by rrv-compute in 1m39s (albeit using 7.5x more memory than the gl3 version). Details here: http://code.mathr.co.uk/rrv/log/refs/heads/claude

kdudka commented 9 years ago

Sorry for the additional delay! I ran into dependency hell. My older NVIDIA chip (G86) forced me to use an older version of the proprietary NVIDIA driver, which required an older Linux kernel to compile and load. Finally, I got OpenGL 3.3 and consequently rrv-compute-gl3 running. Please give me a few more days to merge it.

claudeha commented 9 years ago

I'm so sorry for the dependency hell :( I hope you don't have to endure another round migrating back to the free driver - it turned out that I didn't need anything significant beyond 3.0, which I should have been much more vocal about instead of leaving it hidden in git history :(

kdudka commented 9 years ago

Sorry for being late on this! I have finally found some time to have a look at it. You can find a preview in the temporary branch claude:

https://github.com/kdudka/rrv/compare/claude

Are you fine with merging it into the master branch?

claudeha commented 8 years ago

Sorry for the delay, life got in the way - yes, it's fine to merge into master! I tested briefly on my laptop (room2 with a large cache) and it seems to work fine.

kdudka commented 8 years ago

No problem with the delay at all. I hope you are doing fine now. I have merged it into the master branch: https://github.com/kdudka/rrv/compare/fb6725bdeb...0923925766

Thank you very much for the contribution!