ssloy / tinyraytracer

A brief computer graphics / rendering course
https://github.com/ssloy/tinyraytracer/wiki
5.04k stars 333 forks source link

Small optimization #6

Closed nsd4fr closed 5 years ago

nsd4fr commented 5 years ago

You can optimize this even further by adding in just a couple more lines.

for (size_t i=0; i<vector.size(); i++)

evaluates vector.size() on every loop. This essentially tacks on linear search through the vector every time you increment i. Change to the following:

size_t last = vector.size(); for (size_t i=0; i<last; i++)

or even better, to not add any additional lines:

for (auto it=vector.begin(); it!=vector.end(); it++)

I understand that your goal was not to be perfectly optimal, but readable. However, this second suggestion is almost just as readable as the original, while making your program significantly faster, especially as the number of spheres/objects in the scene grows.

ssloy commented 5 years ago

Have you measured the performance changes? I tend to be prudent about such optimizations because the copiler is not dumb.

nsd4fr commented 5 years ago

I agree. The compiler is usually pretty good. And by declaring the vector as const in all your functions, you may be allowing the compiler to automatically perform the first optimization I am describing.

I ran some tests with the timer available in this repo. Specifically, timer.cpp and timer.h.

Here are my results: compiled as g++ tinyraytracer.cpp timer.cpp -O2

Old denotes using the way it currently is, for (size_t i... New denotes using the way I suggested, for (auto it = vector.begin(); it !=vector.end(); ... as well as the appropriate changes in vector access ( vector[i].method() to it->method() )

Results:

4 Spheres: Old: 1.97684s New: 1.823s

12 Spheres: Old: 4.789s New: 4.489s

28 Spheres: Old: 8.399s New: 7.586s

Discussion

I admit this is a very minor change, but this change obviously has more impact as the sphere size grows.

Attached is the edited files, with my optimizations turned on. I went ahead and attached your files as well, just for a quick download and compilation.

Files.zip

manthrax commented 5 years ago

using .size() on a vector doesn't do a linear search. the size is tracked internally. the main overhead would be that of the function call to retrieve size.. but that should be optimized/inlined by any sane compiler, no?

manthrax commented 5 years ago

If you're seeing a speed increase, I would posit that the change to access via the iterator -> is what is giving you an improvement, vs the indexed referencing. Of course the most optimal way would be to take a pointer to the first element, and the last element, and loop while ptr != lastptr. Has the advantage of not pulling in any external iterator support libraries, etc.

manthrax commented 5 years ago

Anyway.. don't want to come across negative., but what you describe runs counter to my intuition.. would be interesting to see a deeper dive into this!

r-lyeh commented 5 years ago

you can also preserve the original style by typing for (size_t i=0, end=vector.size(); i < end; i++)

nsd4fr commented 5 years ago

@manthrax Thanks for the perspective! I've recently gotten a book on optimizing code, and one of the optimizations discussed was not using container.size() type methods because they can add linear run times. I should have checked the spec for std::vector, because you are correct that it is a constant time method. It also makes sense that direct content access via the iterator is responsible for the change.