kidoman / rays

Ray tracing based language benchmarks
https://kidoman.com/programming/go-getter.html

Switch vector class to SSE instructions for a performance gain. #5

Closed leecbaker closed 11 years ago

leecbaker commented 11 years ago

I've reimplemented the vector class using SSE4. This improves running time on my Ivy Bridge laptop by about 10-15%.

kidoman commented 11 years ago

Gotta find a way to have both this and https://github.com/kid0m4n/rays/pull/6

Also, pardon my lack of knowledge of C++, but will this still compile with: clang, g++, Intel C++ compiler?

Moving it to a separate benchmark (folder) might be worth it so that we keep a standard C++ version, and one that goes full metal jacket

leecbaker commented 11 years ago

This will compile with clang, g++, icc, Visual Studio, etc. just fine. I'll take a deeper look at #6 later on, but at first glance it looks like most of the changes don't affect performance, and many of them don't even change the compiled code.

t-mat commented 11 years ago

Impressive! I think it would be nice to keep the original code for non-SSE platforms (ARM, etc.).

#if defined(RAYS_NO_SSE)
// original vector code
#else
#include <smmintrin.h>
// @leecbaker's SSE vector code
#endif

leecbaker commented 11 years ago

If you're going that route, you might as well use

 #ifdef __SSE4_1__

so the switch between the versions happens automatically without an external define.
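
For illustration only, here is a rough sketch of how that automatic switch could look. This is not the code from the pull request: the `Vec3` type, its members, and `demo_dot` are hypothetical names. `__SSE4_1__` is predefined by g++ and clang when SSE4.1 is enabled (for example with `-msse4.1`); other compilers may not define it, in which case the scalar branch is used.

```cpp
#include <cassert>

#ifdef __SSE4_1__
#include <smmintrin.h>  // SSE4.1 intrinsics, including _mm_dp_ps

// Hypothetical SSE-backed vector: x, y, z live in the low three lanes,
// the fourth lane is unused and kept at zero.
struct Vec3 {
    __m128 v;
    Vec3(float x, float y, float z) : v(_mm_set_ps(0.0f, z, y, x)) {}
    explicit Vec3(__m128 m) : v(m) {}

    Vec3 operator+(const Vec3& o) const { return Vec3(_mm_add_ps(v, o.v)); }
    Vec3 operator*(float s) const { return Vec3(_mm_mul_ps(v, _mm_set1_ps(s))); }

    // Mask 0x71: multiply lanes 0..2, write the sum into lane 0.
    float dot(const Vec3& o) const {
        return _mm_cvtss_f32(_mm_dp_ps(v, o.v, 0x71));
    }
};
#else
// Scalar fallback for platforms without SSE4.1 (ARM, older x86, ...).
struct Vec3 {
    float x, y, z;
    Vec3(float x_, float y_, float z_) : x(x_), y(y_), z(z_) {}

    Vec3 operator+(const Vec3& o) const { return Vec3(x + o.x, y + o.y, z + o.z); }
    Vec3 operator*(float s) const { return Vec3(x * s, y * s, z * s); }

    float dot(const Vec3& o) const { return x * o.x + y * o.y + z * o.z; }
};
#endif

// Same call site regardless of which branch was compiled in.
inline float demo_dot() {
    Vec3 a(1.0f, 2.0f, 3.0f);
    Vec3 b(4.0f, 5.0f, 6.0f);
    return (a + b).dot(b * 2.0f);  // (5,7,9) . (8,10,12)
}
```

Both branches expose the same interface, so the rest of the ray tracer would not need to know which one is active.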

kidoman commented 11 years ago

I am looking to merge #5, #6 and #7. Just want to make sure the mechanism for running the tests stays the same.

kidoman commented 11 years ago

I have used a define to control which vector class is active, so that we can test both.
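
A sketch of what such a define-controlled switch might look like follows. The define actually used in the repository is not named in this thread, so this borrows `RAYS_NO_SSE` from the suggestion above as an opt-out on top of the automatic `__SSE4_1__` check. `RAYS_USE_SSE`, `active_vector_backend`, `length3`, and the file name `rays.cpp` are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstring>

// Use SSE only when the compiler enables SSE4.1 AND the user has not opted out.
// Example builds (file name is hypothetical):
//   g++ -O3 -msse4.1 rays.cpp                -> SSE version
//   g++ -O3 -msse4.1 -DRAYS_NO_SSE rays.cpp  -> scalar version on the same machine
#if defined(__SSE4_1__) && !defined(RAYS_NO_SSE)
#  define RAYS_USE_SSE 1
#  include <smmintrin.h>
#else
#  define RAYS_USE_SSE 0
#endif

// Reports which implementation was compiled in, handy for labelling benchmark runs.
inline const char* active_vector_backend() {
    return RAYS_USE_SSE ? "sse4.1" : "scalar";
}

// Length of a 3-component vector, implemented once per backend.
inline float length3(float x, float y, float z) {
#if RAYS_USE_SSE
    __m128 v = _mm_set_ps(0.0f, z, y, x);      // x in lane 0, lane 3 unused
    __m128 d = _mm_dp_ps(v, v, 0x71);          // x*x + y*y + z*z in lane 0
    return _mm_cvtss_f32(_mm_sqrt_ss(d));      // square root of lane 0
#else
    return std::sqrt(x * x + y * y + z * z);
#endif
}
```

Building the same source twice, once with and once without `-DRAYS_NO_SSE`, would give two binaries that can be timed with the existing test mechanism.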