sandialabs / omega_h

Simplex mesh adaptivity for HPC

R3D intersection is really slow on GPUs #69

Closed: ibaned closed this issue 7 years ago

ibaned commented 7 years ago

Running on my GTX 980 Ti, the warp_test takes 1 second in total without R3D intersection versus 6 seconds with it, and adding R3D integration raises the total to 10 seconds. I'll just note this as an issue because it's not yet top priority, but at some point I should try to speed that up. Note that in serial on the CPU the warp_test runs in less than 2 seconds.

ibaned commented 7 years ago

I fixed this (although kind of in a rush) in the commits merged at 3c74110. Interestingly, the compiler was not optimizing away the copies made when the Polytope class was passed by value, so the fix was to pass it by reference instead.
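For readers unfamiliar with the cost being described, here is a minimal sketch of the difference between the two calling conventions. It is not the code from the commits above: `DemoPolytope`, `g_copy_count`, `sum_x_by_value`, and `sum_x_by_ref` are hypothetical names invented for illustration, and the struct layout is an assumption, not Omega_h's actual Polytope class. The copy constructor counts calls so the extra copy is observable.

```cpp
#include <cstddef>

// Hypothetical counter used only to make copies visible in this sketch.
static std::size_t g_copy_count = 0;

// Hypothetical stand-in for a large polytope type (assumed layout,
// NOT Omega_h's actual Polytope class).
struct DemoPolytope {
  static constexpr int max_verts = 64;
  double coords[max_verts][3] = {};
  int nverts = 0;

  DemoPolytope() = default;

  // Counting copy constructor: every by-value pass of an lvalue lands here.
  DemoPolytope(const DemoPolytope& other) : nverts(other.nverts) {
    for (int i = 0; i < max_verts; ++i) {
      for (int j = 0; j < 3; ++j) {
        coords[i][j] = other.coords[i][j];
      }
    }
    ++g_copy_count;
  }
};

// Pass-by-value: the caller's object is copied into the parameter.
double sum_x_by_value(DemoPolytope p) {
  double sum = 0.0;
  for (int i = 0; i < p.nverts; ++i) sum += p.coords[i][0];
  return sum;
}

// Pass-by-const-reference: no copy, the function reads the caller's object.
double sum_x_by_ref(const DemoPolytope& p) {
  double sum = 0.0;
  for (int i = 0; i < p.nverts; ++i) sum += p.coords[i][0];
  return sum;
}
```

Calling `sum_x_by_value(p)` on an existing object increments `g_copy_count` by one, while `sum_x_by_ref(p)` leaves it unchanged and returns the same sum. Whether a compiler removes such a copy for a type without an observable copy constructor depends on inlining and the target, which fits the observation above that it was not being optimized out in this case.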