Closed Elabajaba closed 2 years ago
These changes combined give me a repeatable ~3-4% speedup over current main. There's probably another 5% available by rewriting some of the existing for loops as iterator chains, but the early returns make that annoying and nontrivial.
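To illustrate the kind of rewrite mentioned above, here is a minimal sketch of a `for` loop with an early return next to an equivalent iterator chain. The function names and the "first negative value" task are hypothetical and are not taken from the bvh crate; they only show why early returns need a short-circuiting adapter (`position` here) rather than a plain `map`/`fold`.

```rust
/// Hypothetical example (not from the bvh crate): index of the first
/// negative value, written as a loop with an early return.
pub fn first_negative_loop(values: &[f32]) -> Option<usize> {
    for (i, &v) in values.iter().enumerate() {
        if v < 0.0 {
            return Some(i); // early return on the first match
        }
    }
    None
}

/// The same logic as an iterator chain. `position` short-circuits on the
/// first element for which the closure returns true, which mirrors the
/// early return above.
pub fn first_negative_iter(values: &[f32]) -> Option<usize> {
    values.iter().position(|&v| v < 0.0)
}
```

Loops whose early return carries extra state (for example, a running bounding box) do not map onto a single adapter this cleanly, which is presumably why the rewrite is described as nontrivial.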
See the comments in https://github.com/svenstaro/bvh/pull/87 for the f32 asm comparison between std `.min()` and `.max()` and the functions included in this PR.
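For readers without the linked comparison at hand, the following is a rough sketch of what comparison-based replacements for `f32::min` and `f32::max` can look like. The names `fast_min` and `fast_max` are assumptions made for illustration; the functions actually included in this PR may be written differently. The usual trade-off is that `f32::min`/`f32::max` return the non-NaN operand when exactly one input is NaN, while a bare comparison does not guarantee that, which tends to let the compiler emit simpler code.

```rust
/// Hypothetical helper, not necessarily the PR's implementation.
/// Plain comparison: if either operand is NaN the comparison is false
/// and `b` is returned, so NaN handling differs from `f32::min`.
#[inline(always)]
pub fn fast_min(a: f32, b: f32) -> f32 {
    if a < b { a } else { b }
}

/// Hypothetical helper, same caveat as `fast_min`.
#[inline(always)]
pub fn fast_max(a: f32, b: f32) -> f32 {
    if a > b { a } else { b }
}
```

For ordinary (non-NaN) inputs both versions agree with std; they should only be swapped in where NaN inputs are known not to occur or where the differing NaN result is acceptable.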
thanks! seeing the same kind of improvement on my laptop