athas / raytracers

Performance comparison of parallel ray tracing in functional programming languages

Couple of small, quick performance tweaks to reduce allocations #7

Closed (sadiqj closed 4 years ago)

sadiqj commented 4 years ago

Manually defining the vector operations halves the number of allocations.
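For readers unfamiliar with why this matters, here is a minimal sketch of the two styles. The names `vf`, `vec_add_generic`, `vec_add` and `words_allocated` are illustrative assumptions, not necessarily the identifiers used in this PR. The point it shows: without cross-function inlining, calling an unknown `op` through a higher-order helper typically forces each float argument and result to be boxed, while the hand-written version only allocates the result record.

```ocaml
(* Hypothetical 3D vector type; an all-float record is stored unboxed. *)
type vec3 = { x : float; y : float; z : float }

(* Generic style: a higher-order helper specialised by partial application.
   [op] is an unknown function at the call site, so the floats passed to it
   and returned from it are usually boxed on the heap. *)
let vf op a b = { x = op a.x b.x; y = op a.y b.y; z = op a.z b.z }
let vec_add_generic = vf ( +. )

(* Manual style: the arithmetic is written out, so the compiler can keep the
   intermediate floats unboxed and allocate only the result record. *)
let vec_add a b = { x = a.x +. b.x; y = a.y +. b.y; z = a.z +. b.z }

(* Illustrative helper: minor-heap words allocated while running [f].
   The numbers depend on compiler version and flags, so none are quoted. *)
let words_allocated f =
  let before = Gc.minor_words () in
  ignore (Sys.opaque_identity (f ()));
  Gc.minor_words () -. before
```

Comparing `words_allocated (fun () -> vec_add_generic a b)` against `words_allocated (fun () -> vec_add a b)` for two vectors `a` and `b` is one way to check the difference on a given compiler.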

Also removed another allocation in the aabb_hit logic. There's probably still a bit to do there.
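As a rough illustration of the kind of allocation that can hide in a bounding-box test, the sketch below contrasts a per-axis helper that returns a tuple with a version that keeps everything in local floats. This is an assumption about the general pattern, not a reproduction of the `aabb_hit` code in this PR; the types and the names `slab`, `aabb_hit_tuple` and `aabb_hit` are hypothetical.

```ocaml
(* Hypothetical types for the sketch. *)
type vec3 = { x : float; y : float; z : float }
type aabb = { lo : vec3; hi : vec3 }
type ray = { origin : vec3; dir : vec3 }

(* Tuple style: each axis returns a fresh (tmin, tmax) pair, which may be
   heap-allocated if the helper is not inlined. *)
let slab lo hi o d tmin tmax =
  let inv = 1.0 /. d in
  let t0 = (lo -. o) *. inv and t1 = (hi -. o) *. inv in
  let near = if inv < 0.0 then t1 else t0 in
  let far = if inv < 0.0 then t0 else t1 in
  ((if near > tmin then near else tmin), (if far < tmax then far else tmax))

let aabb_hit_tuple b r tmin tmax =
  let tmin, tmax = slab b.lo.x b.hi.x r.origin.x r.dir.x tmin tmax in
  tmax > tmin
  && (let tmin, tmax = slab b.lo.y b.hi.y r.origin.y r.dir.y tmin tmax in
      tmax > tmin
      && (let tmin, tmax = slab b.lo.z b.hi.z r.origin.z r.dir.z tmin tmax in
          tmax > tmin))

(* Manual style: the same slab test written out per axis with local floats,
   so no tuple is built. Whether every intermediate stays unboxed still
   depends on the compiler version and flags. *)
let aabb_hit b r tmin tmax =
  let inv = 1.0 /. r.dir.x in
  let t0 = (b.lo.x -. r.origin.x) *. inv in
  let t1 = (b.hi.x -. r.origin.x) *. inv in
  let near = if inv < 0.0 then t1 else t0 in
  let far = if inv < 0.0 then t0 else t1 in
  let tmin = if near > tmin then near else tmin in
  let tmax = if far < tmax then far else tmax in
  if tmax <= tmin then false
  else
    let inv = 1.0 /. r.dir.y in
    let t0 = (b.lo.y -. r.origin.y) *. inv in
    let t1 = (b.hi.y -. r.origin.y) *. inv in
    let near = if inv < 0.0 then t1 else t0 in
    let far = if inv < 0.0 then t0 else t1 in
    let tmin = if near > tmin then near else tmin in
    let tmax = if far < tmax then far else tmax in
    if tmax <= tmin then false
    else
      let inv = 1.0 /. r.dir.z in
      let t0 = (b.lo.z -. r.origin.z) *. inv in
      let t1 = (b.hi.z -. r.origin.z) *. inv in
      let near = if inv < 0.0 then t1 else t0 in
      let far = if inv < 0.0 then t0 else t1 in
      let tmin = if near > tmin then near else tmin in
      let tmax = if far < tmax then far else tmax in
      tmax > tmin
```

The manual version is more repetitive, which is the trade-off being discussed in this thread: the duplication buys fewer allocations on compilers that do not inline the helper.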

These tweaks give about a 40% speedup on my quad-core i7-4770K (when run with --cores 4).

athas commented 4 years ago

It's tragic that changes like this make a difference.

kayceesrk commented 4 years ago

Curious to see how much this improves OCaml performance, @athas. OCaml has a new optimizing pass, flambda, which should make many of these handwritten optimizations unnecessary.

athas commented 4 years ago

I'll post some new numbers later today. Got distracted by other computer things.