bepu / bepuphysics1

Pure C# 3D real time physics simulation library. Repo contains only the 1.X.X versions.
http://www.bepuphysics.com
Apache License 2.0

Faster Way to Raycast for 5 million rays #2

Closed matangpanchal closed 6 years ago

matangpanchal commented 6 years ago

Dear Sir, I am looking for a faster way to cast rays with one origin and direction against a triangle mesh in the way. In my scenario there are about 4 to 5 million rays. The RayCast method of StaticMesh already performs much better than the WPF hit test method: for around 5 million rays, the WPF hit test takes more than 1 minute, whereas the RayCast method does the same in 10 seconds. 👍 I tried the RayCast method in a parallel foreach loop, and it completes in approximately 3.4 seconds. That is a good improvement over the default, but I am interested in knowing any other approach that can boost performance further. Please let me know if any such approach exists.

Regards, Matang

RossNordby commented 6 years ago

It's possible to do much, much better than what bepuphysics v1 exposes for ray casts. With a very high quality implementation on sufficient hardware, 500 million coherent rays per second is achievable, and billions is possible with some implementation/hardware combos.

Getting to those numbers does typically require things that aren't exactly CPUs, though. Many high performance raytracers are implemented on GPUs, and some are built with more exotic architectures in mind (like Xeon Phi).

That said, there are still massive implementation improvements compared to bepuphysics v1 available on a regular CPU. This would be composed of three big parts:

  1. Better tree quality. v1's tree builder was designed for build performance and is pretty bad across the board. As measured by SAH, its trees tend to be somewhere between 1.3 and 1.6 times worse than a simple sweep builder. (v2 uses an incrementally refining builder which can approach a sweep builder in quality, and I'm planning a revamp which should allow it to often beat an up-front sweep builder.)
  2. Better cache coherence. v1's tree is absolutely terrible for this: it's composed of heap allocated nodes padded with unnecessary data, so pretty much every step of a traversal will result in a cache miss. While this becomes less horrible in bulk ray tests against smaller datasets that fit into L2, it's still far from ideal. v2 uses a far nicer layout (and is slated to be improved significantly still).
  3. SIMD enabled batch traversals. It's possible to batch tests together to make use of SIMD widths greater than the 3 components of 3D bounding boxes, and this can have massive benefits when testing large numbers of rays at the same time. It both reduces the ALU time via wider operations and explicitly amortizes the cost of loading a given node from memory across many ray tests.
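
To make the third point concrete, here is a minimal sketch of a batched ray versus bounding box test using `System.Numerics.Vector<float>`. It is not taken from bepuphysics v1 or v2; `RayBundle` and `BatchedRayBoxTest` are hypothetical names invented for illustration. One lane holds one ray, so a single box is loaded once and tested against `Vector<float>.Count` rays with the standard slab test.

```csharp
using System;
using System.Numerics;

// Hypothetical structure-of-arrays ray bundle (not a bepuphysics type).
// Each Vector<float> lane belongs to a different ray.
public struct RayBundle
{
    public Vector<float> OriginX, OriginY, OriginZ;
    // Precomputed 1 / direction per component; a zero direction component becomes +/- infinity.
    public Vector<float> InverseDirectionX, InverseDirectionY, InverseDirectionZ;
    // Maximum ray parameter t to consider, per ray.
    public Vector<float> MaximumT;
}

public static class BatchedRayBoxTest
{
    // Slab test of Vector<float>.Count rays against one axis aligned box.
    // Returns a mask: -1 (all bits set) in lanes whose ray overlaps the box, 0 otherwise.
    // Assumption: no special NaN handling for rays lying exactly on a slab plane
    // with a zero direction component along that axis.
    public static Vector<int> Intersects(in RayBundle rays, Vector3 min, Vector3 max)
    {
        // Broadcast each box coordinate across all lanes, so the node data
        // is read once and shared by every ray in the bundle.
        var tx0 = (new Vector<float>(min.X) - rays.OriginX) * rays.InverseDirectionX;
        var tx1 = (new Vector<float>(max.X) - rays.OriginX) * rays.InverseDirectionX;
        var ty0 = (new Vector<float>(min.Y) - rays.OriginY) * rays.InverseDirectionY;
        var ty1 = (new Vector<float>(max.Y) - rays.OriginY) * rays.InverseDirectionY;
        var tz0 = (new Vector<float>(min.Z) - rays.OriginZ) * rays.InverseDirectionZ;
        var tz1 = (new Vector<float>(max.Z) - rays.OriginZ) * rays.InverseDirectionZ;

        // Latest entry across the three slabs, clamped to the ray start (t = 0).
        var tEnter = Vector.Max(
            Vector.Max(Vector.Min(tx0, tx1), Vector.Min(ty0, ty1)),
            Vector.Max(Vector.Min(tz0, tz1), Vector<float>.Zero));

        // Earliest exit across the three slabs, clamped to the ray's maximum length.
        var tExit = Vector.Min(
            Vector.Min(Vector.Max(tx0, tx1), Vector.Max(ty0, ty1)),
            Vector.Min(Vector.Max(tz0, tz1), rays.MaximumT));

        return Vector.LessThanOrEqual(tEnter, tExit);
    }
}
```

In a full traversal, the returned mask would decide which rays in the bundle descend into a node's children. The benefit depends on the rays being coherent enough that most lanes take the same path through the tree.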

Taken together, I think it should be possible for v2 to reach 10 to 100 million coherent rays per second on a consumer level machine depending on the difficulty of the scene and the underlying CPU.

So, the easiest thing to do to get better performance would be 'use v2 once it's ready', but getting this kind of raytracing work in will take a few more months. If you only care about CPU raytracing performance, you might want to look at Intel's Embree, which already does everything I've mentioned very well.

matangpanchal commented 6 years ago

Dear Sir, I am very glad to receive such an extensive explanation and guidance. Thanks a lot for it.

Actually, I am aware of the OptiX library from Nvidia, which is a very powerful GPU-based library for raytracing. I am also using it in one of my projects, but in the project mentioned above I am not able to integrate it with .NET applications developed in C# or VB.NET. If it is possible to use it from C# or VB.NET and you are aware of a resource for doing so, I request you to share it.

Thanks for suggesting Embree, I will definitely explore it.

Apart from ray casting, I am also using BEPU for inverse kinematics, which is very helpful too. Thanks for making such a library.

RossNordby commented 6 years ago

Depending on the target platforms, writing a C# wrapper for the C API should work (using DllImport, extern and so on), but I don't know of any existing wrappers.
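
As a rough illustration of what such a wrapper looks like, here is a minimal sketch that binds two functions from Embree's C API, `rtcNewDevice` and `rtcReleaseDevice`, and wraps the device handle in a disposable class. Embree is used here only because its C entry points are simple; the same pattern would apply to any C API. The native library name `embree3` and the `EmbreeDevice` class are assumptions for illustration and would need to match whatever native library is actually deployed with the application.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical managed wrapper around a native Embree device handle.
public sealed class EmbreeDevice : IDisposable
{
    // Assumption: the native library is named embree3 on the target platform.
    private const string Library = "embree3";

    // C signature: RTCDevice rtcNewDevice(const char* config);
    [DllImport(Library, CallingConvention = CallingConvention.Cdecl)]
    private static extern IntPtr rtcNewDevice([MarshalAs(UnmanagedType.LPStr)] string config);

    // C signature: void rtcReleaseDevice(RTCDevice device);
    [DllImport(Library, CallingConvention = CallingConvention.Cdecl)]
    private static extern void rtcReleaseDevice(IntPtr device);

    private IntPtr handle;

    // A null config string is marshaled as a NULL pointer, which requests default settings.
    public EmbreeDevice(string config = null)
    {
        handle = rtcNewDevice(config);
        if (handle == IntPtr.Zero)
            throw new InvalidOperationException("rtcNewDevice returned NULL.");
    }

    // Exposed so further bindings (scenes, geometry, queries) can pass the handle along.
    public IntPtr Handle => handle;

    public void Dispose()
    {
        if (handle != IntPtr.Zero)
        {
            rtcReleaseDevice(handle);
            handle = IntPtr.Zero;
        }
    }
}
```

Usage would be along the lines of `using (var device = new EmbreeDevice()) { ... }`. The native library is only loaded on the first call, so a missing or mismatched binary shows up as a `DllNotFoundException` or `EntryPointNotFoundException` at that point rather than at compile time.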

matangpanchal commented 6 years ago

OK, thank you very much.