RenderKit / embree

Embree ray tracing kernels repository.
Apache License 2.0

Blockers for NVIDIA acceleration? #440

Closed: dsvensson closed this issue 1 year ago

dsvensson commented 1 year ago

Jumped into the deep end and built llvm-sycl with NVIDIA support yesterday, updated the compiler flags of Embree to target nvptx64-nvidia-cuda etc., only to end up failing the build due to a missing intel_get_rt_stack. I was hoping SYCL would provide all the hardware abstractions. As I have no idea what I'm doing, I just wanted to ask whether Embree lacks a big chunk of what is needed for NVIDIA to work, or whether it's within reach to get it working, and what the possible steps to achieve this would be.

svenwoop commented 1 year ago

Embree uses an Intel extension to access the ray tracing hardware, so you cannot just compile that code for NVIDIA hardware. Enabling NVIDIA is therefore definitely not within reach and would require a significant amount of work.

dsvensson commented 1 year ago

Is the Intel extension being pushed more broadly across other vendors, so that there is some hope of this happening over the years? Where can I read more about this extension, or is it intentionally tied to Arc hardware forever?

I'm also curious about the performance of Arc acceleration for Embree, even if you mention that it hasn't reached its full potential. Perhaps it's worth picking up an extra card. I'm currently spreading the work across 32 threads on an AMD 5950X; would acceleration on a single A770 be a significant improvement?

Is any of the compatible Intel Arc hardware available in a cloud environment for trying out? The Intel Developer Cloud seems to be aimed at big corporations rather than hobby projects, unless I'm mistaken.

svenwoop commented 1 year ago

That extension will only be supported on Intel GPUs. You will see a significant speedup going from an 8-core CPU to an A770 when doing ray tracing. Intel Developer Cloud is the way to go to experiment with Intel Arc.

tbhunderbird commented 1 year ago

Hi Sven, could you please elaborate on which parts of Embree are optimized with SYCL? Is it everything apart from this "intel_get_rt_stack" (Intel's ray tracing extension?)? And what is this Intel ray tracing extension? Is it also available standalone? All the best, Tobias