OptiX inlines all calls to optixTrace at module compilation time. Due to OSL's lazy layer evaluation, if layer A calls layer B M times, and layer B calls layer C N times, that can lead to M*N inlines of optixTrace if layer C contains a trace operation.
In practice we've observed single trace ops being inlined hundreds of times, leading to minutes-long shader compilations.
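To make the multiplication concrete, here is a minimal sketch of the counting argument. It is a toy model, not OSL or OptiX code: it assumes a linear chain of layers in which each layer contains a fixed number of call sites into the next one and the last layer holds a single trace op. The function name `inlined_trace_copies` is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model (not OSL/OptiX code): calls_into_next[i] is the number of call
// sites that layer i contains into layer i+1. The final layer in the chain is
// assumed to contain exactly one trace op. If every call site is inlined, the
// trace op is duplicated once per path through the chain, so the copy count
// is the product of the call-site counts.
std::uint64_t inlined_trace_copies(const std::vector<std::uint64_t>& calls_into_next)
{
    std::uint64_t copies = 1;  // the single trace op in the last layer
    for (std::uint64_t sites : calls_into_next) {
        assert(sites > 0 && "each layer is assumed to call the next at least once");
        copies *= sites;
    }
    return copies;
}
```

With M = 3 and N = 4, `inlined_trace_copies({3, 4})` gives 12 copies. Adding one more level of layering multiplies the count again, which is how a single trace op can end up inlined hundreds of times.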
This patch adds a new option, lazytrace, that controls whether layers with trace ops are evaluated lazily. With lazytrace=0, those layers run unconditionally at the start of shader evaluation. This can cost some performance when the trace layer would otherwise never have been evaluated, but it removes the compilation penalty caused by the repeated inlining.
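The trade-off can be illustrated with a small scheduling sketch. This is a toy model under stated assumptions, not the OSL implementation: the `Layer` struct, `run_layer`, and `evaluate` are hypothetical names, and "running" a layer only records its name. It shows lazy on-demand execution versus running every trace layer once up front.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model of a shader group (hypothetical types, not OSL's).
struct Layer {
    std::string name;
    bool has_trace;             // layer contains a trace op
    std::vector<int> upstream;  // layers this layer pulls from when it runs
};

// Run one layer on demand: pull its upstream layers first, then record it.
// A layer that has already run is skipped.
void run_layer(const std::vector<Layer>& layers, int index,
               std::vector<char>& done, std::vector<std::string>& order)
{
    assert(index >= 0 && index < static_cast<int>(layers.size()));
    if (done[index])
        return;
    done[index] = 1;
    for (int up : layers[index].upstream)
        run_layer(layers, up, done, order);
    order.push_back(layers[index].name);
}

// Evaluate the group starting from the root layer and return the execution
// order. With lazytrace == false, every trace layer runs unconditionally
// before the root, so later requests for it find it already done.
std::vector<std::string> evaluate(const std::vector<Layer>& layers, int root,
                                  bool lazytrace)
{
    std::vector<char> done(layers.size(), 0);
    std::vector<std::string> order;
    if (!lazytrace) {
        for (int i = 0; i < static_cast<int>(layers.size()); ++i)
            if (layers[i].has_trace)
                run_layer(layers, i, done, order);
    }
    run_layer(layers, root, done, order);
    return order;
}
```

For a group where the root pulls only one of two trace layers, the lazy order skips the unused trace layer, while the eager order runs it anyway. That extra run is the potential performance cost; in exchange, each trace layer is entered from a single place rather than from every call site.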
Tests
Added a new test, lazytrace, that uses printf output to verify that the non-lazy execution ordering rules are followed when lazytrace=0.
[x] I have updated the documentation, if applicable.
[x] I have ensured that the change is tested somewhere in the testsuite (adding new test cases if necessary).
[x] My code follows the prevailing code style of this project. If I haven't
already run clang-format v17 before submitting, I definitely will look at
the CI test that runs clang-format and fix anything that it highlights as
being nonconforming.