This PR also contains (reverted) commits from an exploration of direct execution. The bottom line, however, is that they did not produce any noticeable performance difference.
As a result, this PR contributes only the minor changes that came up during the exploration, including a convenience function `DispatchKernel::execute` that by default calls `DispatchKernel::compile` and then immediately calls the resulting kernel.
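For illustration only, here is a minimal sketch of what such a default could look like. It is not the code from this PR: the language (Rust), the trait shape, the `i64 -> i64` kernel signature, and the `Scale` type are all assumptions made for the example. Only the names `DispatchKernel`, `compile`, and `execute` come from the description above.

```rust
// Hypothetical sketch: everything except the names `DispatchKernel`,
// `compile`, and `execute` is assumed for illustration.
trait DispatchKernel {
    /// Builds a callable kernel. In this sketch a kernel maps i64 -> i64.
    fn compile(&self) -> Box<dyn Fn(i64) -> i64>;

    /// Default behaviour: compile, then immediately call the resulting kernel.
    /// Implementors can override this if they have a cheaper path.
    fn execute(&self, input: i64) -> i64 {
        let kernel = self.compile();
        kernel(input)
    }
}

/// Hypothetical implementor that multiplies its input by a fixed factor.
struct Scale {
    factor: i64,
}

impl DispatchKernel for Scale {
    fn compile(&self) -> Box<dyn Fn(i64) -> i64> {
        // Copy the factor so the closure does not borrow `self`.
        let factor = self.factor;
        Box::new(move |x| x * factor)
    }
}
```

With this shape, a caller that does not need to reuse the compiled kernel can write `Scale { factor: 3 }.execute(14)` instead of calling `compile` and invoking the result by hand.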