Closed MilesCranmer closed 9 months ago
This PR adds native GPU support. This is a single CUDA kernel which evaluates an expression directly on the GPU!
TODO:
- Check whether `CUDA.@captured` helps at all
- `@sync`
- `Optim` support now or later
It works by launching a CUDA kernel to evaluate many independent nodes of a tree at once, from the leaves upwards.
It also lets you evaluate multiple trees at once – which are dispatched to the GPU during the same kernel launches!
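To make the evaluation order concrete, here is a rough CPU-only sketch in plain Julia of the leaves-upwards, level-by-level idea, with two trees sharing the same sweeps. It is not the kernel from this PR: `FlatNode`, `eval_levels`, the builder helpers, and the flat buffer layout are all hypothetical names made up for illustration, and the inner loop over nodes only stands in for the work a GPU launch would do in parallel.

```julia
# Hypothetical flat node layout (an assumption, not this PR's data structure).
struct FlatNode
    kind::Symbol    # :feature, :const, :unary, or :binary
    op::Function    # operator for :unary / :binary nodes (identity for leaves)
    feature::Int    # column of X for :feature leaves (0 otherwise)
    value::Float64  # value for :const leaves
    left::Int       # index of the left child in the node list (0 if none)
    right::Int      # index of the right child in the node list (0 if none)
    level::Int      # 0 for leaves, 1 + max(child levels) otherwise
end

# Hypothetical builders; children must already be in `nodes`.
leaf_feature(j) = FlatNode(:feature, identity, j, 0.0, 0, 0, 0)
leaf_const(c) = FlatNode(:const, identity, 0, c, 0, 0, 0)
unary(op, nodes, l) = FlatNode(:unary, op, 0, 0.0, l, 0, nodes[l].level + 1)
binary(op, nodes, l, r) =
    FlatNode(:binary, op, 0, 0.0, l, r, max(nodes[l].level, nodes[r].level) + 1)

function eval_levels(nodes::Vector{FlatNode}, X::Matrix{Float64})
    buffer = zeros(size(X, 1), length(nodes))  # one result column per node
    for level in 0:maximum(n.level for n in nodes)
        # Nodes on the same level only read from lower levels, so they are
        # independent of each other, even when they belong to different trees.
        for (i, n) in enumerate(nodes)
            n.level == level || continue
            if n.kind == :feature
                buffer[:, i] .= X[:, n.feature]
            elseif n.kind == :const
                buffer[:, i] .= n.value
            elseif n.kind == :unary
                buffer[:, i] .= n.op.(buffer[:, n.left])
            else
                buffer[:, i] .= n.op.(buffer[:, n.left], buffer[:, n.right])
            end
        end
    end
    return buffer
end

# Two trees stored in one node list: cos(x1) + 2.0 and x1 * x2.
nodes = FlatNode[]
push!(nodes, leaf_feature(1))          # 1: x1
push!(nodes, unary(cos, nodes, 1))     # 2: cos(x1)
push!(nodes, leaf_const(2.0))          # 3: 2.0
push!(nodes, binary(+, nodes, 2, 3))   # 4: root of the first tree
push!(nodes, leaf_feature(1))          # 5: x1
push!(nodes, leaf_feature(2))          # 6: x2
push!(nodes, binary(*, nodes, 5, 6))   # 7: root of the second tree
roots = [4, 7]

X = [0.0 1.0; 1.0 2.0; 2.0 3.0]        # rows are samples, columns are features
out = eval_levels(nodes, X)[:, roots]  # one output column per tree
```

In this sketch the sweep for level 1 handles `cos(x1)` from the first tree and `x1 * x2` from the second together, which is the sense in which several trees can share the same launches.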