Closed luraess closed 2 months ago
Reopened, since the foreseen GPU optimizations should also make the use of LoopVectorization feasible with little or no divergence between the CPU and GPU code-generation approaches.
LoopVectorization's future is uncertain; instead, code generation for Polyester has been enabled.
Something to consider as an alternative or supplement to the current `Threads.@threads` option. The `@tturbo` macro, exposed by the LoopVectorization package, allows for threaded AVX instructions. See https://github.com/luraess/parallel-gpu-workshop-JuliaCon21#parallel-cpu-implementation for an example. There may be some restrictions on handling `if` conditions inside the loop.
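
To make the comparison concrete, here is a minimal sketch of the same loop written with `Threads.@threads` and with `@tturbo`. It assumes LoopVectorization is installed. The function names and the 1D stencil are hypothetical and chosen only for illustration; they are not taken from this package or from the linked workshop. The last function shows one possible way around the `if` restriction, using `ifelse` in place of a branch.

```julia
using LoopVectorization  # assumed installed; provides @tturbo

# Baseline: 1D diffusion-like update, threaded with Base's Threads.@threads.
# (Hypothetical example kernel, not part of this package.)
function update_threads!(T2, T, c)
    n = length(T)
    Threads.@threads for i in 2:n-1
        @inbounds T2[i] = T[i] + c * (T[i-1] - 2 * T[i] + T[i+1])
    end
    return T2
end

# Same loop with @tturbo, which combines threading with SIMD vectorization.
# Note that @tturbo does not perform bounds checks.
function update_tturbo!(T2, T, c)
    n = length(T)
    @tturbo for i in 2:n-1
        T2[i] = T[i] + c * (T[i-1] - 2 * T[i] + T[i+1])
    end
    return T2
end

# A condition expressed with ifelse instead of an if block,
# so that the loop body stays branch-free.
function clamp_tturbo!(T2, T)
    @tturbo for i in eachindex(T)
        T2[i] = ifelse(T[i] > 0.0, T[i], 0.0)
    end
    return T2
end
```

Both update functions write only the interior points, so the output array should be initialized (for example with `copy(T)`) before the call. Starting Julia with several threads (for example `julia -t 4`) is needed for either threaded variant to run in parallel.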