Closed maddyscientist closed 1 year ago
This optimization improves the performance of `declare_strided_gather`. With the `nvc` compiler (NVIDIA's C compiler), it doubles performance.