PaddleJitLab/CUDATutorial
A self-learning tutorial for CUDA High Performance Programming.
Apache License 2.0
[Doc] Add Reduce Optimize Method: Interleaved Addressing #7
Closed
AndSonder closed 5 months ago
AndSonder commented 6 months ago
Add Reduce Interleaved Addressing optimization
| Optimization | Runtime (us) | Bandwidth | Speedup |
| --- | --- | --- | --- |
| Baseline | 3118.4 | 42.503 GB/s | ~ |
| Interleaved addressing | 1904.4 | 73.522 GB/s | 1.64 |
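For readers who have not seen the technique, here is a minimal sketch of what an interleaved-addressing reduce kernel can look like. It assumes the PR follows the common strided-index form, where the divergent test `tid % (2 * s) == 0` of a naive baseline is replaced by `index = 2 * s * tid`. The PR's actual kernel is not shown on this page, so the names `reduce_interleaved`, `reduce_sum`, and `kBlockSize` are hypothetical, and the block size of 256 is an assumption.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Assumed block size (not taken from the PR); must be a power of two.
constexpr unsigned int kBlockSize = 256;

// Each block reduces up to kBlockSize inputs to one partial sum.
// Must be launched with blockDim.x == kBlockSize.
__global__ void reduce_interleaved(const float* in, float* out, unsigned int n) {
    __shared__ float sdata[kBlockSize];
    unsigned int tid = threadIdx.x;
    unsigned int i = blockIdx.x * blockDim.x + threadIdx.x;

    // Load one element per thread; pad the tail of the last block with zeros.
    sdata[tid] = (i < n) ? in[i] : 0.0f;
    __syncthreads();

    for (unsigned int s = 1; s < blockDim.x; s *= 2) {
        // Strided index: the working threads are always tid = 0, 1, 2, ...,
        // so they stay contiguous instead of being scattered across warps
        // as with the test `tid % (2 * s) == 0`.
        unsigned int index = 2 * s * tid;
        if (index < blockDim.x) {
            sdata[index] += sdata[index + s];
        }
        __syncthreads();
    }

    // Thread 0 writes this block's partial sum.
    if (tid == 0) {
        out[blockIdx.x] = sdata[0];
    }
}

// Hypothetical host wrapper: one kernel pass, then the per-block partial
// sums are added on the host. Error checking is omitted for brevity.
float reduce_sum(const std::vector<float>& host_in) {
    const unsigned int n = static_cast<unsigned int>(host_in.size());
    if (n == 0) return 0.0f;
    const unsigned int blocks = (n + kBlockSize - 1) / kBlockSize;

    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, blocks * sizeof(float));
    cudaMemcpy(d_in, host_in.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    reduce_interleaved<<<blocks, kBlockSize>>>(d_in, d_out, n);

    std::vector<float> partial(blocks);
    cudaMemcpy(partial.data(), d_out, blocks * sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    float total = 0.0f;
    for (float p : partial) total += p;
    return total;
}
```

Calling `reduce_sum(std::vector<float>(1000, 1.0f))` launches four blocks and adds their partial sums on the host. This sketch only illustrates the indexing pattern; its timings will not necessarily match the numbers reported in the PR.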