pytorch / tvm

TVM integration into PyTorch

Optimized sched for layer norm #119

Closed: kimishpatel closed this 5 years ago

kimishpatel commented 5 years ago

Added a custom schedule that parallelizes the reductions in the max and stddev calculations.
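For readers unfamiliar with the idea, here is a minimal pure-Python sketch of a parallel reduction for max and stddev. It is not the TVM schedule from this PR: the names partial_stats and parallel_max_stddev, the chunk count, and the use of a thread pool are all assumptions chosen for illustration. Each worker reduces one slice of the row, and the partial results are then combined.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_stats(chunk):
    # One worker's share of the reduction: local max, sum, and sum of squares.
    return max(chunk), sum(chunk), sum(x * x for x in chunk)


def parallel_max_stddev(row, num_chunks=4):
    # Hypothetical helper: returns (max, population stddev) of a non-empty row.
    if not row:
        raise ValueError("row must not be empty")
    size = math.ceil(len(row) / num_chunks)
    chunks = [row[i:i + size] for i in range(0, len(row), size)]

    # Reduce each chunk independently, then combine the partial results.
    with ThreadPoolExecutor(max_workers=num_chunks) as pool:
        partials = list(pool.map(partial_stats, chunks))

    row_max = max(p[0] for p in partials)
    total = sum(p[1] for p in partials)
    total_sq = sum(p[2] for p in partials)

    mean = total / len(row)
    # E[x^2] - E[x]^2, clamped at zero to absorb floating-point rounding.
    variance = max(total_sq / len(row) - mean * mean, 0.0)
    return row_max, math.sqrt(variance)


if __name__ == "__main__":
    print(parallel_max_stddev([1.0, 2.0, 3.0, 4.0]))
```

In a compiled schedule the same split would typically map chunks to threads or vector lanes; the thread pool here only shows the structure of the computation, not its performance.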

kimishpatel commented 5 years ago

@yinghai here are the results.
{'[20, 1, 1024][1024][1024]': {'JIT': 0.07628417015075684, 'TVM': 0.042891740798950195},
 '[21, 1, 1024][1024][1024]': {'JIT': 0.07934427261352539, 'TVM': 0.046388864517211914},
 '[22, 1, 1024][1024][1024]': {'JIT': 0.08039498329162598, 'TVM': 0.0460963249206543},
 '[23, 1, 1024][1024][1024]': {'JIT': 0.08323144912719727, 'TVM': 0.04950523376464844},
 '[24, 1, 1024][1024][1024]': {'JIT': 0.08355355262756348, 'TVM': 0.05075263977050781},
 '[25, 1, 1024][1024][1024]': {'JIT': 0.08996939659118652, 'TVM': 0.053050994873046875},
 '[26, 1, 1024][1024][1024]': {'JIT': 0.09070277214050293, 'TVM': 0.054137468338012695}}
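To make the comparison easier to read, the JIT/TVM ratio per shape can be computed from the numbers above. This is a small sketch; the helper name speedups is hypothetical, and because the thread does not state the timing units, only the unit-independent ratio is reported.

```python
# Timings copied verbatim from the results posted in this thread.
results = {
    '[20, 1, 1024][1024][1024]': {'JIT': 0.07628417015075684, 'TVM': 0.042891740798950195},
    '[21, 1, 1024][1024][1024]': {'JIT': 0.07934427261352539, 'TVM': 0.046388864517211914},
    '[22, 1, 1024][1024][1024]': {'JIT': 0.08039498329162598, 'TVM': 0.0460963249206543},
    '[23, 1, 1024][1024][1024]': {'JIT': 0.08323144912719727, 'TVM': 0.04950523376464844},
    '[24, 1, 1024][1024][1024]': {'JIT': 0.08355355262756348, 'TVM': 0.05075263977050781},
    '[25, 1, 1024][1024][1024]': {'JIT': 0.08996939659118652, 'TVM': 0.053050994873046875},
    '[26, 1, 1024][1024][1024]': {'JIT': 0.09070277214050293, 'TVM': 0.054137468338012695},
}


def speedups(timings):
    # Ratio of JIT time to TVM time; values above 1 mean TVM was faster.
    return {shape: t['JIT'] / t['TVM'] for shape, t in timings.items()}


if __name__ == "__main__":
    for shape, ratio in speedups(results).items():
        print(f"{shape}: {ratio:.2f}x")
```

For every shape listed, the ratio falls between 1.6 and 1.8, so the TVM path is consistently faster than JIT in these measurements.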

kimishpatel commented 5 years ago

@yinghai, can you approve if it looks ok?