Closed xiezipeng-ML closed 2 years ago
Measure the performance gain from Fuse_Multi_Head_Attention
oneflow branch: python3 -m pip install --pre oneflow -f https://staging.oneflow.info/branch/release/mt5_opt/cu112
Corresponding oneflow commit: 2d080aa
libai branch: use_fuse_multi_head_att
In projects/T5/configs/t5_model_config.py, measure the throughput difference between model.cfg.scale_mask_softmax_fusion = False and model.cfg.scale_mask_softmax_fusion = True.
@ouyangyu @chengtbf
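For comparing the two runs, a small timing helper along these lines might be useful. This is only a sketch and not part of libai or oneflow: `measure_throughput`, `relative_gain`, and the `step_fn` callable are hypothetical names, and it assumes `step_fn` runs one full training step and blocks until that step has actually finished on the device.

```python
import time


def measure_throughput(step_fn, batch_size, warmup=5, iters=20):
    """Return samples/sec for step_fn, excluding `warmup` untimed calls.

    step_fn is a hypothetical zero-argument callable that runs one
    training step and returns only after the step has completed.
    """
    for _ in range(warmup):
        step_fn()  # untimed: lets caches, allocators, and kernels settle
    start = time.perf_counter()
    for _ in range(iters):
        step_fn()
    elapsed = time.perf_counter() - start
    return batch_size * iters / elapsed


def relative_gain(baseline, fused):
    """Gain of the fused run over the baseline run, in percent."""
    return (fused - baseline) / baseline * 100.0
```

The idea would be to call `measure_throughput` once with scale_mask_softmax_fusion = False and once with it set to True, keeping the batch size and iteration count identical, and then report `relative_gain(baseline, fused)` alongside the two raw samples/sec numbers.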