Open Yejing-Lai opened 7 months ago
This PR applies the apply_layernorm_1p flag. When the flag is set to True, the layernorm scale must be computed as layernorm.weight + 1.
Hi @tjruwase, please review. This PR fixes the layernorm accuracy issue on non-CUDA accelerators. Thanks!
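For reviewers, here is a minimal pure-Python sketch of what the flag is meant to do. It is not the code in this PR: the `layer_norm` function, its signature, and the `eps` default are hypothetical and only illustrate the `weight + 1` adjustment on a single feature vector.

```python
import math


def layer_norm(x, weight, bias, eps=1e-5, apply_layernorm_1p=False):
    """Normalize one feature vector x, then scale and shift it.

    Hypothetical helper for illustration only, not the PR's implementation.
    """
    # With apply_layernorm_1p=True the stored weight is treated as an
    # offset from 1, so the effective scale is weight + 1.
    if apply_layernorm_1p:
        scale = [w + 1.0 for w in weight]
    else:
        scale = list(weight)

    # Standard layer normalization over the feature dimension
    # (biased variance, as is conventional for layernorm).
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)

    return [(v - mean) * inv_std * s + b for v, s, b in zip(x, scale, bias)]
```

If a code path skips the `+ 1`, a stored weight near zero scales the normalized activations to almost nothing, which would show up as an accuracy problem like the one described above.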