Closed: beamind closed this issue 4 years ago
According to Section 3.1 of the paper, the hidden_act of the ALBERT model is GELU. But according to the albert_config.json file from albert_xxlarge_zh.tar.gz, the hidden_act is ReLU. Which is correct? Thanks!
We use ReLU for the Chinese version and GELU for the English version.
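For anyone who wants to check which activation a given checkpoint expects, here is a minimal sketch that reads the `hidden_act` field from a config and maps it to a function. The two JSON strings are made-up one-field stand-ins, not the real contents of albert_config.json, and the names `ACTIVATIONS` and `activation_from_config` are hypothetical helpers, not part of the ALBERT codebase. The `gelu` here is the exact erf-based form; a given implementation may use a tanh approximation instead.

```python
import json
import math


def relu(x: float) -> float:
    """ReLU: pass positive values through, clamp negatives to zero."""
    return max(0.0, x)


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Hypothetical lookup table from the config string to an activation function.
ACTIVATIONS = {"relu": relu, "gelu": gelu}


def activation_from_config(config_text: str):
    """Parse a config JSON string and return (name, function) for hidden_act."""
    config = json.loads(config_text)
    name = config["hidden_act"].lower()  # tolerate "RELU" / "GELU" spellings
    return name, ACTIVATIONS[name]


# Assumed minimal config fragments for illustration only.
zh_config = '{"hidden_act": "relu"}'  # stand-in for the Chinese model config
en_config = '{"hidden_act": "gelu"}'  # stand-in for the English model config

zh_name, zh_fn = activation_from_config(zh_config)
en_name, en_fn = activation_from_config(en_config)

print(zh_name, zh_fn(-1.0))
print(en_name, en_fn(-1.0))
```

The two functions differ for negative inputs: ReLU returns exactly zero, while GELU returns a small negative value, so loading a checkpoint with the wrong `hidden_act` would change the model's outputs.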