bojone / t5_in_bert4keras

A summary of the key points for using T5 models in Keras
171 stars 28 forks

Why are there two hidden_act entries in config.json? #11

Open josephcui opened 2 years ago

josephcui commented 2 years ago

{
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 2048,
  "num_attention_heads": 12,
  "attention_head_size": 64,
  "num_hidden_layers": 12,
  "vocab_size": 250112,
  "hidden_act": ["gelu", "linear"]
}

bojone commented 2 years ago

Because mT5 really does have two hidden_act values. Please read the blog post introduction carefully.

xv44586 commented 2 years ago

The README has a mistake: the hidden_act key is written twice. Just delete the first one, "hidden_act": "gelu".
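
For anyone wondering what the duplicated key does in practice, here is a minimal sketch, assuming the config is read with Python's standard json module. The names CONFIG_TEXT and find_duplicate_keys are hypothetical and used only for illustration; they are not part of bert4keras. By default, json.loads does not raise on repeated keys and keeps only the last value, so the list form of hidden_act is what a loader would see.

```python
import json
from collections import Counter

# The config from this issue, with the duplicated "hidden_act" key left in.
CONFIG_TEXT = """
{
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 2048,
  "num_attention_heads": 12,
  "attention_head_size": 64,
  "num_hidden_layers": 12,
  "vocab_size": 250112,
  "hidden_act": ["gelu", "linear"]
}
"""


def find_duplicate_keys(text):
    """Return keys that appear more than once within a single JSON object."""
    duplicates = []

    def record_pairs(pairs):
        # object_pairs_hook receives every (key, value) pair of an object,
        # including repeated keys, before they are collapsed into a dict.
        counts = Counter(key for key, _ in pairs)
        duplicates.extend(key for key, n in counts.items() if n > 1)
        return dict(pairs)

    json.loads(text, object_pairs_hook=record_pairs)
    return duplicates


# Default behaviour: the last occurrence of a repeated key wins.
config = json.loads(CONFIG_TEXT)
print(config["hidden_act"])              # ['gelu', 'linear']
print(find_duplicate_keys(CONFIG_TEXT))  # ['hidden_act']
```

Under that assumption the duplicated line is harmless at load time, but removing it as suggested above avoids the confusion.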