josephcui opened this issue 2 years ago (status: Open)
{ "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 2048, "num_attention_heads": 12, "attention_head_size": 64, "num_hidden_layers": 12, "vocab_size": 250112, "hidden_act": ["gelu", "linear"] }
That is because mT5 really does use two hidden activations (hidden_act). Please read the blog post introduction carefully.
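For context, the two activations `["gelu", "linear"]` correspond to the gated feed-forward block used in mT5 / T5.1.1: one projection is passed through GELU and gates a second, purely linear projection. Below is a minimal NumPy sketch of that idea; the weight names and shapes are illustrative assumptions, not this project's actual code.

```python
import numpy as np

def gelu(x):
    # tanh approximation of GELU, commonly used in T5/mT5 implementations
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))

def gated_ffn(x, w_gelu, w_linear, w_out):
    # The "gelu" branch gates the "linear" branch elementwise, then projects back down.
    return (gelu(x @ w_gelu) * (x @ w_linear)) @ w_out

# Illustrative shapes matching the config above: hidden_size=768, intermediate_size=2048
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 768))                 # a batch of 4 token vectors
w_gelu = rng.standard_normal((768, 2048)) * 0.02
w_linear = rng.standard_normal((768, 2048)) * 0.02
w_out = rng.standard_normal((2048, 768)) * 0.02
print(gated_ffn(x, w_gelu, w_linear, w_out).shape)  # (4, 768)
```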
The README has a typo: the `hidden_act` key is written twice. Just delete the first occurrence.
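As a side note, Python's `json` module keeps the last occurrence when a key is duplicated, so the pasted config already resolves `hidden_act` to the two-activation form; deleting the first line just removes the confusion. A quick check with a trimmed-down config (not the real file):

```python
import json

raw = '{"hidden_act": "gelu", "hidden_size": 768, "hidden_act": ["gelu", "linear"]}'
cfg = json.loads(raw)
print(cfg["hidden_act"])  # ['gelu', 'linear'] -- the last duplicate key wins
```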
{ "hidden_act": "gelu", "hidden_dropout_prob": 0.1, "hidden_size": 768, "initializer_range": 0.02, "intermediate_size": 2048, "num_attention_heads": 12, "attention_head_size": 64, "num_hidden_layers": 12, "vocab_size": 250112, "hidden_act": ["gelu", "linear"] }