rongruosong opened this issue 5 years ago
@rongruosong the model should already be available http://gluon-nlp.mxnet.io/master/model_zoo/bert/index.html why are you converting it yourself?
@leezu could you help take a look for this assertion?
I want to use convert_tf_model.py to convert Chinese-BERT-wwm (from the Joint Laboratory of HIT and iFLYTEK) to Gluon, so I want to know whether this script works.
@rongruosong You can comment out the following:
with open(os.path.join(args.tf_checkpoint_dir, args.tf_config_name), 'r') as f:
    tf_config = json.load(f)
assert len(tf_config) == len(tf_config_names_to_gluon_config_names)
for tf_name, gluon_name in tf_config_names_to_gluon_config_names.items():
    if tf_name is None or gluon_name is None:
        continue
    assert tf_config[tf_name] == predefined_args[gluon_name]
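If you'd rather keep some validation instead of deleting the check outright, here is a minimal sketch of a softer variant. It reuses args, predefined_args and tf_config_names_to_gluon_config_names from convert_tf_model.py itself, so it only makes sense spliced into that script:

import json
import os
import warnings

with open(os.path.join(args.tf_checkpoint_dir, args.tf_config_name), 'r') as f:
    tf_config = json.load(f)

# Validate only the options that map onto a BERTEncoder argument ...
for tf_name, gluon_name in tf_config_names_to_gluon_config_names.items():
    if tf_name is None or gluon_name is None:
        continue
    assert tf_config[tf_name] == predefined_args[gluon_name], tf_name

# ... and warn, rather than assert, about everything else.
extra = sorted(set(tf_config) - set(tf_config_names_to_gluon_config_names))
if extra:
    warnings.warn('ignoring unsupported config options: %s' % extra)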
I tried it: with those lines commented out, the TF model still converts to Gluon, and testing with compare_tf_gluon_model.py gives stdev = 7.2654996e-07.
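(For context, a rough sketch of the kind of check compare_tf_gluon_model.py reports: run both models on the same input and look at the spread of the element-wise difference. The arrays below are hypothetical stand-ins for the real model outputs, not the actual script:

import numpy as np

rng = np.random.default_rng(0)
tf_output = rng.standard_normal((1, 512, 768)).astype('float32')  # stand-in for the TF model output
gluon_output = tf_output + rng.normal(0, 7e-7, tf_output.shape)   # stand-in for the Gluon output

# A stdev around 1e-7, as reported above, means the two models agree
# to roughly float32 precision.
print('stdev =', (tf_output - gluon_output).std())
)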
Hi @rongruosong, the reason for the failure is that the bert_config.json of https://github.com/ymcui/Chinese-BERT-wwm configures some hyperparameters that are currently unsupported by the BERTEncoder API. You'd need to extend the API first. In particular, the bert_config.json you want to use defines
{
"attention_probs_dropout_prob": 0.1,
"directionality": "bidi",
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"max_position_embeddings": 512,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"pooler_fc_size": 768,
"pooler_num_attention_heads": 12,
"pooler_num_fc_layers": 3,
"pooler_size_per_head": 128,
"pooler_type": "first_token_transform",
"type_vocab_size": 2,
"vocab_size": 21128
}
out of which currently only the options appearing as keys in the following dict are supported:
tf_config_names_to_gluon_config_names = {
'attention_probs_dropout_prob': 'embed_dropout',
'hidden_act': None,
'hidden_dropout_prob': 'dropout',
'hidden_size': 'units',
'initializer_range': None,
'intermediate_size': 'hidden_size',
'max_position_embeddings': 'max_length',
'num_attention_heads': 'num_heads',
'num_hidden_layers': 'num_layers',
'type_vocab_size': 'token_type_vocab_size',
'vocab_size': None
}
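To see exactly which config options trip the length assertion, you can diff the two (a quick sketch; point the path at the bert_config.json you downloaded, and tf_config_names_to_gluon_config_names is the dict above):

import json

supported = set(tf_config_names_to_gluon_config_names)

with open('bert_config.json') as f:  # the Chinese-BERT-wwm config
    tf_config = json.load(f)

print(sorted(set(tf_config) - supported))
# ['directionality', 'pooler_fc_size', 'pooler_num_attention_heads',
#  'pooler_num_fc_layers', 'pooler_size_per_head', 'pooler_type']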
Error Message
INFO:root:converting to Gluon checkpoint ...
Traceback (most recent call last):
  File "convert_tf_model.py", line 159, in <module>
    assert len(tf_config) == len(tf_config_names_to_gluon_config_names)
AssertionError