You can replace the init checkpoint to point to your fine-tuned BERT model.
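As a rough sketch, the change might look like this, assuming the fine-tuned checkpoint lives under bert_finetuned_models/NER3 (the directory that appears in the config further down) and reusing the model_utils.init_bert_checkpoint helper shown later in this thread:

import os

# Hypothetical directory holding the fine-tuned BERT-NER checkpoint files
# (bert_model.ckpt.index, bert_model.ckpt.data-*, bert_config.json, vocab.txt).
finetuned_dir = "bert_finetuned_models/NER3"

# Point the init checkpoint at the fine-tuned model instead of the stock BERT one,
# then let the repo's helper map the checkpoint variables onto the graph.
init_checkpoint = os.path.join(finetuned_dir, "bert_model.ckpt")
model_utils.init_bert_checkpoint(init_checkpoint)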
Thanks, that worked out.
As for the processor:
processor = CNNDailymail()
train_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src, max_seq_length_tgt, 4, 'train', "./")
eval_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src, max_seq_length_tgt, 4, 'eval', "./")
test_dataset = get_dataset(processor, tokenizer, "./", max_seq_length_src, max_seq_length_tgt, 4, 'test', "./")
Should I change this to the NER processor?
Yes, if you have separate preprocessing code for NER.
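As a sketch, the swap might look like this, assuming a hypothetical NERProcessor class that exposes the same interface as CNNDailymail:

# Hypothetical NER preprocessing class with the same interface as CNNDailymail().
processor = NERProcessor()

# The get_dataset calls stay the same; only the processor changes.
train_dataset = get_dataset(processor, tokenizer, "./",
                            max_seq_length_src, max_seq_length_tgt, 4, 'train', "./")
eval_dataset = get_dataset(processor, tokenizer, "./",
                           max_seq_length_src, max_seq_length_tgt, 4, 'eval', "./")
test_dataset = get_dataset(processor, tokenizer, "./",
                           max_seq_length_src, max_seq_length_tgt, 4, 'test', "./")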
Everything has worked well since changing the preprocessor to NER.
However, I am now stuck on the following issue in the BERT Encoder Graph cell:
ValueError Traceback (most recent call last)
3 embedder = tx.modules.WordEmbedder(
4 vocab_size=bert_config.vocab_size,
----> 5 hparams=bert_config.embed)
6 word_embeds = embedder(src_input_ids)
Is this related to the values in the JSON file, or am I mistaken?
Yes, check whether the config file is loaded properly.
Please post the code you are using for the encoder and the config here.
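One quick way to check is to load the raw bert_config.json directly and confirm the fields the embedders need are present; a minimal sketch, assuming the config sits next to the checkpoint under /content/bert_pretrained_models/NER3/ (the path used later in the config cell), and noting that this only checks the raw JSON, not the transformed Texar config object:

import json
import os

# Hypothetical location; adjust to wherever the fine-tuned model was saved.
config_path = os.path.join("/content/bert_pretrained_models/NER3", "bert_config.json")

with open(config_path) as f:
    raw_config = json.load(f)

# These fields feed vocab_size / type_vocab_size in the embedders;
# if either prints None, this is not the config file you expect.
print(raw_config.get("vocab_size"), raw_config.get("type_vocab_size"))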
The encoder is below:
# Encoder: BERT model
print("Intializing the Bert Encoder Graph")
with tf.variable_scope('bert'):
    embedder = tx.modules.WordEmbedder(
        vocab_size=bert_config.vocab_size,
        hparams=bert_config.embed)
    word_embeds = embedder(src_input_ids)

    # Creates segment embeddings for each type of token.
    segment_embedder = tx.modules.WordEmbedder(
        vocab_size=bert_config.type_vocab_size,
        hparams=bert_config.segment_embed)
    segment_embeds = segment_embedder(src_segment_ids)
    input_embeds = word_embeds + segment_embeds

    # The BERT model (a TransformerEncoder)
    encoder = tx.modules.TransformerEncoder(hparams=bert_config.encoder)
    encoder_output = encoder(input_embeds, src_input_length)

    # Builds layers for downstream classification, which are also initialized
    # with the BERT pre-trained checkpoint.
    with tf.variable_scope("pooler"):
        # Uses the projection of the first-step hidden vector of the BERT output
        # as the representation of the sentence.
        bert_sent_hidden = tf.squeeze(encoder_output[:, 0:1, :], axis=1)
        bert_sent_output = tf.layers.dense(
            bert_sent_hidden, config_downstream.hidden_dim,
            activation=tf.tanh)
        output = tf.layers.dropout(
            bert_sent_output, rate=0.1, training=tx.global_mode_train())

print("loading the bert pretrained weights")
# Loads the pretrained BERT model parameters
init_checkpoint = os.path.join(bert_finetuned_models + model, 'bert_model.ckpt')
# init_checkpoint = "gs://cloud-tpu-checkpoints/bert/uncased_L-12_H-768_A-12/bert_model.ckpt"
model_utils.init_bert_checkpoint(init_checkpoint)
All I have done here is create a new directory called bert_pretrained_models for my existing BERT-NER model, host both the fine-tuned and pre-trained files there, and point the script at that directory instead.
You will see this addition at the end of the config cell:
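To double-check that the files really are being picked up from that directory, something like the following can help (a sketch using the path that appears in the config cell below; tf.train.checkpoint_exists is the TF 1.x check for a checkpoint prefix):

import os
import tensorflow as tf

# Path taken from the config cell below.
ckpt_dir = "/content/bert_pretrained_models/NER3/"

# List what is actually in the directory and confirm the checkpoint prefix resolves.
print(os.listdir(ckpt_dir))
print(tf.train.checkpoint_exists(os.path.join(ckpt_dir, "bert_model.ckpt")))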
# config
dcoder_config = {
    'dim': 768,
    'num_blocks': 6,
    'multihead_attention': {
        'num_heads': 8,
        'output_dim': 768
        # See documentation for more optional hyperparameters
    },
    'position_embedder_hparams': {
        'dim': 768
    },
    'initializer': {
        'type': 'variance_scaling_initializer',
        'kwargs': {
            'scale': 1.0,
            'mode': 'fan_avg',
            'distribution': 'uniform',
        },
    },
    'poswise_feedforward': tx.modules.default_transformer_poswise_net_hparams(
        output_dim=768)
}

loss_label_confidence = 0.9
random_seed = 1234
beam_width = 5
alpha = 0.6
hidden_dim = 768

opt = {
    'optimizer': {
        'type': 'AdamOptimizer',
        'kwargs': {
            'beta1': 0.9,
            'beta2': 0.997,
            'epsilon': 1e-9
        }
    }
}

lr = {
    'learning_rate_schedule': 'constant.linear_warmup.rsqrt_decay.rsqrt_depth',
    'lr_constant': 2 * (hidden_dim ** -0.5),
    'static_lr': 1e-3,
    'warmup_steps': 2000,
}

bos_token_id = 101
eos_token_id = 102

model_dir = "./models"
run_mode = "train_and_evaluate"

batch_size = 32
test_batch_size = 32

max_train_epoch = 20
display_steps = 100
eval_steps = 100000

max_decoding_length = 400
max_seq_length_src = 512
max_seq_length_tgt = 400

bert_pretrain_dir = '/content/bert_pretrained_models/NER3/'
bert_finetune_dir = 'bert_finetuned_models/NER3'
# config
Update:
By modifying how I download my BERT-NER fine-tuned model and the pretrained model (I used the Large, Cased model), I was able to get past the troublesome cell. However, the encoder cell now throws this error:
Intializing the Bert Encoder Graph
WARNING:tensorflow:From texar_repo/texar/modules/encoders/transformer_encoders.py:340: dropout (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dropout instead.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/layers/core.py:143: calling dropout (from tensorflow.python.ops.nn_ops) with keep_prob is deprecated and will be removed in a future version.
Instructions for updating:
Please use `rate` instead of `keep_prob`. Rate should be set to `rate = 1 - keep_prob`.
WARNING:tensorflow:From <ipython-input-16-08b32230cd7b>:28: dense (from tensorflow.python.layers.core) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.dense instead.
loading the bert pretrained weights
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
<ipython-input-16-08b32230cd7b> in <module>()
33 print("loading the bert pretrained weights")
34 # Loads pretrained BERT model parameters
---> 35 init_checkpoint = os.path.join(bert_pretrained_models+model, 'bert_model.ckpt')
36 #init_checkpoint = "gs://cloud-tpu-checkpoints/bert/uncased_L-12_H-768_A-12/bert_model.ckpt"
37 model_utils.init_bert_checkpoint(init_checkpoint)
NameError: name 'bert_pretrained_models' is not defined
I do not understand why 'bert_pretrained_models' is reported as not defined, when it has been defined in previous cells and files have been successfully saved to and loaded from that directory.
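Looking at the config cell above, the names defined there are bert_pretrain_dir and bert_finetune_dir, not bert_pretrained_models, so if bert_pretrained_models is only a directory name on disk rather than a Python variable in this notebook, the cell would likely need something like the following sketch, which reuses the name from the config (and drops the undefined model variable from the path):

# Reuse the directory variable actually defined in the config cell; the checkpoint
# prefix 'bert_model.ckpt' matches the one used elsewhere in this thread.
init_checkpoint = os.path.join(bert_pretrain_dir, 'bert_model.ckpt')
model_utils.init_bert_checkpoint(init_checkpoint)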
I really like what you've done here.
I have a BERT model fine-tuned for NER and would like to implement it using your architecture here.
My intention is to bypass the fine-tuning section where you use the stories and use my fine-tuned model directly in its place.
Do you have any tips?