Closed mdmustafizurrahman closed 4 years ago
@muelletm Can anybody please help with this?
Here is the list of Python packages I am using with Python 3.7.7:
Package                 Version              Location
absl-py                 0.9.0
astor                   0.8.0
beautifulsoup4          4.9.1
bert-tensorflow         1.0.1
bs4                     0.0.1
certifi                 2020.6.20
cloudpickle             1.5.0
decorator               4.4.2
frozendict              1.2
gast                    0.3.3
google-pasta            0.2.0
grpcio                  1.27.2
h5py                    2.10.0
joblib                  0.16.0
Keras-Applications      1.0.8
Keras-Preprocessing     1.1.0
Markdown                3.1.1
mkl-fft                 1.1.0
mkl-random              1.1.1
mkl-service             2.3.0
numpy                   1.18.5
pandas                  1.0.5
pip                     20.1.1
protobuf                3.12.3
python-dateutil         2.8.1
pytz                    2020.1
scikit-learn            0.22.2.post1
scipy                   1.5.0
setuptools              49.2.0.post20200714
six                     1.15.0
soupsieve               2.0.1
tapas                   0.0.1.dev0           /data/t-mdra/projects/tapas
tensorboard             1.14.0
tensorflow              1.14.0
tensorflow-estimator    1.14.0
tensorflow-gpu          1.14.0
tensorflow-probability  0.7.0
termcolor               1.1.0
tf-slim                 1.1.0
tornado                 6.0.4
Werkzeug                1.0.1
wheel                   0.34.2
wrapt                   1.12.1
It looks like you need to update the BERT config. I think it should look like this:
{
"attention_probs_dropout_prob": 0.1,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 768,
"initializer_range": 0.02,
"intermediate_size": 3072,
"max_position_embeddings": 1024,
"num_attention_heads": 12,
"num_hidden_layers": 12,
"type_vocab_size": [3, 256, 256, 2, 256, 256, 10],
"vocab_size": 30522
}
What differs from the original BERT config is this line:
"type_vocab_size": [3, 256, 256, 2, 256, 256, 10],
It specifies one vocabulary size per token-type embedding, including the additional embeddings TAPAS uses.
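In case it helps, here is a minimal sketch of making that change programmatically instead of editing the JSON by hand. The helper name `to_tapas_config` is made up for illustration and is not part of TAPAS; the list values are copied from the config above, and the input dict is an abbreviated stand-in for a stock `bert_config.json`.

```python
import json

# Sizes copied from the config quoted above: one entry per token-type feature.
TAPAS_TYPE_VOCAB_SIZE = [3, 256, 256, 2, 256, 256, 10]


def to_tapas_config(bert_config: dict) -> dict:
    """Return a copy of a BERT config dict with the list-valued type_vocab_size.

    Hypothetical helper for illustration; not part of the TAPAS code base.
    """
    tapas_config = dict(bert_config)
    tapas_config["type_vocab_size"] = list(TAPAS_TYPE_VOCAB_SIZE)
    return tapas_config


if __name__ == "__main__":
    # Abbreviated stand-in for the contents of a stock bert_config.json.
    bert_config = {"hidden_size": 768, "type_vocab_size": 2, "vocab_size": 30522}
    print(json.dumps(to_tapas_config(bert_config), indent=2))
```

To apply it to a real file, load the file with `json.load`, pass the dict through the helper, and write the result back with `json.dump`.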
@thomasmueller-google Thanks for providing the config. It removes the error, but now I get a different error:
ERROR:tensorflow:Error recorded from training_loop: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /data/t-mdra/projects/tapas/bert_checkpoints/L-24_H-1024/model.ckpt
E0722 14:46:18.548613 140499469068096 error_handling.py:70] Error recorded from training_loop: Unsuccessful TensorSliceReader constructor: Failed to find any matching files for /data/t-mdra/projects/tapas/bert_checkpoints/L-24_H-1024/model.ckpt
INFO:tensorflow:training_loop marked as finished
I0722 14:46:18.548828 140499469068096 error_handling.py:96] training_loop marked as finished
WARNING:tensorflow:Reraising captured error
W0722 14:46:18.548956 140499469068096 error_handling.py:130] Reraising captured error
Traceback (most recent call last):
File "tapas/experiments/tapas_pretraining_experiment.py", line 151, in
@muelletm Sorry, I resolved the error.
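The thread does not say how this was resolved. For anyone hitting the same "Failed to find any matching files" message, a quick check is to confirm that files actually exist for the checkpoint prefix. A TF1 checkpoint is a prefix rather than a single file (for example `model.ckpt.index` plus `model.ckpt.data-*`), and as far as I know the released BERT archives name theirs `bert_model.ckpt`, so a wrong prefix name is worth ruling out. The sketch below is plain Python and the function names are made up for illustration.

```python
import glob
import os


def checkpoint_files(prefix: str) -> list:
    """List files belonging to a checkpoint prefix, e.g. model.ckpt.index."""
    return sorted(glob.glob(glob.escape(prefix) + ".*"))


def describe_checkpoint(prefix: str) -> str:
    """Summarize what exists on disk for the given checkpoint prefix."""
    files = checkpoint_files(prefix)
    if files:
        return "found: " + ", ".join(os.path.basename(f) for f in files)
    # No match: list the .index files next to it to help spot a wrong prefix.
    directory = os.path.dirname(prefix) or "."
    candidates = sorted(glob.glob(os.path.join(glob.escape(directory), "*.index")))
    names = ", ".join(os.path.basename(c) for c in candidates) or "none"
    return "no files for prefix; .index files in directory: " + names


if __name__ == "__main__":
    # Path copied from the error message above.
    print(describe_checkpoint(
        "/data/t-mdra/projects/tapas/bert_checkpoints/L-24_H-1024/model.ckpt"))
```

If the directory lists a different `.index` file, passing that name without the `.index` suffix as `--init_checkpoint` is the likely fix.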
I executed the following command using a BERT Base checkpoint from https://github.com/google-research/bert:
python3 tapas/experiments/tapas_pretraining_experiment.py \
  --eval_batch_size=32 \
  --train_batch_size=512 \
  --tpu_iterations_per_loop=5000 \
  --num_eval_steps=100 \
  --save_checkpoints_steps=600 \
  --num_train_examples=512000000 \
  --max_seq_length=128 \
  --input_file_train="${data}/train.tfrecord" \
  --input_file_eval="${data}/test.tfrecord" \
  --init_checkpoint="${tapas_data_dir}/model.ckpt" \
  --bert_config_file="${tapas_data_dir}/bert_config.json" \
  --model_dir="..." \
  --do_train
But I am getting the following error:
INFO:tensorflow: Features
I0720 17:14:49.422954 140139063854912 tapas_pretraining_model.py:127] Features
INFO:tensorflow: name = column_ids, shape = (512, 128)
I0720 17:14:49.423085 140139063854912 tapas_pretraining_model.py:129] name = column_ids, shape = (512, 128)
INFO:tensorflow: name = column_ranks, shape = (512, 128)
I0720 17:14:49.423201 140139063854912 tapas_pretraining_model.py:129] name = column_ranks, shape = (512, 128)
INFO:tensorflow: name = input_ids, shape = (512, 128)
I0720 17:14:49.423346 140139063854912 tapas_pretraining_model.py:129] name = input_ids, shape = (512, 128)
INFO:tensorflow: name = input_mask, shape = (512, 128)
I0720 17:14:49.423449 140139063854912 tapas_pretraining_model.py:129] name = input_mask, shape = (512, 128)
INFO:tensorflow: name = inv_column_ranks, shape = (512, 128)
I0720 17:14:49.423546 140139063854912 tapas_pretraining_model.py:129] name = inv_column_ranks, shape = (512, 128)
INFO:tensorflow: name = masked_lm_ids, shape = (512, 20)
I0720 17:14:49.423643 140139063854912 tapas_pretraining_model.py:129] name = masked_lm_ids, shape = (512, 20)
INFO:tensorflow: name = masked_lm_positions, shape = (512, 20)
I0720 17:14:49.423738 140139063854912 tapas_pretraining_model.py:129] name = masked_lm_positions, shape = (512, 20)
INFO:tensorflow: name = masked_lm_weights, shape = (512, 20)
I0720 17:14:49.423831 140139063854912 tapas_pretraining_model.py:129] name = masked_lm_weights, shape = (512, 20)
INFO:tensorflow: name = next_sentence_labels, shape = (512, 1)
I0720 17:14:49.423925 140139063854912 tapas_pretraining_model.py:129] name = next_sentence_labels, shape = (512, 1)
INFO:tensorflow: name = numeric_relations, shape = (512, 128)
I0720 17:14:49.424018 140139063854912 tapas_pretraining_model.py:129] name = numeric_relations, shape = (512, 128)
INFO:tensorflow: name = prev_label_ids, shape = (512, 128)
I0720 17:14:49.424111 140139063854912 tapas_pretraining_model.py:129] name = prev_label_ids, shape = (512, 128)
INFO:tensorflow: name = row_ids, shape = (512, 128)
I0720 17:14:49.424204 140139063854912 tapas_pretraining_model.py:129] name = row_ids, shape = (512, 128)
INFO:tensorflow: name = segment_ids, shape = (512, 128)
I0720 17:14:49.424312 140139063854912 tapas_pretraining_model.py:129] name = segment_ids, shape = (512, 128)
INFO:tensorflow:training_loop marked as finished
I0720 17:14:49.433129 140139063854912 error_handling.py:115] training_loop marked as finished
WARNING:tensorflow:Reraising captured error
W0720 17:14:49.433277 140139063854912 error_handling.py:149] Reraising captured error
Traceback (most recent call last):
File "/data/t-mdra/anaconda3/envs/TAPAS_pretrain/lib/python3.7/site-packages/tensorflow/python/util/nest.py", line 378, in assert_same_structure
expand_composites)
ValueError: The two structures don't have the same nested structure.
First structure: type=list str=[<tf.Tensor 'IteratorGetNext:12' shape=(512, 128) dtype=int32>, <tf.Tensor 'IteratorGetNext:0' shape=(512, 128) dtype=int32>, <tf.Tensor 'IteratorGetNext:11' shape=(512, 128) dtype=int32>, <tf.Tensor 'IteratorGetNext:10' shape=(512, 128) dtype=int32>, <tf.Tensor 'IteratorGetNext:1' shape=(512, 128) dtype=int32>, <tf.Tensor 'IteratorGetNext:4' shape=(512, 128) dtype=int32>, <tf.Tensor 'IteratorGetNext:9' shape=(512, 128) dtype=int32>]
Second structure: type=int str=2
More specifically: Substructure "type=list str=[<tf.Tensor 'IteratorGetNext:12' shape=(512, 128) dtype=int32>, <tf.Tensor 'IteratorGetNext:0' shape=(512, 128) dtype=int32>, <tf.Tensor 'IteratorGetNext:11' shape=(512, 128) dtype=int32>, <tf.Tensor 'IteratorGetNext:10' shape=(512, 128) dtype=int32>, <tf.Tensor 'IteratorGetNext:1' shape=(512, 128) dtype=int32>, <tf.Tensor 'IteratorGetNext:4' shape=(512, 128) dtype=int32>, <tf.Tensor 'IteratorGetNext:9' shape=(512, 128) dtype=int32>]" is a sequence, while substructure "type=int str=2" is not
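To illustrate what this message is complaining about: the model hands over a list of seven token-type tensors, while a stock BERT config supplies a single integer (`2`) for `type_vocab_size`. The sketch below is a simplified pure-Python stand-in for TensorFlow's nested-structure check, not the real implementation, and the feature names are placeholders.

```python
def same_structure(a, b) -> bool:
    """Simplified stand-in for a nested-structure check (lists and tuples only)."""
    a_is_seq = isinstance(a, (list, tuple))
    b_is_seq = isinstance(b, (list, tuple))
    if a_is_seq != b_is_seq:
        return False  # one side is a sequence, the other is a single value
    if not a_is_seq:
        return True  # two leaves always match
    return len(a) == len(b) and all(same_structure(x, y) for x, y in zip(a, b))


# Placeholders for the seven token-type tensors listed in the error message.
token_type_features = ["f0", "f1", "f2", "f3", "f4", "f5", "f6"]

# Stock BERT config value: a single int, so the structures differ.
print(same_structure(token_type_features, 2))  # False
# List-valued type_vocab_size with seven entries: the structures match.
print(same_structure(token_type_features, [3, 256, 256, 2, 256, 256, 10]))  # True
```

This is why a `type_vocab_size` list with seven entries makes the error go away.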