google-research / bert

TensorFlow code and pre-trained models for BERT
https://arxiv.org/abs/1810.04805
Apache License 2.0

How to choose num_train_steps in run_pretraining.py? #1081

Open · max-yue opened 4 years ago

max-yue commented 4 years ago
python create_pretraining_data.py \
  --input_file=./chinese_sample_text.txt \
  --output_file=/tmp/tf_examples.tfrecord \
  --vocab_file=bert_checkpoint/vocab.txt \
  --do_lower_case=True \
  --max_seq_length=256 \
  --max_predictions_per_seq=38 \
  --masked_lm_prob=0.15 \
  --random_seed=12345 \
  --dupe_factor=5 \
  --do_whole_word_mask
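
create_pretraining_data.py logs how many instances it writes (note that with dupe_factor=5 each input example is duplicated five times with different masks, so the logged count already reflects that). If the log line is gone, you can recount directly from the TFRecord; a minimal sketch, assuming TensorFlow 2.x eager execution (under TF 1.x, which the repo targets, tf.python_io.tf_record_iterator would do the same job):

import tensorflow as tf

# Count the pre-training instances in the TFRecord produced above.
# This count is the N referred to in the question below.
N = sum(1 for _ in tf.data.TFRecordDataset("/tmp/tf_examples.tfrecord"))
print("total instances N =", N)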

python run_pretraining.py \
  --input_file=/tmp/tf_examples.tfrecord \
  --output_dir=/tmp/pretraining_output \
  --do_train=True \
  --do_eval=True \
  --bert_config_file=bert_checkpoint/bert_config.json \
  --init_checkpoint=bert_checkpoint/bert_model.ckpt \
  --train_batch_size=32 \
  --max_seq_length=256 \
  --max_predictions_per_seq=38 \
  --num_train_steps=20 \
  --num_warmup_steps=10 \
  --learning_rate=2e-5

Suppose the create_pretraining_data.py script wrote N total instances, e.g. N = 100000. How should I choose num_train_steps in run_pretraining.py given those N instances?
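
For reference, run_pretraining.py has no epochs flag: its training input pipeline repeats the data indefinitely, so --num_train_steps alone bounds training. A common back-of-the-envelope conversion from instances to steps (a sketch only; the epoch target of 3 is an arbitrary assumption, not a recommendation):

# Heuristic: num_train_steps = epochs * N / train_batch_size.
# "epochs" is a hypothetical target, not a run_pretraining.py flag.
N = 100_000            # instances written by create_pretraining_data.py
train_batch_size = 32  # matches the command above
epochs = 3             # assumed target; tune to your corpus

steps_per_epoch = N // train_batch_size      # 3125
num_train_steps = epochs * steps_per_epoch   # 9375
num_warmup_steps = num_train_steps // 10     # common ~10% warmup heuristic
print(num_train_steps, num_warmup_steps)     # 9375 937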

parmarsuraj99 commented 4 years ago

Do you mean epochs?

Maybe --iterations_per_loop could help.
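
(Worth noting: --iterations_per_loop in run_pretraining.py appears to be a TPU performance knob, controlling how many steps run per TPUEstimator host call. It does not change the total training length, which is still bounded by --num_train_steps, so the epochs-to-steps arithmetic sketched above is what actually decides the value.)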