brightmart / albert_zh

A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS — large-scale Chinese pre-trained ALBERT models
https://arxiv.org/pdf/1909.11942.pdf

Running pre-training on GPU/TPU with the documented command: training the tiny model #146

Open gaojingqian opened 4 years ago

gaojingqian commented 4 years ago

export BERT_BASE_DIR=./albert_tiny_zh
nohup python3 run_pretraining.py --input_file=./data/tf*.tfrecord \
  --output_dir=./my_new_model_path --do_train=True --do_eval=True \
  --bert_config_file=$BERT_BASE_DIR/albert_config_tiny.json \
  --train_batch_size=4096 --max_seq_length=512 --max_predictions_per_seq=51 \
  --num_train_steps=125000 --num_warmup_steps=12500 --learning_rate=0.00176 \
  --save_checkpoints_steps=2000 --init_checkpoint=$BERT_BASE_DIR/albert_model.ckpt

I am running pre-training initialized from the tiny model checkpoint. The released checkpoint's data file is about 17M, but the checkpoints saved during training have data files of roughly 50M. What accounts for the extra size?
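One way to see where the extra size comes from is to list the variables stored in the two checkpoints. Below is a minimal sketch (not from this repo) using the real TensorFlow API tf.train.list_variables; the helper name checkpoint_size_breakdown, the prefix heuristic, and the checkpoint path are hypothetical, and it assumes TensorFlow 1.x with float32 storage.

    import numpy as np
    import tensorflow as tf

    def checkpoint_size_breakdown(ckpt_path):
        """Group checkpoint variables and report approximate size in MB."""
        sizes = {}
        for name, shape in tf.train.list_variables(ckpt_path):
            # Training checkpoints usually also contain optimizer slot variables
            # (assumed prefixes shown here), saved alongside the model weights.
            group = ("optimizer_state"
                     if any(s in name for s in ("adam_", "lamb_"))
                     else "model_weights")
            sizes[group] = sizes.get(group, 0) + int(np.prod(shape)) * 4  # float32 = 4 bytes
        for group, n_bytes in sizes.items():
            print(f"{group}: {n_bytes / 1e6:.1f} MB")

    # Hypothetical paths; substitute the actual checkpoint prefixes.
    checkpoint_size_breakdown("./albert_tiny_zh/albert_model.ckpt")
    checkpoint_size_breakdown("./my_new_model_path/model.ckpt-2000")

Comparing the two breakdowns shows how much of the training checkpoint is model weights versus additional saved state.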