brightmart / albert_zh

A LITE BERT FOR SELF-SUPERVISED LEARNING OF LANGUAGE REPRESENTATIONS — large-scale Chinese pre-trained ALBERT models
https://arxiv.org/pdf/1909.11942.pdf

Running pre-training on GPU/TPU with the documented command: training the tiny model #146

Open gaojingqian opened 4 years ago

gaojingqian commented 4 years ago

export BERT_BASE_DIR=./albert_tiny_zh
nohup python3 run_pretraining.py --input_file=./data/tf*.tfrecord \
  --output_dir=./my_new_model_path --do_train=True --do_eval=True \
  --bert_config_file=$BERT_BASE_DIR/albert_config_tiny.json \
  --train_batch_size=4096 --max_seq_length=512 --max_predictions_per_seq=51 \
  --num_train_steps=125000 --num_warmup_steps=12500 --learning_rate=0.00176 \
  --save_checkpoints_steps=2000 --init_checkpoint=$BERT_BASE_DIR/albert_model.ckpt

I am running pre-training initialized from the tiny model checkpoint. The released checkpoint's data file is about 17M, but the checkpoints saved during training have data files of roughly 50M. What accounts for the extra size?
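One way to see where the extra size comes from is to list the variables stored in the two checkpoints. Below is a minimal sketch (not from this repo) using the real TensorFlow API tf.train.list_variables; the helper name checkpoint_size_breakdown, the prefix heuristic, and the checkpoint path are hypothetical, and it assumes TensorFlow 1.x with float32 storage.

    import numpy as np
    import tensorflow as tf

    def checkpoint_size_breakdown(ckpt_path):
        """Group checkpoint variables and report approximate size in MB."""
        sizes = {}
        for name, shape in tf.train.list_variables(ckpt_path):
            # Training checkpoints usually also contain optimizer slot variables
            # (assumed prefixes shown here), saved alongside the model weights.
            group = ("optimizer_state"
                     if any(s in name for s in ("adam_", "lamb_"))
                     else "model_weights")
            sizes[group] = sizes.get(group, 0) + int(np.prod(shape)) * 4  # float32 = 4 bytes
        for group, n_bytes in sizes.items():
            print(f"{group}: {n_bytes / 1e6:.1f} MB")

    # Hypothetical paths; substitute the actual checkpoint prefixes.
    checkpoint_size_breakdown("./albert_tiny_zh/albert_model.ckpt")
    checkpoint_size_breakdown("./my_new_model_path/model.ckpt-2000")

Comparing the two breakdowns shows how much of the training checkpoint is model weights versus additional saved state.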