Open Vincent-Ww opened 2 years ago
Yes, you're right. This script only provides the relevant parameters so that passing them lets the run go through; the data it uses is not necessarily the correct data.
On Sep 22, 2022, at 8:48 PM, Vincent-Ww @.***> wrote:
```shell
CUDA_VISIBLE_DEVICES=2,3 python general_distill.py \
  --teacher_model /nas/pretrain-bert/pretrain-pytorch/chinese_wwm_ext_pytorch/ \
  --student_model student_model/ \
  --train_file_path /nas/lishengping/datas/tiny_task_data/train.txt \
  --do_lower_case \
  --train_batch_size 20 \
  --output_dir ./output_dir \
  --learning_rate 5e-5 \
  --num_train_epochs 3 \
  --eval_step 5000 \
  --max_seq_len 128 \
  --gradient_accumulation_steps 1 3>&2 2>&1 1>&3 | tee logs/tiny_bert.log
```

May I ask why general distillation also uses task_data?
— https://github.com/Lisennlp/TinyBert/issues/11
Regarding the fourth line: why does general distillation also use task_data?
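To illustrate the reply above: in TinyBERT, the general-distillation stage is normally trained on a large general-domain corpus, so `--train_file_path` would point at that corpus rather than at task-specific data; the task data belongs to the later task-distillation stage. A hypothetical sketch (the corpus path below is a placeholder, not a real path from this repo):

```shell
# Hypothetical: swap the task data for a general-domain corpus.
# /path/to/general_corpus/wiki_train.txt is a placeholder.
CUDA_VISIBLE_DEVICES=2,3 python general_distill.py \
  --teacher_model /nas/pretrain-bert/pretrain-pytorch/chinese_wwm_ext_pytorch/ \
  --student_model student_model/ \
  --train_file_path /path/to/general_corpus/wiki_train.txt \
  --do_lower_case \
  --train_batch_size 20 \
  --output_dir ./output_dir \
  --learning_rate 5e-5 \
  --num_train_epochs 3 \
  --eval_step 5000 \
  --max_seq_len 128 \
  --gradient_accumulation_steps 1 2>&1 | tee logs/tiny_bert.log
```

The `2>&1 | tee ...` form simply merges stderr into stdout before logging; the original's `3>&2 2>&1 1>&3` swap works too, but is only needed if you want to pipe stderr while keeping stdout separate.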