Closed: cylnlp closed this issue 2 years ago
The commands are as follows.
Master:
python3 cli.py \
--method pet \
--pattern_ids 0 1 2 3 \
--data_dir ../mnli \
--model_type roberta \
--model_name_or_path roberta-large \
--task_name mnli \
--output_dir mnli-roberta-large \
--no_distillation \
--do_eval \
--pet_repetitions 1
and v1.1.0:
python3 run_training.py \
--wrapper_type mlm \
--train_examples 100 \
--data_dir ../mnli \
--model_type roberta \
--model_name_or_path roberta-large \
--task_name mnli \
--output_dir mnli-roberta-large \
--do_train \
--do_eval \
--max_steps 0 \
--repetitions 1 \
--pattern_ids 0 1 2 3
Hi @cylnlp, your results for the v1.1.0 code look really odd. Note that --max_steps only overrides --num_train_epochs if it is set to a value greater than 0, so you need to set --num_train_epochs 0 in the v1.1.0 example. However, training on 100 examples should actually improve performance, not make it worse. Could you still verify what happens if you set --num_train_epochs 0 in the v1.1.0 example?
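The override behavior described above can be sketched as follows. This is a hypothetical simplification for illustration, not the repository's actual code; the function name and return format are invented:

```python
def effective_training_length(max_steps: int, num_train_epochs: int):
    """Pick the training duration.

    Sketch of the described behavior: max_steps overrides
    num_train_epochs only when it is greater than 0.
    (Hypothetical helper, not part of the pet codebase.)
    """
    if max_steps > 0:
        return ("steps", max_steps)
    return ("epochs", num_train_epochs)

# Passing --max_steps 0 alone does not disable training; the epoch
# setting still applies, so the model is fine-tuned on the 100 examples.
print(effective_training_length(0, 3))   # ('epochs', 3)

# A true zero-shot run therefore also needs --num_train_epochs 0.
print(effective_training_length(0, 0))   # ('epochs', 0)
```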
Okay. Thanks @timoschick .
Hi @timoschick, I was experimenting with RoBERTa under zero-shot settings, using the commands provided on this page, and I found that the performance varies greatly. Using the master code to run MNLI, I obtained this performance:
But when using the v1.1.0 code, the performance is only:
Do you have any idea why these results differ?
Thanks, Yulong