awslabs / pptod

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System (ACL 2022)
https://arxiv.org/abs/2109.14739
Apache License 2.0

About results of the E2E-TOD #1

Closed: xiami2019 closed this issue 3 years ago

xiami2019 commented 3 years ago

Hi, what a nice piece of work! I have a small question about the results of the E2E-TOD fine-tuning. I noticed that the released best result is obtained at epoch 6. However, I trained for 15 epochs and only got a 92.06 combined score (at epoch 10). My batch size is 128 (number_of_gpu 4, batch_size_per_gpu 2, gradient_accumulation_steps 16), which is the same as in the released code. I wonder whether there are any other settings for E2E-TOD fine-tuning that are needed to reach the best result within a few epochs.
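
As a quick sanity check, the three flags quoted above do multiply out to the stated batch size. A minimal sketch follows; the variable names simply mirror the flag names mentioned in this thread and are not repo code:

```python
# Effective batch size implied by the flags quoted above.
number_of_gpu = 4
batch_size_per_gpu = 2
gradient_accumulation_steps = 16

effective_batch_size = number_of_gpu * batch_size_per_gpu * gradient_accumulation_steps
assert effective_batch_size == 128  # the batch size of 128 mentioned above
```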

Looking forward to your reply.

yxuansu commented 3 years ago

Hello, thank you for your question!

May I ask whether you are using PPTOD-small or PPTOD-base?

Here are a few suggestions:

  1. First, use PPTOD-small to validate the results. (PPTOD-small is smaller and easier to train.)
  2. You can also change line 203 of learn.py from 'dev' to 'test' to directly observe the results on the test set. Alternatively, you could run inference.sh to see the model's performance on the test set. (But for validating the results, changing 'dev' to 'test' should be the easiest.)
  3. We observe some randomness in the best epoch number. I suggest you train PPTOD-small for at least 30 epochs and check the test-set results; they should be fairly easy to replicate. (See the sketch after this list.)
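
Here is a minimal, self-contained sketch of what suggestions 2 and 3 amount to, assuming the standard MultiWOZ combined score of (Inform + Success) / 2 + BLEU. The evaluate_epoch callable and its split argument are placeholders for illustration, not the actual learn.py API:

```python
from typing import Callable, Tuple

def combined_score(inform: float, success: float, bleu: float) -> float:
    # Assumed standard MultiWOZ combined score: (Inform + Success) / 2 + BLEU.
    return (inform + success) / 2.0 + bleu

def track_best_epoch(evaluate_epoch: Callable[[int, str], Tuple[float, float, float]],
                     num_epochs: int = 30,
                     split: str = 'test') -> Tuple[int, float]:
    # Evaluate every epoch on `split` and return (best_epoch, best_combined_score).
    # split='test' mirrors suggestion 2 (changing 'dev' to 'test'); num_epochs=30
    # mirrors suggestion 3. `evaluate_epoch` is a placeholder that returns
    # (inform, success, bleu) for a given epoch and split.
    best_epoch, best_score = -1, float('-inf')
    for epoch in range(num_epochs):
        inform, success, bleu = evaluate_epoch(epoch, split)
        score = combined_score(inform, success, bleu)
        if score > best_score:
            best_epoch, best_score = epoch, score
    return best_epoch, best_score
```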

Please let me know whether you can replicate the results. Looking forward to your updates!

xiami2019 commented 3 years ago

Thanks for your response. I initialized the model with PPTOD-base, and I think I only recorded results from the dev set. I will train for more epochs and update my results.