Hello! Thanks for sharing your code and brilliant work.
I'd like to ask about evaluation on test-std and test-dev. Since local evaluation is not available for these splits, is there any way to know how many training epochs are needed? I see that in your case you used the same number of training epochs (13). However, I would expect that, because the training data is much larger (evaluating on the test set requires training on the 'train+val+vg' sets), the number of epochs required for convergence would also increase. Or do you evaluate checkpoints from several different epochs on the online server to see which performs best?
Thanks in advance for any suggestions.
Regards