joongbo / tta

Repository for the paper "Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning"
Apache License 2.0

Scale up performance comparison and GLUE task #2

Closed: La-SilverLand closed this issue 3 years ago

La-SilverLand commented 3 years ago

Hi, I've got two questions:

1. Have you scaled up the training data and compared with BERT accordingly?
2. BERT and other Transformer-based models usually run the GLUE task suite, but your paper does not include this part. What is your consideration?

joongbo commented 3 years ago

Hi, here are my answers:

  1. Have you scaled up the training data and compared with BERT accordingly? ANS> No, I have not. For now I am working on scaling up the model size first; using more training data is our next concern.

  2. BERT and other Transformer-based models usually run the GLUE task suite, but your paper does not include this part. What is your consideration? ANS> This is because the main focus of our paper is on unsupervised learning tasks such as N-best list re-ranking (whereas, as you know, the GLUE tasks are supervised).
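To give a rough idea of what I mean by unsupervised re-ranking, here is a minimal sketch that scores each N-best candidate with a masked LM's pseudo-log-likelihood and sorts by that score. It uses a generic BERT from Hugging Face as a stand-in model; it is not our TTA code, which computes the score for all positions in a single pass instead of masking tokens one by one.

```python
# Illustrative sketch of N-best list re-ranking with a masked LM.
# The model/tokenizer below are placeholders, not the TTA model.
import torch
from transformers import BertForMaskedLM, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    """Sum of log P(token | rest) with each token masked in turn."""
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for i in range(1, len(ids) - 1):  # skip [CLS] and [SEP]
        masked = ids.clone()
        masked[i] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, i]
        total += torch.log_softmax(logits, dim=-1)[ids[i]].item()
    return total

def rerank(nbest):
    """Re-order candidate hypotheses, best (highest score) first."""
    return sorted(nbest, key=pseudo_log_likelihood, reverse=True)

print(rerank(["the cat sat on the mat", "the cat sat in the mat"]))
```

The key point is that no labels are needed: the language model's own scores decide the ranking, which is why we treat it as an unsupervised evaluation.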

For your information, I plan to run the GLUE tasks after enhancing the model. For now, the 3-layer TTA is not well suited to fine-tuning on downstream tasks.

Thanks for your interest, and I'd be happy to answer any follow-up questions.

La-SilverLand commented 3 years ago

Thanks, I'll keep an eye on your next steps :)