YC-wind closed this issue 3 years ago
Thanks for your interest. Yes, but we will test TTA on the GLUE benchmark only after enhancing the model. We observed that the 12-layer TTA performs better than the 3-layer TTA on reranking tasks (as well as on unsupervised semantic textual similarity tasks).
Thanks for your excellent work. Do you plan to test TTA's performance on the GLUE benchmark?