Closed ooobsidian closed 3 years ago
Hi there,
Sorry I am not familiar with your downstream task and it is hard for me to answer that. A general suggestion is to tune the hyper-parameter, especially the learning rate, batch size, epochs for your task. AST generally needs a smaller learning rate than other models.
-Yuan
Hello Yuan, I sent an email to your email(yuangong@mit.edu) and hope to discuss with you about AST.