Closed by LifeIsStrange 2 years ago
Thank you for your suggestion!
Send me an update if you try that approach :) There is also the topic of next-gen optimizers and activation functions (Mish, Lookahead, gradient centralization, etc.; see https://github.com/lessw2020/Ranger21).
For example, I recommended XLNet to a researcher who held the SOTA in dependency parsing, and it worked: he outperformed his previous result (BERT) and is still to this day the #1 SOTA on the Penn Treebank.
@vdobrovolskii Hi, I think you would beat the state of the art once again on coreference resolution by replacing RoBERTa with XLNet. XLNet consistently outperforms RoBERTa, often significantly.
Thanks for your hard work.