Closed: xszheng2020 closed this issue 2 years ago
Hi @xszheng2020 , Thank you for your interest in our work and for your kind words!
Yes, in GPT2-large, a large temperature of 13 was needed.
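(For context: in kNN-LM-style models, the temperature scales the distances to the retrieved neighbors before the softmax, so a higher temperature flattens the retrieved-token distribution before it is interpolated with the base LM. A minimal sketch of that interpolation, where the function name, defaults, and interpolation weight `lam` are illustrative rather than the repo's actual API:)

```python
import numpy as np

def knn_lm_probs(lm_probs, knn_dists, knn_token_ids, vocab_size,
                 temp=13.0, lam=0.25):
    """Interpolate base-LM probabilities with a kNN distribution.

    lm_probs:      (vocab_size,) next-token probabilities from the LM
    knn_dists:     (k,) distances to the k retrieved datastore entries
    knn_token_ids: (k,) target-token id stored with each entry
    temp:          softmax temperature over negative distances
                   (higher = flatter kNN distribution)
    lam:           interpolation weight on the kNN distribution
                   (a hypothetical default, not the paper's tuned value)
    """
    # Softmax over negative distances, scaled by the temperature.
    scores = np.exp(-np.asarray(knn_dists, dtype=float) / temp)
    scores /= scores.sum()

    # Scatter neighbor mass onto the vocabulary; np.add.at accumulates
    # correctly when several neighbors share the same target token.
    knn_probs = np.zeros(vocab_size)
    np.add.at(knn_probs, knn_token_ids, scores)

    return lam * knn_probs + (1.0 - lam) * lm_probs
```

With a large `temp`, the `scores` become nearly uniform over the k neighbors, which is why tuning the temperature per model size can change perplexity noticeably.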
In general, we noticed that kNN-based models tend to provide more benefit the smaller the base model is.
Finding a way to make kNN-LM and RetoMaton more useful in large models, or explaining why they behave this way, is a really interesting open problem.
Best, Uri
Hi @urialon, Thanks a lot! I adjusted the temperature and got results similar to what you point out (after I posted this issue), so I closed the issue directly. Thanks again!
Hi @urialon, Thanks for your great work! I just tested
neulab/gpt2-large-finetuned-wikitext103
without and with `--knn`,
but could not observe an improvement: ppl 10.5565 vs. ppl 10.6538. Any idea why? Should I tune the hyperparameters, such as the temperature? Thanks.