Closed by chojw 3 years ago
Thanks for your attention.
Replacing q_out with q_pred on line 85 of base_model.py improves the performance of GGE-tog from ~54% to 57%. We have updated the code. Thank you for pointing out the issue.
Moreover, according to our experiments, GGE-iter is actually more stable than GGE-tog (see the results of Self-ensemble fashion). We have not found a reasonable explanation for this, which is why we provide both GGE-iter and GGE-tog in the paper.
If you find our method useful, you are welcome to cite our paper or extend GGE to other tasks.
Okay thanks! I'll be sure to try it!
Hi, thanks for sharing your code!
I just had a question about the performance of gge_tog: it never goes above 54%, while gge_iter achieves the reported performance. Is there some issue with the way I ran the code?
I used this command: python main.py --dataset cpv2 --mode gge_tog --debias gradient --topq 1 --topv -1 --qvp 5 --output gge_tog
Thanks in advance!