Closed by hemingkx 2 years ago
Hi,
I haven't experienced this; in my runs the total training time grew roughly linearly, i.e., the time per epoch stayed about the same. Could it be caused by insufficient RAM or something similar? Do you observe this slowdown on my code base, on the official GLAT, or on both?
Thanks for your prompt reply!
I found that this happened because I had installed fairseq without the `--editable` option. By the way, do you know whether https://github.com/FLC777/GLAT is the official GLAT implementation?
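For reference, a minimal sketch of the editable install from a local clone (the clone location is just a placeholder; adjust paths to your setup):

```bash
# Assumed workflow: install fairseq from a local clone in editable mode,
# so local code changes take effect without reinstalling the package.
git clone https://github.com/pytorch/fairseq
cd fairseq
pip install --editable ./
```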
Yes, it is.
Hi all,
Thanks very much for your awesome code!
I noticed some differences between your GLAT implementation and the repo here. I tried both and found that the training time increased rapidly as training progressed (epoch 1 took about 10 minutes, while epoch 50 took about 120 minutes). I wonder whether you have encountered this in your experiments and what might cause it.
Thanks very much!
hemingkx