Closed yehuitang closed 8 months ago
Thanks for your interesting work! What value of weight decay was used when training the model? I noticed that it is 0.1 in the GitHub config (https://github.com/EleutherAI/pythia/blob/afec006b936a38131fc6c6d2b0a48425dd97bc6e/models/1B/pythia-1b.yml) but 0.01 in the paper (Table 6 in Appendix E): https://arxiv.org/pdf/2304.01373.pdf
Thanks!
Hi, this is an error in the paper, apologies. We used 0.1 for weight decay, matching the config files in this repo.
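For anyone who wants to check a config against the paper themselves, here is a minimal sketch. It is not part of the repo: the inline excerpt is hypothetical, and the `"weight-decay"` key name and the `read_weight_decay` helper are assumptions made for illustration. Point it at the actual text of the linked file to verify.

```python
import re

# Hypothetical excerpt written in the style of the repo's config files.
# The key name "weight-decay" and the surrounding lines are assumptions,
# not copied from pythia-1b.yml.
CONFIG_TEXT = '''
{
  "optimizer": {"type": "Adam"},
  "weight-decay": 0.1,
}
'''


def read_weight_decay(text):
    """Return the first weight-decay value found in the config text, or None."""
    match = re.search(r'"weight[-_]decay"\s*:\s*([0-9.eE+-]+)', text)
    return float(match.group(1)) if match else None


# Value listed in Table 6 (Appendix E) of the paper, as quoted in the question.
PAPER_VALUE = 0.01

config_value = read_weight_decay(CONFIG_TEXT)
print("config:", config_value, "| matches paper:", config_value == PAPER_VALUE)
```

A regular expression is used instead of a YAML parser so the sketch depends only on the standard library and tolerates the trailing commas in this style of config.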