chunhuizng opened this issue 5 months ago
Hi,
It is the same as for GPT-J-6B; you can refer to Appendix A.4.
Thank you very much and this is helpful!
By the way, would you consider releasing the Fisher matrix from the LLaMA fine-tuning?
Best, Chunhui
The Fisher matrix of LLaMA is needed to reproduce the paper's EWC fine-tuning of LLaMA. Thanks!
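For anyone landing here: the reason the Fisher matrix is needed is that EWC's regularizer weights the deviation of each parameter from its pre-fine-tuning value by the corresponding Fisher entry. A minimal sketch of that penalty is below; the names `fisher`, `theta_star`, and `lam` are illustrative, not this repo's actual API.

```python
import torch

def ewc_penalty(model, fisher, theta_star, lam):
    """EWC regularizer: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2,
    where theta* are the pre-fine-tuning weights and F is the diagonal
    Fisher matrix (both stored as dicts keyed by parameter name)."""
    penalty = torch.zeros((), device=next(model.parameters()).device)
    for name, p in model.named_parameters():
        if name in fisher:
            penalty = penalty + (fisher[name] * (p - theta_star[name]) ** 2).sum()
    return 0.5 * lam * penalty

# Usage during fine-tuning (task_loss computed as usual):
#   loss = task_loss + ewc_penalty(model, fisher, theta_star, lam)
```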
Sorry, it's been a while and I cannot find the Fisher matrix of LLaMA at this point... But I uploaded the script for computing the Fisher matrix, so you can try computing it yourself.
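The uploaded script is the reference, but for a rough idea of what it computes: a diagonal Fisher estimate is typically the mean squared gradient of the log-likelihood over a sample of the fine-tuning data. A minimal PyTorch sketch follows; the dataloader format and the use of `input_ids` as causal-LM labels are assumptions, not necessarily what the repo's script does.

```python
import torch

def compute_diag_fisher(model, dataloader, device="cuda"):
    """Estimate a diagonal Fisher matrix as the average squared gradient
    of the LM loss over the given batches."""
    model.eval()
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()
              if p.requires_grad}
    n_batches = 0
    for batch in dataloader:
        batch = {k: v.to(device) for k, v in batch.items()}
        model.zero_grad()
        # For HF causal LMs, passing input_ids as labels gives the
        # standard next-token log-likelihood loss (shift handled internally).
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2
        n_batches += 1
    return {n: f / max(n_batches, 1) for n, f in fisher.items()}
```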
May I ask for the hyperparameters used for the LLaMA fine-tuning? The learning rate, batch size, EWC coefficient (λ), and the rank and scaling coefficient of LoRA would be helpful.
Thank you!
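While waiting for the actual values, here is where those LoRA knobs live in Hugging Face's `peft`, for readers unfamiliar with the terms. The numbers and the base checkpoint below are placeholders only, not the paper's settings.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Placeholder checkpoint and values -- NOT the paper's hyperparameters.
base_model = AutoModelForCausalLM.from_pretrained("huggyllama/llama-7b")

lora_config = LoraConfig(
    r=8,                  # the LoRA "rank" asked about above
    lora_alpha=16,        # the scaling coefficient (updates scaled by alpha / r)
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
```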