Closed: DreamMr closed this issue 5 months ago
Originally, I did not intend for the trol_gating weights to be shared, but, as you noticed, the trained weights were saved in shared form. The results therefore reflect the weights as they are; this may be revisited in a future, improved version. Thank you for examining it!
Nice work
https://github.com/ByungKwanLee/TroL/blob/7a71dfb9acf1f5339ce29c30f9e2b07faa7757ca/trol/arch_internlm2/modeling_internlm2.py#L875
Hi, I found that the values of the trol_gating weights in 'trol_gating.pt' are the same. Is that expected?
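For anyone who wants to reproduce this check, here is a minimal sketch. It assumes the file loads to a flat dict mapping names to tensors; I have not confirmed that this is the actual layout of 'trol_gating.pt'. The helper `identical_entries` and the toy names `layer0`, `layer1`, `layer2` are hypothetical and only for illustration.

```python
from itertools import combinations


def identical_entries(values):
    """Return every pair of entry names whose values are exactly equal.

    `values` maps an entry name to a plain (possibly nested) list of numbers.
    """
    names = sorted(values)
    return [(a, b) for a, b in combinations(names, 2) if values[a] == values[b]]


# Hypothetical usage with PyTorch (not executed here). Assumes the file holds
# a flat dict of name -> tensor, which is an assumption about the checkpoint:
#   import torch
#   state = torch.load("trol_gating.pt", map_location="cpu")
#   print(identical_entries({k: v.tolist() for k, v in state.items()}))

# Toy data standing in for real weights.
toy = {"layer0": [0.1, 0.2], "layer1": [0.1, 0.2], "layer2": [0.3, 0.4]}
print(identical_entries(toy))  # [('layer0', 'layer1')]
```

If every pair of entries shows up in the result, the saved weights are identical across entries, which would match what I am seeing.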