Open renhouxing opened 7 months ago
There is no lm_head.weight in the parameters of Starcoder2-3B and Starcoder2-7B. Is it because of tied embedding?
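Yes! The output head shares its weight with the input token embedding, so no separate lm_head.weight is stored.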
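For anyone landing here later, a minimal PyTorch sketch of what tying looks like. This is a toy model, not the actual Starcoder2 code: `TinyTiedLM`, `embed_tokens`, and the sizes are hypothetical names picked for illustration. It only shows why a tied head does not show up as its own entry when you list parameters.

```python
import torch
from torch import nn


class TinyTiedLM(nn.Module):
    """Toy model (hypothetical, not Starcoder2) with a tied output head."""

    def __init__(self, vocab_size: int = 16, hidden_size: int = 8):
        super().__init__()
        # Embedding weight has shape (vocab_size, hidden_size).
        self.embed_tokens = nn.Embedding(vocab_size, hidden_size)
        # Linear weight has shape (out_features, in_features), which is
        # also (vocab_size, hidden_size), so the two can share one tensor.
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)
        # Tie: both modules now point at the same Parameter object.
        self.lm_head.weight = self.embed_tokens.weight

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.lm_head(self.embed_tokens(input_ids))


model = TinyTiedLM()

# named_parameters() skips duplicates by default, so the shared tensor is
# listed once, under the first name it was registered with.
param_names = [name for name, _ in model.named_parameters()]

# Both attributes refer to the same underlying storage.
shares_storage = (
    model.lm_head.weight.data_ptr() == model.embed_tokens.weight.data_ptr()
)

logits = model(torch.tensor([[1, 2, 3]]))

print(param_names)
print(shares_storage)
print(tuple(logits.shape))
```

In the toy model only `embed_tokens.weight` is listed, even though `model.lm_head.weight` is still usable. In Hugging Face configs the corresponding switch is usually the `tie_word_embeddings` field, so checking that value in the checkpoint's `config.json` is a quick way to confirm.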
OK, thanks for your response!