ezelikman / quiet-star

Code for Quiet-STaR
https://arxiv.org/abs/2403.09629
Apache License 2.0

warning #9

Open clarencerat opened 6 months ago

clarencerat commented 6 months ago

Loading the ezelikman/quietstar-8-ahead checkpoint prints this warning:

Some weights of the model checkpoint at ezelikman/quietstar-8-ahead were not used when initializing MistralForCausalLM: ['end_embedding', 'start_embedding', 'talk_head.0.0.bias', 'talk_head.0.0.weight', 'talk_head.0.2.bias', 'talk_head.0.2.weight', 'talk_head.0.4.weight']

noob000007 commented 2 months ago

Same question here.
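
For readers hitting the same message: a warning of this form generally means the checkpoint file holds parameters that the instantiated model class does not define, so the loader skips them. The sketch below reproduces that bookkeeping with plain Python sets. The seven skipped names are copied from the warning above; the "stock" key names are an abbreviated, illustrative stand-in for a standard MistralForCausalLM, and the helper `unexpected_keys` is hypothetical, not a transformers function. Reading `start_embedding`, `end_embedding`, and `talk_head` as Quiet-STaR-specific layers is an assumption drawn from the names, not something confirmed in this thread.

```python
# Names reported as unused in the warning (copied from the issue).
checkpoint_only_keys = [
    "end_embedding",
    "start_embedding",
    "talk_head.0.0.bias",
    "talk_head.0.0.weight",
    "talk_head.0.2.bias",
    "talk_head.0.2.weight",
    "talk_head.0.4.weight",
]

# Illustrative, heavily abbreviated parameter names for a stock model.
# (Assumption: a real model has many more entries than these three.)
stock_model_keys = {
    "model.embed_tokens.weight",
    "model.norm.weight",
    "lm_head.weight",
}

# The checkpoint contains the stock parameters plus the extra ones.
checkpoint_keys = stock_model_keys | set(checkpoint_only_keys)


def unexpected_keys(checkpoint_keys, model_keys):
    """Hypothetical helper: keys present in the checkpoint but absent from the model."""
    return sorted(set(checkpoint_keys) - set(model_keys))


unused = unexpected_keys(checkpoint_keys, stock_model_keys)
print(unused)
```

If that reading is right, the skipped tensors would only be loaded by a model class that declares those parameters, presumably the modified Mistral code in this repository rather than the stock transformers class. Whether the skipped weights matter for a given use is not answered in this thread.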