ezelikman / quiet-star

Code for Quiet-STaR
https://arxiv.org/abs/2403.09629
Apache License 2.0

warning #9

Open clarencerat opened 3 months ago

clarencerat commented 3 months ago

Some weights of the model checkpoint at ezelikman/quietstar-8-ahead were not used when initializing MistralForCausalLM: ['end_embedding', 'start_embedding', 'talk_head.0.0.bias', 'talk_head.0.0.weight', 'talk_head.0.2.bias', 'talk_head.0.2.weight', 'talk_head.0.4.weight']
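
For context, this warning is emitted when the checkpoint contains tensors whose names have no matching parameter in the model class being instantiated. Here that likely means the stock MistralForCausalLM has no `start_embedding`, `end_embedding`, or `talk_head` modules, so those weights are dropped. A minimal sketch of that name comparison follows. The helper `unused_checkpoint_keys` and both key lists are illustrative assumptions, not the library's actual loading code or the checkpoint's full contents:

```python
def unused_checkpoint_keys(checkpoint_keys, model_keys):
    """Return checkpoint keys that have no matching parameter name in the model."""
    model_keys = set(model_keys)
    return sorted(k for k in checkpoint_keys if k not in model_keys)


# Hypothetical, heavily shortened key lists for illustration only.
checkpoint_keys = [
    "model.embed_tokens.weight",
    "lm_head.weight",
    "start_embedding",
    "end_embedding",
    "talk_head.0.0.weight",
    "talk_head.0.0.bias",
    "talk_head.0.2.weight",
    "talk_head.0.2.bias",
    "talk_head.0.4.weight",
]
# Assumed parameter names of a model class without the Quiet-STaR additions.
stock_model_keys = ["model.embed_tokens.weight", "lm_head.weight"]

unused = unused_checkpoint_keys(checkpoint_keys, stock_model_keys)
for name in unused:
    print(name)
```

If the extra weights are meant to be used, the model presumably has to be built from a class that defines those modules, such as the modeling code in this repository, rather than the stock class.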