Question: In the README, the EAGLE parameter count is listed as 0.24B, but in the Hugging Face repo the `pytorch_model.bin` is 1.4 GB, which is abnormally large for a 0.24B model. What is inside the `.bin`?

Answer: The draft model of EAGLE reuses the embedding layer of the target model. The embedding is not trained, but it is still saved in the checkpoint. The checkpoint stores weights in fp32, so each parameter occupies 4 bytes.

Reply: That makes sense! Thanks for the reply, closing the issue.

Closed — DeclK closed 2 months ago
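A quick back-of-the-envelope check makes the 1.4 GB plausible. This is a sketch, not the exact layout of the checkpoint: the vocabulary and hidden sizes below are assumptions for a LLaMA-2-7B-style target model, used only to estimate the size of the frozen embedding that gets saved alongside the draft weights.

```python
# Rough size estimate for the EAGLE checkpoint.
# Assumptions (hypothetical, for a LLaMA-2-7B-style target):
#   vocab_size = 32000, hidden_size = 4096
draft_params = 0.24e9                    # draft model parameters per the README
vocab_size, hidden_size = 32000, 4096
embed_params = vocab_size * hidden_size  # frozen embedding copied from the target model
bytes_per_param = 4                      # fp32 storage, 4 bytes each

total_bytes = (draft_params + embed_params) * bytes_per_param
total_gib = total_bytes / 1024**3
print(f"estimated checkpoint size: {total_gib:.2f} GiB")
```

Under these assumptions the estimate comes out close to the observed 1.4 GB: the untrained embedding alone adds roughly 0.5 GB of fp32 weights on top of the ~0.96 GB of draft parameters.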