Closed: dusty-nv closed this issue 3 months ago
Please try https://huggingface.co/Isaachhe/Bunny-v1_0-3B_dev. We add an lm_head.bias layer of all zeros for consistency with Phi-2. You can load it by editing bias=True in bunny_phi.py.
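A minimal sketch of the suggested edit, using plain PyTorch rather than the actual Bunny source (the Cfg stand-in and its field values are assumptions for illustration):

```python
import torch.nn as nn

class Cfg:  # stand-in for the real model config; field values are made up
    hidden_size = 8
    vocab_size = 16

cfg = Cfg()

# bunny_phi.py originally builds the head without a bias:
#   self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False)
# To load Bunny-v1_0-3B_dev, which ships an lm_head.bias tensor, flip it to True:
lm_head = nn.Linear(cfg.hidden_size, cfg.vocab_size, bias=True)
assert lm_head.bias is not None
```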
Does it affect the final performance a lot?
Bunny-v1.0-3B_dev adds an all-zeros lm_head.bias layer, so its performance is exactly the same as Bunny-v1.0-3B's.
Closing the issue for now since there is no further discussion. Feel free to reopen it if you have any other questions.
Hello! In this line of code, Bunny uses bias=False on the lm_head layer:
https://github.com/BAAI-DCAI/Bunny/blob/9a15192ba6b54930fd5b692f0dbb8c9f18f6d714/bunny/model/language_model/bunny_phi.py#L32
However, in the original Phi code, the lm_head does use a bias:
https://github.com/BAAI-DCAI/Bunny/blob/9a15192ba6b54930fd5b692f0dbb8c9f18f6d714/bunny/model/language_model/phi/modeling_phi.py#L969
I am trying to run Bunny-v1.0-3B through various quantization tools and faster model APIs that support Phi, but they fail because this layer is missing from the weights, and it does not seem easy to disable. Any suggestions on how to fix it in the model?
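One possible workaround, sketched below: patch the checkpoint itself by inserting an all-zeros lm_head.bias tensor into the state dict, so tools expecting the Phi layout find the layer. The helper name is hypothetical, and the demo uses a tiny fake state dict; a real Bunny checkpoint would be loaded with torch.load (or the safetensors equivalent for sharded checkpoints), patched, and saved back.

```python
import torch

def add_zero_lm_head_bias(state: dict) -> dict:
    """Insert an all-zeros lm_head.bias matching lm_head.weight, if absent."""
    w = state["lm_head.weight"]  # shape: [vocab_size, hidden_size]
    state.setdefault("lm_head.bias", torch.zeros(w.shape[0], dtype=w.dtype))
    return state

# Demo on a tiny fake state dict; for a real checkpoint you would
# torch.load(...) the weights, patch them, then torch.save(...) them back.
state = {"lm_head.weight": torch.randn(16, 8)}
state = add_zero_lm_head_bias(state)
assert state["lm_head.bias"].shape == (16,)
```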