BAAI-DCAI / Bunny

A family of lightweight multimodal models.
Apache License 2.0

`lm_head.bias=False` and missing lm_head.bias weights #19

Closed dusty-nv closed 3 months ago

dusty-nv commented 5 months ago

Hello! In this line of code, Bunny uses bias=False on the lm_head layer:

https://github.com/BAAI-DCAI/Bunny/blob/9a15192ba6b54930fd5b692f0dbb8c9f18f6d714/bunny/model/language_model/bunny_phi.py#L32

However, the original Phi code uses `bias=True` on this layer:

https://github.com/BAAI-DCAI/Bunny/blob/9a15192ba6b54930fd5b692f0dbb8c9f18f6d714/bunny/model/language_model/phi/modeling_phi.py#L969

I am trying to run Bunny-v1.0-3B through various quantization tools and faster inference APIs that support Phi, but they fail because `lm_head.bias` is missing from the weights, and it doesn't seem easy to disable the bias on their side. Any suggestions on how to fix this in the model?
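For reference, the mismatch looks roughly like this (a minimal sketch using the usual Phi-2 dimensions, not the actual Bunny/Phi classes):

```python
import torch.nn as nn

# Bunny's head (bunny_phi.py): no bias term, so the checkpoint has no
# "lm_head.bias" entry.
bunny_lm_head = nn.Linear(2560, 51200, bias=False)

# Upstream Phi-2 head (modeling_phi.py): bias=True, so tools that follow the
# original Phi definition expect "lm_head.bias" in the state dict.
phi_lm_head = nn.Linear(2560, 51200, bias=True)

print(sorted(bunny_lm_head.state_dict().keys()))  # ['weight']
print(sorted(phi_lm_head.state_dict().keys()))    # ['bias', 'weight']
```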

Isaachhh commented 5 months ago

Please try https://huggingface.co/Isaachhe/Bunny-v1_0-3B_dev.

We added an all-zeros `lm_head.bias` for consistency with Phi-2. You can load it by changing `bias=False` to `bias=True` in bunny_phi.py.
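If you would rather patch a local copy of the original checkpoint yourself instead of downloading the `_dev` weights, something like the sketch below should work. It is only a rough illustration: the path is a placeholder, and it assumes a single `pytorch_model.bin`-style file rather than sharded safetensors.

```python
import torch

ckpt_path = "path/to/Bunny-v1_0-3B/pytorch_model.bin"  # placeholder path
state_dict = torch.load(ckpt_path, map_location="cpu")

# Add an all-zeros bias matching the lm_head output dimension (vocab size),
# so the checkpoint matches the bias=True Phi-2 definition.
if "lm_head.bias" not in state_dict:
    vocab_size = state_dict["lm_head.weight"].shape[0]
    state_dict["lm_head.bias"] = torch.zeros(
        vocab_size, dtype=state_dict["lm_head.weight"].dtype
    )
    torch.save(state_dict, ckpt_path)
```

Either way, remember to also set `bias=True` in bunny_phi.py so the model definition matches the patched weights.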

GewelsJI commented 4 months ago

Does it affect the final performance a lot?

Isaachhh commented 4 months ago

> Does it affect the final performance a lot?

Bunny-v1.0-3B_dev only adds an all-zeros `lm_head.bias`, so its performance is exactly the same as Bunny-v1.0-3B.
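This is easy to sanity-check in isolation: a zero bias leaves the logits untouched (toy dimensions, not the real model):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
head_no_bias = nn.Linear(16, 32, bias=False)
head_with_bias = nn.Linear(16, 32, bias=True)

# Copy the weights and zero the bias, mimicking the _dev checkpoint.
with torch.no_grad():
    head_with_bias.weight.copy_(head_no_bias.weight)
    head_with_bias.bias.zero_()

x = torch.randn(4, 16)
assert torch.allclose(head_no_bias(x), head_with_bias(x))
```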

Isaachhh commented 3 months ago

Closing the issue for now since there is no further discussion. Feel free to reopen it if there are any other questions.