bjoernpl opened this issue 7 months ago

Your current environment

🐛 Describe the bug

Running vLLM with the new StableLM model stabilityai/stablelm-2-12b leads to this error regarding head size.
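A minimal reproduction along these lines should trigger it (the prompt and sampling parameters are illustrative; simply loading the model is enough to hit the head-size check):

```python
from vllm import LLM, SamplingParams

# Loading the model is where the error is raised, since the derived
# head_dim of 160 is not among the head sizes PagedAttention supports.
llm = LLM(model="stabilityai/stablelm-2-12b")

outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)
```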
The 160 comes from the head-dimension calculation: since there is no head_dim in https://huggingface.co/stabilityai/stablelm-2-12b/blob/main/config.json, it is computed as hidden_size // num_attention_heads = 5120 // 32 = 160.
Looking at modeling_stablelm.py in transformers, this appears to be the correct calculation.
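A quick way to confirm the derivation is to read the values straight from the config with the transformers AutoConfig API (a sketch; the numbers come from the config.json linked above):

```python
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("stabilityai/stablelm-2-12b")

# config.json sets hidden_size=5120 and num_attention_heads=32
# with no explicit head_dim, so the head dimension is derived:
head_dim = cfg.hidden_size // cfg.num_attention_heads
print(head_dim)  # 160
```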
@WoosukKwon will this head size remain unsupported by PagedAttention?
I also ran into this problem. I hope it can be solved, given the importance of StableLM.
Has there been any progress on this? https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407 hits the same problem.
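For anyone wondering whether a given checkpoint is affected, a small helper that mirrors the derivation discussed above (the name effective_head_dim is hypothetical, not part of vLLM or transformers):

```python
from transformers import AutoConfig

def effective_head_dim(model_id: str) -> int:
    # Prefer an explicit head_dim when the config defines one;
    # otherwise fall back to hidden_size // num_attention_heads.
    cfg = AutoConfig.from_pretrained(model_id)
    explicit = getattr(cfg, "head_dim", None)
    return explicit or cfg.hidden_size // cfg.num_attention_heads

print(effective_head_dim("stabilityai/stablelm-2-12b"))  # 160
```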
This issue has been automatically marked as stale because it has not had any activity within 90 days. It will be automatically closed if no further activity occurs within 30 days. Leave a comment if you feel this issue should remain open. Thank you!