dvmazur / mixtral-offloading

Run Mixtral-8x7B models in Colab or consumer desktops
MIT License
2.28k stars 223 forks

Change of query weight matrices shapes #37

Closed avani17101 closed 1 month ago

avani17101 commented 1 month ago

How are the query weight shapes being changed here? With layer = 0, the tensor f"model.layers.{layer}.self_attn.q_proj.W_q" is supposed to have shape 4096x4096. How is its first dimension being halved? (In your quantised model it is 2048x4096!) @justheuristic @dvmazur @eltociear @lavawolfiee can you please clarify this?
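
One possible explanation, which nothing in this thread confirms: if the attention projections are stored as 4-bit quantised values, two of them fit into a single uint8, and packing along the first dimension would turn a 4096x4096 matrix into a 2048x4096 one. The sketch below only illustrates that idea with NumPy. The function names `pack_4bit_rows` and `unpack_4bit_rows` are hypothetical and are not the repository's API, and the pairing of row i with row i + rows/2 is an assumed layout.

```python
import numpy as np


def pack_4bit_rows(w_q: np.ndarray) -> np.ndarray:
    """Pack two 4-bit values into one uint8 (assumed layout, not the repo's code).

    Row i goes into the high nibble and row i + rows/2 into the low nibble,
    so the first dimension is halved. Requires an even number of rows.
    """
    w_q = w_q.astype(np.uint8)
    half = w_q.shape[0] // 2
    return (w_q[:half] << 4) | w_q[half:]


def unpack_4bit_rows(packed: np.ndarray) -> np.ndarray:
    """Invert pack_4bit_rows and recover the original row count."""
    high = (packed & 0xF0) >> 4
    low = packed & 0x0F
    return np.concatenate([high, low], axis=0)


# Stand-in for a quantised q_proj weight: integers in [0, 15].
rng = np.random.default_rng(0)
w_q = rng.integers(0, 16, size=(4096, 4096), dtype=np.uint8)

packed = pack_4bit_rows(w_q)
restored = unpack_4bit_rows(packed)

print(w_q.shape, "->", packed.shape)  # (4096, 4096) -> (2048, 4096)
```

Under this assumption the stored W_q tensor holds packed bytes rather than individual weights, so its shape alone does not mean that any query weights were dropped.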