hahnyuan / PTQ4ViT

Post-Training Quantization for Vision transformers.
192 stars 27 forks

Shape of saved quantized model parameter #13

Open movingsheep opened 1 year ago

movingsheep commented 1 year ago

Hi, thanks for sharing the work!

I ran into a problem when trying to load "vit_base_patch16_224.pth". The shape of 'blocks.0.attn.qkv' in the pth file is torch.Size([3, 1, 2304, 768]). However, the shape of 'blocks.0.attn.qkv.weight' in the model should be torch.Size([2304, 768]). What do the first and second dimensions in torch.Size([3, 1, 2304, 768]) mean? I think it should be torch.Size([2304, 768]).

SuperVan-Young commented 1 year ago

We import our ViT model from the timm package, and this is how timm stores the weight tensor. Each of W_Q, W_K and W_V is individually [768, 768], but timm fuses them into a single [2304, 768] tensor, so that all three projections can be computed with one linear layer. You can check out their code for more information.
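For illustration, here is a minimal sketch of how timm's fused qkv projection works (layer names and the ViT-Base hyperparameters 768/2304 match the discussion above; the rest is a standalone example, not PTQ4ViT's actual code):

```python
import torch
import torch.nn as nn

embed_dim = 768  # ViT-Base hidden size

# One linear layer holds W_Q, W_K, W_V stacked along the output dim:
# nn.Linear stores weight as [out_features, in_features],
# so the fused weight is [3 * 768, 768] = [2304, 768].
qkv = nn.Linear(embed_dim, embed_dim * 3, bias=True)
print(qkv.weight.shape)  # torch.Size([2304, 768])

# A single matmul computes all three projections at once,
# and the result is split into q, k, v afterwards.
x = torch.randn(1, 197, embed_dim)   # [batch, tokens, dim]
q, k, v = qkv(x).chunk(3, dim=-1)    # each [1, 197, 768]

# Equivalently, the fused weight can be sliced back into the
# three per-projection matrices, each [768, 768]:
w_q, w_k, w_v = qkv.weight.chunk(3, dim=0)
```

So a [2304, 768] 'qkv.weight' is expected for the unquantized timm model; any extra leading dimensions in a saved checkpoint come from how that checkpoint packs the tensor, not from timm's layer itself.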
