VincentXWD opened 2 months ago
Hello developers, I'm inspecting SmoothQuant and using the script below to check the quantized model's parameter sizes:
```python
from smoothquant.opt import Int8OPTForCausalLM
from transformers.models.opt.modeling_opt import OPTForCausalLM
import torch

model_name = "mit-han-lab/opt-2.7b-smoothquant"
model_smoothquant = Int8OPTForCausalLM.from_pretrained(model_name, device_map='auto')

# Print every registered parameter and its shape.
for name, param in model_smoothquant.named_parameters():
    print(f"Parameter Name: {name}, Parameter Shape: {param.shape}")
```
I noticed that the loop only collects 4 parameters:
```
Parameter Name: model.decoder.embed_tokens.weight, Parameter Shape: torch.Size([50272, 2560])
Parameter Name: model.decoder.embed_positions.weight, Parameter Shape: torch.Size([2050, 2560])
Parameter Name: model.decoder.final_layer_norm.weight, Parameter Shape: torch.Size([2560])
Parameter Name: model.decoder.final_layer_norm.bias, Parameter Shape: torch.Size([2560])
```
Could someone explain this phenomenon? Thanks!
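For reference, in the Int8OPT modules the quantized int8 weights seem to be registered as buffers rather than `nn.Parameter`s, so `named_parameters()` only yields the remaining floating-point tensors (embeddings and LayerNorms). Below is a minimal sketch, assuming the same checkpoint as above, that walks `named_buffers()` instead; whether the int8 weights actually live there is an assumption about how the W8A8 linear modules register their tensors.

```python
from smoothquant.opt import Int8OPTForCausalLM

model_name = "mit-han-lab/opt-2.7b-smoothquant"
model_smoothquant = Int8OPTForCausalLM.from_pretrained(model_name, device_map='auto')

# named_parameters() only returns trainable fp tensors; if the quantized
# linear weights are stored via register_buffer (assumption), they should
# appear here instead, together with their dtypes.
for name, buf in model_smoothquant.named_buffers():
    print(f"Buffer Name: {name}, Buffer Shape: {tuple(buf.shape)}, dtype: {buf.dtype}")
```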