PB-LLM: Partially Binarized Large Language Models

Details about the code #4

Hon-Chen opened this issue 8 months ago

Hon-Chen commented 8 months ago

Really solid work! I noticed that the proportion of zero-valued parameters is counted after quantization. What is the motivation for this?

# After quantization, print the fraction of exactly-zero entries in each
# parameter tensor, stopping after the first parameter whose name contains "fc2".
quant_sequential(model, dataloader, device)
for n, p in model.named_parameters():
    print(n, torch.mean((p == 0).float()))
    if "fc2" in n:
        break
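
For reference, the check above reports, for each parameter tensor, the fraction of entries that are exactly zero. A minimal standalone illustration (the toy tensor below is made up for demonstration, not taken from the repository):

import torch

# Toy weight tensor with 2 exact zeros out of 6 entries.
w = torch.tensor([[0.0, -1.5, 0.7], [0.0, 2.1, -0.3]])
print(torch.mean((w == 0).float()))  # tensor(0.3333)
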
hahnyuan commented 8 months ago

Thank you for bringing up this observation. The check that counts zero-valued parameters after quantization was initially added for debugging purposes. Upon further evaluation, we have determined that this code segment is no longer necessary and can be safely removed.

Hon-Chen commented 8 months ago

Thanks for your reply; I also noticed that this code is unnecessary. But when I made some modifications to the mask-search method, I found that the proportion of zero values became higher, so I would like to understand the meaning behind this piece of code, i.e., what exactly the debugging purposes you mention are. As I understand it, the binarized weights should no longer contain exact zeros, so are the zeros introduced by the salient weights?
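
In case it helps to pin this down, here is a hedged sketch of how one might attribute the zeros, assuming access to the boolean salient mask chosen during mask search; the function name, the mask variable, and the toy tensors below are illustrative assumptions, not the repository's actual API. Since sign-based binarization maps weights to ±α, the binarized region should contain exact zeros only where the original weight was already exactly zero, so splitting the zero count by region would show whether the zeros sit in the salient (higher-precision) part.

import torch

def zero_fraction_by_region(weight: torch.Tensor, salient_mask: torch.Tensor):
    """Split the exact-zero fraction of a quantized weight into the salient
    (higher-precision) region and the binarized region."""
    zeros = (weight == 0).float()
    return {
        "overall": zeros.mean().item(),
        "salient": zeros[salient_mask].mean().item(),
        "binarized": zeros[~salient_mask].mean().item(),
    }

# Usage with made-up tensors; in practice `weight` would be a quantized layer
# weight and `salient_mask` the mask produced by the mask-search step.
weight = torch.randn(128, 128)
salient_mask = torch.rand_like(weight) < 0.1
print(zero_fraction_by_region(weight, salient_mask))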